Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  
As discussed within the previous communication, the rejection of amended Claims 1-10 & 13-19 under 35 USC 101 has been maintained, where the claimed invention continues to be directed to an abstract idea without significantly more, and contains limitations that can practically be performed in the human mind, and as such, under the broadest reasonable interpretation they amount to a mental process.  Updated rationale has been provided below.  Claims 1-10 & 13-19 remain pending in this application.  Applicant's submission filed on 16 November 2021 has been entered.
 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-10 & 13-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claim 1 recites the limitations of “identifying...one or more subjects in the query, the one or more subjects including a particular subject”, “identifying one or more sections of a recording relating to the particular subject of the query utilizing...a natural language processing system or an image processing system, wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject” and “determining that the particular subject is in at least one of the one or more sections”. These limitations can practically be performed in the human mind and, as such, under the broadest reasonable interpretation, they amount to a mental process.
Claim 1, as currently provided, fails to recite additional elements that integrate the judicial exception disclosed above into a practical application. The “a query module”, “memory”, “a natural language processing system or an image processing system”, “a graphical user interface” and “computer program product comprising a non-transitory computer readable storage medium” are recited at a high level of generality (i.e., as a generic computer, memory and processor for performing generic computer functions of identifying sections of a recording, grouping the sections into subdivisions, determining subjects of the sections and displaying the information to the user) such that they amount to no more than mere instructions to apply the exception using one or more generic computer components. The plurality of functions, such as the “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” limitation listed above, are performed “by one or more computing devices” and are similarly not indicative of integration into a practical application because they constitute the performance of one or more abstract ideas using well-understood, routine, and conventionally used techniques utilizing mere computer components. The mere execution of these tasks using conventionally used hardware components does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The limitation reciting “...portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject...” within one or more subdivisions or the one or more sections merely provides further classification of data {i.e. “portions...that reference the particular subject”/“portions...that provide context”} for presentation, using an extra-solution activity for the displaying of content. The mere presentation of data does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim as currently presented does not recite additional elements that amount to significantly more than the judicial exception analyzed above. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of “receiving...” merely provides for the gathering of data using at least a computer component and memory device, which is treated as insignificant extra-solution activity. The claim language of “generating condensed snippets of the recording by grouping the one or more sections...” has been treated similarly as insignificant extra-solution activity, and merely provides for the utilization of one or more techniques for condensing and grouping snippets of a recording using mere computer components. The utilization of components such as “a query module”, “a natural language processing system or an image processing system”, “a memory”, “a graphical user interface” and “a computer program product comprising a non-transitory computer readable storage medium” amounts to no more than mere instructions to apply the exception using generic computer components. The “memory”, “a query module”, “a natural language processing system or an image processing system”, “a graphical user interface” and “non-transitory computer readable storage medium” components, used singularly or in combination, do not amount to significantly more than the judicial exception. Applicant’s disclosure, at least within paragraphs [0112-0116], provides for the mere use of a computer and its components for the implementation of Applicant’s claimed functions utilizing well-understood, routine and conventionally used techniques. In regards to the language of “a natural language processing system or an image processing system”, it appears to amount to the mere addition of a technological environment.
The utilization of “a natural language processing system or an image processing system” amounts to the utilization of mere recognition techniques which can be performed mentally and instantiated by at least a computer and a memory, and does not amount to significantly more than the judicial exception. Claim limitations that serve to limit an invention to a particular technological environment will not be sufficient in constituting an inventive concept. What is needed is a claim element that improves a technological process or otherwise improves the functioning of a computer. Within Applicant’s “identifying one or more sections...” limitation of Independent Claim 1, there is a reference to “...the query utilizing...a natural language processing system or an image processing system”, which merely notes the general use of one or more technological processes without providing any level of specificity as to how the aforementioned components are utilized. It would be helpful to add further specificity to the use of the “natural language processing system” and/or “image processing system” in order to better teach the improvement in a technological process or its functionality. Applicant’s limitation providing for “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” amounts to mere data output, similar to activities found by the courts to be well-understood, routine, and conventional activity when they are claimed as insignificant extra-solution activity (see OIP Techs, 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
Independent Claims 9 & 17 have been analyzed using similar rationale and are similarly not patent eligible as they are directed to an abstract idea without significantly more.  
Claim 9, as currently provided, fails to recite additional elements that integrate the judicial exception disclosed above into a practical application. The “a query module”, “memory”, “a natural language processing system or an image processing system” and “a graphical user interface” are recited at a high level of generality (i.e., as a generic computer, memory and processor for performing generic computer functions of identifying one or more subjects in the query, identifying sections of a recording, grouping the sections into subdivisions, determining subjects of the sections and displaying the information to the user) such that they amount to no more than mere instructions to apply the exception using one or more generic computer components. The plurality of functions, such as the “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” limitation listed above, are performed “by one or more computing devices” and are similarly not indicative of integration into a practical application because they constitute the performance of one or more abstract ideas using well-understood, routine, and conventionally used techniques utilizing mere computer components. The mere execution of these tasks using conventionally used hardware components does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim language of “generating one or more partitions...”, “identifying a total number of the one or more sections”, “correlating the total number of the one or more sections...”, “correlating each subsequent partition...” and “determining which of the one or more subdivisions to display...” merely provides for the utilization of one or more techniques for condensing and grouping snippets of a recording using mere computer components, and is considered insignificant extra-solution activity incorporating well-understood, routine and conventional techniques for the mere gathering of data, similar to Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016). The “receiving, from a user, a query input into the query module” and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” limitations are similarly not indicative of integration into a practical application because they constitute insignificant extra-solution activity by receiving user input and displaying content that has been derived using techniques utilizing mere computer components. The portion of the limitation reciting “...the portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject...” within one or more subdivisions or the one or more sections merely provides further classification of data {i.e. “portions...that reference the particular subject”/“portions...that provide context”} for presentation, using an extra-solution activity for the displaying of content. The mere presentation of data does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim as currently presented does not recite additional elements that amount to significantly more than the judicial exception analyzed above. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using “a query module”, “a natural language processing system or an image processing system”, “a memory” and “a graphical user interface” amount to no more than mere instructions to apply the exception using generic computer components. The “memory”, “a query module”, “a natural language processing system or an image processing system”, and “a graphical user interface” components, used singularly or in combination, do not amount to significantly more than the judicial exception. Applicant’s disclosure, at least within paragraphs [0112-0116], provides for the mere use of a computer and its components for the implementation of Applicant’s claimed functions utilizing well-understood, routine and conventionally used techniques. In regards to the amended language of “a natural language processing system or an image processing system”, it appears to amount to the mere addition of a technological environment. Claim limitations that serve to limit an invention to a particular technological environment will not be sufficient in constituting an inventive concept. What is needed is a claim element that improves a technological process or otherwise improves the functioning of a computer.
Within Applicant’s “identifying one or more sections...” limitation of Independent Claim 9, there is a reference to “...the query utilizing...a natural language processing system or an image processing system”, which merely notes the general use of one or more technological processes without providing any level of specificity as to how the aforementioned components are utilized. It would be helpful to add further specificity to the use of the “natural language processing system” and/or “image processing system” in order to better teach the improvement in a technological process or its functionality. Applicant’s limitations providing for “receiving...a query input into the query module” and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” amount to mere reception of input data and data output, similar to activities found by the courts to be well-understood, routine, and conventional activity when they are claimed as insignificant extra-solution activity (see OIP Techs, 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
Claim 17, as currently provided, fails to recite additional elements that integrate the judicial exception disclosed above into a practical application. The “query module”, “a natural language processing system or an image processing system”, “a graphical user interface” and “computer program product comprising a non-transitory computer readable storage medium” are recited at a high level of generality (i.e., as a generic computer, computer program product comprising a non-transitory computer readable storage medium and graphical user interface for performing generic computer functions of identifying one or more subjects in the query, identifying sections of a recording, grouping the sections into subdivisions, determining subjects of the sections, determining subdivisions to display and displaying the one or more subdivisions to the user) such that they amount to no more than mere instructions to apply the exception using one or more generic computer components. The claim language of “wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording...”, “determining which of the one or more subdivisions to display...” and “wherein the portions that provide the context surrounding the particular subject are emphasized” uses well-understood, routine and conventionally used techniques that amount to insignificant extra-solution activity utilizing one or more general computing components, such as a display device, similar to Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016).
The “receiving...a query input into the query module” and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” limitations are similarly not indicative of integration into a practical application because they constitute insignificant extra-solution activity by receiving user input and displaying content that has been derived using techniques utilizing mere computer components. The amended portion of the limitation reciting “...the portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject...” within one or more subdivisions or the one or more sections merely provides further classification of data {i.e. “portions...that reference the particular subject”/“portions...that provide context”} for presentation, using an extra-solution activity for the displaying of content. The mere presentation of data does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim as currently presented does not recite additional elements that amount to significantly more than the judicial exception analyzed above. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using “a query module”, “a natural language processing system or an image processing system”, “a graphical user interface” and “computer program product comprising a non-transitory computer readable storage medium” amount to no more than mere instructions to apply the exception using generic computer components. The “a query module”, “a natural language processing system or an image processing system”, “a graphical user interface” and “computer program product comprising a non-transitory computer readable storage medium” components, used singularly or in combination, do not amount to significantly more than the judicial exception. Applicant’s disclosure, at least within paragraphs [0112-0116], provides for the mere use of a computer and its components for the implementation of Applicant’s claimed functions utilizing well-understood, routine and conventionally used techniques. In regards to the amended language of “a natural language processing system or an image processing system”, it appears to amount to the mere addition of a technological environment. Claim limitations that serve to limit an invention to a particular technological environment will not be sufficient in constituting an inventive concept. What is needed is a claim element that improves a technological process or otherwise improves the functioning of a computer.
Within Applicant’s “identifying one or more sections...” limitation of Independent Claim 17, there is a reference to “...the query utilizing...a natural language processing system or an image processing system”, which merely notes the general use of one or more technological processes without providing any level of specificity as to how the aforementioned components are utilized. It would be helpful to add further specificity to the use of the “natural language processing system” and/or “image processing system” in order to better teach the improvement in a technological process or its functionality. Applicant’s limitations providing for “receiving...a query input into the query module” and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” amount to receiving input data and providing data output, similar to activities found by the courts to be well-understood, routine, and conventional activity when they are claimed as insignificant extra-solution activity (see OIP Techs, 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
Dependent Claims 2-8, 10, 13-16, 18 & 19 recite similar limitations in which the identifying and grouping of one or more sections of a recording, identified by discerned subjects of a received query, can be mentally performed or conceived in the mind. As further elaborated upon below, the dependent claims present well-understood, routine and conventionally used techniques which provide insignificant extra-solution activity for performing identifying, grouping and displaying functions, while utilizing mere computer device components.
Dependent Claim 2 provides for the same abstract idea discussed for Claim 1, and provides for the “determining that the at least the one or more sections of the recording include at least a common feature”, which is considered a well-understood, routine and conventional technique commonly used within the application of a computer, and which merely performs the step of organizing features into sections. The claim limitations amount to mental processes that can practically be performed in the human mind while utilizing insignificant extra-solution activity for incorporating well-understood, routine and conventional techniques involving the mere gathering of data, similar to Consulting and updating an activity log, Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754, making them not enough to turn the abstract idea of the Independent Claims into a practical application. Corresponding Claim 10 recites similar limitations and is rejected in view of similar rationale provided above.

Dependent Claim 3 provides for the same abstract idea discussed for Claim 1, and provides for “generating one or more partitions...” by “identifying a total number of one or more sections”, “correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity” and “correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity”, which are considered well-understood, routine and conventional techniques commonly used within the application of a computer, where the claim language is merely the performance of identifying sections and correlating those identified sections to one or more partitions. The claim limitations amount to mental processes that can practically be performed in the human mind while utilizing insignificant extra-solution activity for incorporating well-understood, routine and conventional techniques involving the mere gathering of data, similar to Consulting and updating an activity log, Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754, making them not enough to turn the abstract idea of the Independent Claims into a practical application. Corresponding Claim 19 recites similar limitations and is rejected in view of similar rationale provided above.
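By way of illustration only, and not as a characterization of Applicant’s implementation, the correlating steps recited in Claim 3 can be sketched in a few lines. The sketch assumes that “one minus the previous number” means the previous section count reduced by one, and that partitioning stops once a partition holds a single section; the function name is hypothetical.

```python
from typing import List


def partition_section_counts(total_sections: int) -> List[int]:
    """Return the number of sections correlated to each partition.

    Assumption: the first partition holds every section, and each
    subsequent partition holds one fewer section than the one before it.
    """
    if total_sections < 1:
        return []
    # The first partition gets all sections; count down to one section.
    return list(range(total_sections, 0, -1))


counts = partition_section_counts(4)
```

Under these assumptions, a recording with four identified sections yields four partitions holding 4, 3, 2 and 1 sections, respectively.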
   
Dependent Claim 5 provides for determining one or more sections of the recording, including further steps for “analyzing...one or more sections” and “identifying a similar acoustic within one or more sections, wherein the similar acoustic is a sound identified to be above an identical acoustic threshold”. These steps, as recited in Claim 5, amount to mental processes that can practically be performed in the human mind while utilizing insignificant extra-solution activity providing the mere gathering of data, similar to Consulting and updating an activity log, Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754. The use of “natural language processing” for the performance of the “analyzing...” within dependent Claim 5 is recited at a high level of generality (i.e., as a means for processing acoustic data into partitions for further analysis) and amounts to insignificant extra-solution activity. The analysis of acoustic data performed by natural language processing merely automates the identification and comparison of data against an applied threshold (i.e., the analysis and partitioning of acoustic data based on subject), and is, likewise, insignificant extra-solution activity. See Revised Guidance, 84 Fed. Reg. at 55; MPEP § 2106.05(g). Claims 13 & 18 recite similar limitations and are rejected in view of rationale provided above.
Dependent Claim 6 provides for determining one or more sections of the recording, including further steps for “analyzing each of the one or more sections” and “tagging each of the one or more sections in which the particular subject is identified as an indicator”, amounting to mere mental processes for analyzing and assigning labels to one or more sections, which can practically be performed in the human mind while utilizing insignificant extra-solution activity, which is considered the mere gathering of data, similar to Consulting and updating an activity log, Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754. Claim 14 recites similar limitations and is rejected in view of rationale provided above.
Claim 4 provides limitations for displaying subdivisions of sections of media pertaining to particular subject matter by determining and emphasizing which subdivisions to display using well-understood, routine and conventionally used techniques that amount to insignificant extra-solution activity utilizing one or more general computing components, such as a display device, similar to Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016).
Claim 7 provides teachings for identifying the particular subject in each of the one or more sections, having further steps of “accessing a subject repository, wherein the subject repository includes each image and textual representation of dialogue of the one or more sections” and “examining a subject repository for one or more images and textual representations of dialogue associated with the particular subject”, which is the mere accessing and analysis of data items within a collection of items, amounting to mental processes that can practically be performed in the human mind while utilizing insignificant extra-solution activity, which is the mere gathering of data, similar to Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016). Corresponding Claim 15 recites similar limitations and is rejected in view of rationale provided above.
Claim 8 provides teachings for identifying the particular subject in each of the one or more sections, having further steps of “automatically identifying all of the subdivisions with at least one subject”, “scoring the subdivisions according to the number of times the subject is identified in a subdivision”, “comparing each score to a subject identification threshold”, and “generating a new recording of the subdivisions that have a score exceeding the subject identification threshold”, which is the performance of steps which identify, compare and score one or more of a plurality of subdivisions of data against a threshold, amounting to mental processes that can practically be performed in the human mind while utilizing insignificant extra-solution activity, similar to Consulting and updating an activity log, Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754. Corresponding Claim 16 recites similar limitations and is rejected in view of rationale provided above.
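By way of illustration only, the scoring and comparing steps recited in Claim 8 can be sketched as a count followed by a threshold comparison. The sketch assumes each subdivision is represented as a list of subject labels already identified within it; the function name, data layout and sample values are hypothetical and are not drawn from Applicant’s disclosure.

```python
from typing import Dict, List


def select_subdivisions(
    subdivisions: Dict[str, List[str]], subject: str, threshold: int
) -> List[str]:
    """Return the names of subdivisions whose score exceeds the threshold.

    Assumption: a subdivision's score is the number of times the subject
    label appears in that subdivision's list of identified labels.
    """
    scores = {name: labels.count(subject) for name, labels in subdivisions.items()}
    # Keep only subdivisions scoring strictly above the threshold.
    return [name for name, score in scores.items() if score > threshold]


sample = {"intro": ["dog", "cat", "dog"], "middle": ["dog"], "end": ["cat"]}
selected = select_subdivisions(sample, "dog", 1)
```

In this sample, only the “intro” subdivision mentions the subject more than once, so it alone would be carried into the new recording.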

	

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Yeo et al (USPG Pub No. 20180322103A1; Yeo hereinafter) in view of Shetty et al (USPG Pub No. 20160070962A1; Shetty hereinafter). 

As for Claim 1, Yeo teaches, A computer-implemented method comprising: 
receiving, from a user, a query input into a query module (see pp. [0003], [0029-0030]; e.g., the reference of Yeo provides for the extraction of audiovisual features from digital components and teaches of utilizing a recognition engine, which functions in an equivalent fashion as Applicant’s amended “query module”, by identifying text labels associated with videos, images and multimedia elements that are query terms input by a user, reading on the amended limitation);
identifying, by the query module, one or more subjects in the query, the one or more subjects including a particular subject (see pp. [0003], [0028-0030]; e.g., the recognition engine of Yeo can receive a first request and retrieve image data for each of a plurality of images, if the input query is one or more images, for example.  Candidate images from a plurality of images can be selected by determining matches of image features between the first query image and the candidate images.  According to paragraphs [0028-0030], a user issues a request to a data processing system for one or more digital components associated with one or more digital component multimedia elements, where features of the one or more digital component multimedia elements are identified and associated with one or more digital component keywords and one or more text labels.  Paragraph [0158] further teaches that features, such as image features, can be identification of the subject matter contained within the images); and 
identifying one or more sections of a recording relating to the particular subject of the query utilizing, by the query module, a natural language processing system or an image processing system (see pp. [0037-0040]; e.g., the reference of Yeo teaches utilizing an interface with at least one service provider natural language processor component and a service provider interface to facilitate back-and-forth real-time voice and audio-based conversation/session between computing devices.  A pre-processor can be configured to detect a keyword and perform an action based on the keyword, thus identifying a section of an audio recording submitted by an end user as one or more voice queries/audio input and acting on that recognized keyword.  The pre-processor can filter out one or more terms or modify terms prior to submitting for further processing and convert analog audio signals into digital audio signals).
The reference of Yeo does not appear to explicitly recite the limitations of, “wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject”,  “generating condensed snippets of the recording by grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module”; “determining that the particular subject is in at least one of the one or more sections”; and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user”.
The reference of Shetty recites the limitations of, 
“wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject” (see pp. [0007-0011], [0061]; e.g., the reference of Shetty serves as an enhancement to the teachings of the Yeo reference, and provides for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.  Segments of frames of video are considered equivalent to Applicant’s “one or more sections of the recording”.  Video may be segmented in multiple different ways by the various segment sets, where a representative frame for each of the segments of each segment set is determined, increasing the likelihood that the representative frames capture alternative portions of the video {i.e. considered equivalent to Applicant’s “portions of the recording that provide context surrounding the particular subject”}.  The request to summarize the video may be based on a search query associated with the request, and the video hosting system identifies segments in a “segment table” that are relevant to the request by comparing the semantic concepts {i.e. particular subjects} of the segments with the semantic concepts associated with the request {i.e. considered equivalent to Applicant’s “relating to the particular subject of the query include portions of the recording that reference the particular subject”}. Semantic concepts associated with the request are determined by analysis of a search query, user interest information, or by identifying semantic concepts associated with metadata of the video, as the associated metadata can provide additional context); 
“generating condensed snippets of the recording by grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module” (see pp. [0010-0011], [0064]; e.g., paragraphs [0010-0011] refer to the determination of representative segments amongst a plurality of relevant segments of one or more videos based on relevance scores determined on the match between the relevant segment and semantic concepts associated with the query/request.  The user is presented with relevant segments/relevant frames while the video plays, adjacent to one another. According to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements); 
“determining that the particular subject is in at least one of the one or more sections” (see pp. [0061]; e.g., the reference of Shetty teaches of the selection of “representative segments” determined to be relevant to the received request, where the segments are scored and selected based on relevance to the video metadata and the user’s context {i.e. the user search query or user interests}.  Segments relevant to the request are scored based on a match between the segment and the semantic concepts associated with the query, reading on Applicant’s claimed limitation, as subject matter of the received request/query is considered when determining and scoring relevant segments.  Representative frames associated for the selected representative segments can be determined from the segment table); and 
“displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” (see pp. [0061-0062]; e.g., according to at least paragraphs [0061-0062], a “video summary module” is utilized for generating a video summary using the representative frames for the selected representative segments, where the video summary chronologically combines the representative frames and may present a series of the representative frames to the user in a “storyboard”, for example, where the user may determine whether or not to view the entire video.  A “video preview interface” is then utilized for providing to the client device representative frames of video for browsing, and determining whether to view the video in full based on the video preview.  Additionally, and according to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements).
The combined Yeo and Shetty references are considered analogous art for being within the same field of endeavor, which is presenting representative video summaries to a user.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the segmentation and presentation of relevant video segments based on conceptual elements of the received request, as taught by Shetty, with the Yeo reference, because selection of a preview of longer form video fails to accurately represent the full content of the video and a user is not able to quickly distinguish whether a particular video has the desired content without watching the video itself. (Shetty; [0005])


Claims 2-10 & 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over Yeo et al (USPG Pub No. 20180322103A1; Yeo hereinafter) in view of Shetty et al (USPG Pub No. 20160070962A1; Shetty hereinafter) further in view of Miller et al (US Patent No. 9633696B1; Miller hereinafter). 

As for Claim 9, Yeo teaches, A system comprising:
a memory (see pp. [0036]; e.g., memory); and
a query module in communication with the memory (see pp. [0030]; e.g., query terms are received as input from a user within a mechanism of a data processing system), the query module being configured to perform operations comprising:
receiving, from a user, a query input into the query module (see pp. [0030], [0060]; e.g., query terms are received as input from a user within a mechanism of a data processing system such as a “direct action API”. Yeo provides for the extraction of audiovisual features from digital components and teaches of utilizing a recognition engine, which functions in an equivalent fashion as Applicant’s amended “query module”, by identifying text labels associated with videos, images and multimedia elements that are query terms input by a user, reading on the amended limitation);
identifying, by the query module, one or more subjects in the query, the one or more subjects including a particular subject (see pp. [0003], [0028-0030]; e.g., the recognition engine of Yeo can receive a first request and retrieve image data for each of a plurality of images, if the input query is one or more images, for example.  Candidate images from a plurality of images can be selected by determining matches of image features between the first query image and the candidate images.  According to paragraphs [0028-0030], a user issues a request to a data processing system for one or more digital components associated with one or more digital component multimedia elements, where features of the one or more digital component multimedia elements are identified and associated with one or more digital component keywords and one or more text labels.  Paragraph [0158] further teaches that features, such as image features, can be identification of the subject matter contained within the images); and
identifying one or more sections of a recording relating to the particular subject of the query utilizing, by the query module, a natural language processing system or an image processing system (see pp. [0037-0040]; e.g., the reference of Yeo teaches of utilizing an interface with at least one service provider natural language processor component and a service provider interface to facilitate back-and-forth real-time voice and audio-based conversation/session between computing devices.  A pre-processor can be configured to detect a keyword and perform an action based on the keyword, thus, identifying a section of an audio recording submitted by an end user as one or more voice queries/audio input and acting on that recognized keyword.  The pre-processor can filter out one or more terms or modify terms prior to submitting for further processing and convert analog audio signals into digital audio signals).
The reference of Yeo does not appear to recite the limitations of, “grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module”, “determining that the particular subject is in at least one of the one or more sections”, “generating one or more partitions, wherein the one or more partitions are generated by: identifying a total number of the one or more sections”, “correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity”, “correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity”, “determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions” and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user”.
The reference of Shetty teaches, “grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module” (see pp. [0010-0011], [0064]; e.g., the reference of Shetty serves as an enhancement to the teachings of the Yeo reference, and provides for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.  Paragraphs [0010-0011] refer to the determination of representative segments amongst a plurality of relevant segments of one or more videos based on relevance scores determined on the match between the relevant segment and semantic concepts associated with the query/request.  The user is presented with relevant segments/relevant frames while the video plays, adjacent to one another. According to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements);
 “determining that the particular subject is in at least one of the one or more sections” (see pp. [0061]; e.g., the reference of Shetty teaches of the selection of “representative segments” determined to be relevant to the received request, where the segments are scored and selected based on relevance to the video metadata and the user’s context {i.e. the user search query or user interests}.  Segments relevant to the request are scored based on a match between the segment and the semantic concepts associated with the query, reading on Applicant’s claimed limitation, as subject matter of the received request/query is considered when determining and scoring relevant segments.  Representative frames associated for the selected representative segments can be determined from the segment table); and
“displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” (see pp. [0061-0062]; e.g., according to at least paragraphs [0061-0062], a “video summary module” is utilized for generating a video summary using the representative frames for the selected representative segments, where the video summary chronologically combines the representative frames and may present a series of the representative frames to the user in a “storyboard”, for example, where the user may determine whether or not to view the entire video.  A “video preview interface” is then utilized for providing to the client device representative frames of video for browsing, and determining whether to view the video in full based on the video preview.  Additionally, and according to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements).
The combined Yeo and Shetty references are considered analogous art for being within the same field of endeavor, which is presenting representative video summaries to a user.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the segmentation and presentation of relevant video segments based on conceptual elements of the received request, as taught by Shetty, with the Yeo reference, because selection of a preview of longer form video fails to accurately represent the full content of the video and a user is not able to quickly distinguish whether a particular video has the desired content without watching the video itself. (Shetty; [0005])
The references of Yeo and Shetty do not explicitly recite the limitations of, “generating one or more partitions, wherein the one or more partitions are generated by: identifying a total number of the one or more sections”, “correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity”, “correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity”, and “determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions”.
The reference of Miller recites the limitations of, “generating one or more partitions, wherein the one or more partitions are generated by:
identifying a total number of the one or more sections” (see col. 3, lines 30-39, 59-67; col. 4, lines 1-11; e.g., the reference of Miller serves as an enhancement to the teachings of Yeo and Shetty, and teaches of partitioning the derived content template into a plurality of template elements, reading on Applicant’s identifying step), 
“correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity” (see col. 42, lines 58-67; col. 43, line 1; col. 44, lines 62-67; col. 45, lines 1-6; e.g., the cited portions of Miller teach of utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates.  The alignment procedures are utilized to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information, reading on Applicant’s claimed “correlating” limitation);
“correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity” (see col. 42, lines 58-67; col. 43, line 1, 18-22; col. 44, lines 62-67; col. 45, lines 1-6; e.g., the cited portions of Miller teach of utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates.  The alignment procedures are utilized to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information.  According to column 43, line 1, 18-22, the Pearson correlation coefficient appears to be performing the equivalent method for correlating partitions and sections); and
“determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions” (see Fig. 12; see col. 21, lines 13-60; e.g., the cited reference teaches of utilizing a customer interface, which helps to determine information to be displayed to a user based on a plurality of factors pertaining to the particular request being received.  As stated previously, highlighting text, for example, is a function used in order to further “emphasize” areas derived from one or more media files and associated with synchronized derived content. Graphical representations of locations within the media file where portions of a clip may be found are presented to a customer through a customer interface, as illustrated within the cited Figure 12.  The customer interface provides a user interface to the customer displaying grouped media file information such as one or more media files, descriptive attributes of the one or more media files, annotations, semantic tagging, and advertising related to the one or more media files, amongst other potential groupings of information.  At least column 42, lines 58-67 through column 43, line 1 and column 44, lines 62-67 through column 45, lines 1-6 provide teachings for utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information.  One or more alignment procedures can be utilized, such as a “Partition-and-Place Procedure” and/or “Seed-and-Grow Procedure” amongst a plurality of procedures.  The “Partition-and-Place Procedure” causes the synchronization engine to divide derived content template into a plurality of template elements having constant length {i.e. 100 video frames} or variable length, with length defined by a configurable parameter selected to balance execution speed with matching accuracy, as discussed within column 44, lines 21-33 for further elaboration, providing adjustable levels of matching accuracy for the presentation of results pertaining to the retrieval of content).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)
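For illustration only, the constant-length division attributed above to Miller's “Partition-and-Place Procedure” may be sketched as follows.  The sketch is not drawn from the Miller disclosure: the function name `partition_template`, the parameter `element_length`, and the list-based representation of a derived content template are assumptions made solely for this example, with the default of 100 taken from the “100 video frames” example cited above.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition_template(template: Sequence[T], element_length: int = 100) -> List[List[T]]:
    """Divide a sequence of samples into consecutive fixed-length elements.

    Hypothetical helper: `element_length` plays the role of the configurable
    parameter that trades execution speed against matching accuracy.
    The last element may be shorter when the length does not divide evenly.
    """
    if element_length <= 0:
        raise ValueError("element_length must be a positive integer")
    return [
        list(template[start:start + element_length])
        for start in range(0, len(template), element_length)
    ]


if __name__ == "__main__":
    frames = list(range(250))  # stand-in for 250 per-frame feature samples
    elements = partition_template(frames, element_length=100)
    print([len(e) for e in elements])
```

Under these assumptions, a smaller `element_length` yields more, shorter template elements (finer matching at higher cost), while a larger value yields fewer, longer elements.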

As for Claim 2, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The combined Yeo and Shetty references are considered analogous art for being within the same field of endeavor, which is presenting representative video summaries to a user.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the segmentation and presentation of relevant video segments based on conceptual elements of the received request, as taught by Shetty, with the Yeo reference, because selection of a preview of longer form video fails to accurately represent the full content of the video and a user is not able to quickly distinguish whether a particular video has the desired content without watching the video itself. (Shetty; [0005])
The references of Yeo and Shetty do not explicitly recite the limitation of, “wherein grouping the one or more sections into the one or more subdivisions further comprises: determining that at least one of the one or more sections of the recording include at least one common feature”.
Miller teaches, “wherein grouping the one or more sections into the one or more subdivisions further comprises:
determining that at least one of the one or more sections of the recording include at least one common feature” (see col. 41, lines 44-59; col. 48, lines 20-56; e.g., the reference of Miller serves as an enhancement to the combined Yeo and Shetty references, and teaches of providing for a sampling period by a synchronization engine in the creation of a derived content template, in which the synchronization engine constructs a derived content template using any number of common features such as “total energy envelope” and/or Fourier transform vector sequence, and computing the feature vectors sequentially across an entire audio track corresponding to a video frame, clip, or clip reel, at a desired sampling frequency, with each sample/“feature vector” associated with a time code according to sampling frequency, reading on Applicant’s claimed limitation).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

As for Claim 3, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of, “generating one or more partitions, wherein the one or more partitions are generated by: identifying a total number of the one or more sections”; “correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity”; and “correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity”.
Miller teaches, “generating one or more partitions, wherein the one or more partitions are generated by: 
identifying a total number of the one or more sections” (see col. 3, lines 30-39, 59-67; col. 4, lines 1-11; e.g., the reference of Miller teaches of partitioning the derived content template into a plurality of template elements, reading on Applicant’s identifying step);
“correlating the total number of the one or more sections to a first partition, wherein the first partition includes all of the one or more sections and is associated with a first granularity” (see col. 42, lines 58-67; col. 43, line 1; col. 44, lines 62-67; col. 45, lines 1-6; e.g., the cited portions of Miller teach of utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates.  The alignment procedures are utilized to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information, reading on Applicant’s claimed “correlating” limitation); and
“correlating each subsequent partition to one minus the previous number of the one or more sections, wherein each subsequent partition is associated with a corresponding granularity” (see col. 42, lines 58-67; col. 43, line 1, 18-22; col. 44, lines 62-67; col. 45, lines 1-6; e.g., the cited portions of Miller teach of utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates.  The alignment procedures are utilized to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information.  According to column 43, line 1, 18-22, the Pearson correlation coefficient appears to be performing the equivalent method for correlating partitions and sections).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)
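For illustration only, the Pearson correlation coefficient referenced above as one of the similarity metrics may be computed as sketched below.  The function name `pearson` and the use of plain numeric lists as feature vectors are assumptions made for this example; the sketch shows the standard definition of the coefficient and is not a reproduction of Miller's alignment procedure.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    derived = [0.1, 0.4, 0.9, 0.3]    # hypothetical derived-template features
    reference = [0.2, 0.5, 1.0, 0.4]  # hypothetical reference-template features
    print(pearson(derived, reference))
```

A coefficient near 1 indicates that the two feature vectors rise and fall together, which is the sense in which such a metric can serve as an alignment measure between a derived template element and a reference template.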

As for Claim 4, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of, “wherein displaying the one or more subdivisions of the one or more sections that include the particular subject to the user further comprises: determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions”; and “emphasizing one or more specific areas of the recording associated with the determined subdivisions to display to the user”.
Miller teaches, “wherein displaying the one or more subdivisions of the one or more sections that include the particular subject to the user further comprises:
determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions” (see col. 21, lines 1-60; e.g., the cited reference teaches of utilizing a customer interface, which helps to determine information to be displayed to a user based on a plurality of factors pertaining to the particular request being received.  Highlighting text, for example, is a method used in order to further “emphasize” areas derived from one or more media files and associated with synchronized derived content. Graphical representations of locations within the media file where portions of a clip may be found are presented to a customer through a customer interface); and
“emphasizing one or more specific areas of the recording associated with the determined subdivisions to display to the user” (see col. 21, lines 1-31; e.g., the cited reference teaches of highlighting text, for example, in order to further “emphasize” areas derived from one or more media files and associated with synchronized derived content. Graphical representations of locations within the media file where portions of a clip may be found are presented to a customer through a customer interface).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

As for Claim 5, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of,  “wherein determining that at least one of the one or more sections of the recording include at least one common feature further comprises: analyzing, using natural language processing, the one or more sections” and “identifying a similar acoustic within at least one of the one or more sections, wherein the similar acoustic is a sound identified to be above an identical acoustic threshold”.
Miller teaches, “wherein determining that at least one of the one or more sections of the recording include at least one common feature further comprises:
analyzing, using natural language processing, the one or more sections” (see col. 10, lines 45-64; col. 40, lines 3-28; e.g., utilizing natural language processing techniques by the synchronization engine for further analyzing generated templates of media files and the derived content); and 
“identifying a similar acoustic within at least one of the one or more sections, wherein the similar acoustic is a sound identified to be above an identical acoustic threshold” (see col. 11, lines 53-67; e.g., the cited portions teach of considering average acoustic power within the metadata of each matched region, as one of the plurality of factors under consideration in influencing individual match scores.  According to column 3, lines 59-67 through column 4, lines 1-11, during the alignment process, a plurality of feature vectors are considered in order to determine a similarity metric, whereas acoustic power is one of a plurality of features under consideration and monitored for transgressing at least one threshold value).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

As for Claim 6, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of,  “wherein determining that the particular subject is in at least one of the one or more sections further comprises: analyzing each of the one or more sections” and “tagging each of the one or more sections in which the particular subject is identified with an indicator”.
Miller teaches, “wherein determining that the particular subject is in at least one of the one or more sections further comprises:
analyzing each of the one or more sections” (see col. 21, lines 65-67; col. 22, lines 1-8; e.g., Column 21, lines 65-67 through column 22, lines 1-8 describe a process of receiving media file information from a user interface to a customer interface indicating a domain of the subject matter of the content included in the media file or a project to be associated with the media file from which the domain may be derived, equivalent to receiving a query from a user and identifying information such as domain and subject matter, reading on Applicant’s claim language.  Additionally, column 22, lines 9-35 teaches of a customer interface, equivalent in function to a processor, providing media file information in response to information received from the user at a user interface, with the received information being unique identifiers and attributes of these files); and
“tagging each of the one or more sections in which the particular subject is identified with an indicator” (see col. 21, lines 32-54; e.g., the cited reference teaches of a customer interface having the ability to incorporate information descriptive of the attributes of the one or more media files, utilizing semantic tagging associated with the media file and identifiers pertaining to a domain of the subject matter presented in the content).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

As for Claim 7, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of,  “wherein identifying the particular subject in each of the one or more sections that the particular subject appears further comprises: accessing a subject repository, wherein the subject repository includes each image and textual representation of dialogue of the one or more sections” and “examining the subject repository for one or more images and textual representations of dialogue associated with the particular subject”.
Miller teaches, “wherein identifying the particular subject in each of the one or more sections that the particular subject appears further comprises:
accessing a subject repository, wherein the subject repository includes each image and textual representation of dialogue of the one or more sections” (see col. 21, lines 65-67; col. 22, lines 1-8; e.g., the cited portions teach of accessing media file information exchanged between a user interface to a customer interface indicating a domain of the subject matter of the content included in the media file or a project to be associated with the media file from which the domain may be derived, equivalent to receiving a query from a user and identifying information such as domain and subject matter, reading on Applicant’s claim language.  Additionally, column 22, lines 9-35 teaches of a customer interface, equivalent in function to a processor, providing media file information in response to information received from the user at a user interface, with the received information being unique identifiers and attributes of these files.  Within previous text, column 9, lines 40-55 provide teachings into the access and management of a transcription job market within the transcription system by at least a market engine.  The market engine exchanges information with a plurality of components such as the customer interface, synchronization engine, market data storage and media file storage.  The media file storage contains audio and video content); and
“examining the subject repository for one or more images and textual representations of dialogue associated with the particular subject” (see col. 9, lines 40-67; col. 10, lines 1-19; e.g., within previous text, column 9, lines 40-55 provide teachings into the access and management of a transcription job market within the transcription system by at least a market engine.  The market engine exchanges information with a plurality of components such as the customer interface, synchronization engine, market data storage and media file storage.  The media file storage contains audio and video content).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

As for Claim 8, the reference of Yeo provides for the extraction of audiovisual features from digital components, and the Shetty reference teaches of providing for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.
The references of Yeo and Shetty do not explicitly recite the limitations of, “wherein the identifying one or more subjects in the query comprises performing natural language processing, the method further comprising: automatically identifying all of the subdivisions with at least one subject”; “scoring the subdivisions according to the number of times the subject is identified in a subdivision”; “comparing each score to a subject identification threshold”; and “generating a new recording of the subdivisions that have a score exceeding the subject identification threshold”.
Miller teaches, “wherein the identifying one or more subjects in the query comprises performing natural language processing, the method further comprising:
automatically identifying all of the subdivisions with at least one subject” (see col. 26, lines 29-66; e.g., the reference of Miller teaches of utilizing an editor interface in order to provide a “preview screen” to a user which includes the media file content, draft transcription information associated with the media file, as well as job information for processing a transcription job by a user, and subject matter, in order to better match editors with transcription jobs and improve transcription quality); 
“scoring the subdivisions according to the number of times the subject is identified in a subdivision” (see col. 4, lines 24-38; col. 11, lines 12-60; e.g., the cited portions teach of at least the synchronization engine generating a confidence document, which includes confidence scores associated with each stage of the synchronization process, such as scores pertaining to the metadata associated with each matched region and each unmatched region and scores pertaining to a listing of matched time regions or caption frames in the output);
“comparing each score to a subject identification threshold” (see col. 12, lines 1-26; e.g., the cited paragraph teaches the process of comparing one or more of a plurality of derived scores to a determined first and second threshold value, with the determined score being one of a plurality of feature values);
“generating a new recording of the subdivisions that have a score exceeding the subject identification threshold” (see col. 12, lines 1-26; e.g., the cited paragraph teaches the process of comparing one or more of a plurality of derived scores to a determined first and second threshold value, with the determined score being one of a plurality of feature values.  As stated previously, column 3, lines 59-67 through column 4, lines 1-11 teach that during the alignment process, a plurality of feature vectors are considered in order to determine a similarity metric, whereas acoustic power is one of a plurality of features under consideration and monitored for transgressing at least one threshold value).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

Claims 10 & 13-16 amount to a system comprising instructions that, when executed by one or more processors, perform the method of Claims 2 & 5-8, respectively.  Accordingly, Claims 10 & 13-16 are rejected for substantially the same reasons as presented above for Claims 2 & 5-8 and based on the references’ disclosure of the necessary supporting hardware and software (Miller; see col. 7, lines 10-64; e.g., method for implementation integrating hardware and software components).

As for Claim 17, Yeo teaches, A computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a query module to cause the query module to perform a method, the method comprising:
receiving, from a user, a query input into a query module (see pp. [0003], [0029-0030]; e.g., the reference of Yeo provides for the extraction of audiovisual features from digital components and teaches of utilizing a recognition engine, which functions in an equivalent fashion as Applicant’s amended “query module”, by identifying text labels associated with videos, images and multimedia elements that are query terms input by a user, reading on the amended limitation);
identifying, by the query module, one or more subjects in the query, the one or more subjects including a particular subject (see pp. [0003], [0028-0030]; e.g., the recognition engine of Yeo can receive a first request and retrieve image data for each of a plurality of images, if the input query is one or more images, for example.  Candidate images from a plurality of images can be selected by determining matches of image features between the first query image and the candidate images.  According to paragraphs [0028-0030], a user issues a request to a data processing system for one or more digital components associated with one or more digital component multimedia elements, where features of the one or more digital component multimedia elements are identified and associated with one or more digital component keywords and one or more text labels.  Paragraph [0158] further teaches that features, such as image features, can be identification of the subject matter contained within the images); and
identifying one or more sections of a recording relating to the particular subject of the query utilizing, by the query module, a natural language processing system or an image processing system (see pp. [0037-0040]; e.g., the reference of Yeo teaches of utilizing an interface with at least one service provider natural language processor component and a service provider interface to facilitate back-and-forth real-time voice and audio based conversation/session between computing devices.  A pre-processor can be configured to detect a keyword and perform an action based on the keyword, thus, identifying a section of an audio recording submitted by an end user as one or more voice queries/audio input and acting on that recognized keyword.  The pre-processor can filter out one or more terms or modify terms prior to submitting for further processing and convert analog audio signals into digital audio signals).
The reference of Yeo does not appear to explicitly recite the limitations of, “wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject”,  “generating condensed snippets of the recording by grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module”; “determining that the particular subject is in at least one of the one or more sections”; and “displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user, wherein the portions that provide the context surrounding the particular subject are emphasized”.
The reference of Shetty recites the limitation of, “wherein the one or more sections of the recording relating to the particular subject of the query include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject” (see pp. [0007-0011], [0061]; e.g., the reference of Shetty serves as an enhancement to the teachings of the Yeo reference, and provides for a video hosting service having the ability to identify segments within video based on the features of the video, where each segment identifies a portion of consecutive frames of the video that are to be summarized together.  Segments of frames of video are considered equivalent to Applicant’s “one or more sections of the recording”.  Video may be segmented in multiple different ways by the various segment sets, where a representative frame for each of the segments of each segment set is determined, increasing the likelihood that the representative frames capture alternative portions of the video {i.e. considered equivalent to Applicant’s “portions of the recording that provide context surrounding the particular subject”}.  The request to summarize the video may be based on a search query associated with the request, and the video hosting system identifies segments in a “segment table” that are relevant to the request by comparing the semantic concepts {i.e. particular subjects} of the segments with the semantic concepts associated with the request {i.e. considered equivalent to Applicant’s “relating to the particular subject of the query include portions of the recording that reference the particular subject”}. Semantic concepts associated with the request are determined by analysis of a search query, user interest information, or by identifying semantic concepts associated with metadata of the video, as the associated metadata can provide additional context); 
“grouping the one or more sections into one or more subdivisions relating to the particular subject of the query using the query module” (see pp. [0010-0011], [0064]; e.g., paragraphs [0010-0011] refer to the determination of representative segments amongst a plurality of relevant segments of one or more videos based on relevance scores determined on the match between the relevant segment and semantic concepts associated with the query/request.  The user is presented with relevant segments/relevant frames while the video plays, adjacent to one another. According to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements);
“determining that the particular subject is in at least one of the one or more sections” (see pp. [0061]; e.g., the reference of Shetty teaches of the selection of “representative segments” determined to be relevant to the received request, where the segments are scored and selected based on relevance to the video metadata and the user’s context {i.e. the user search query or user interests}.  Segments relevant to the request are scored based on a match between the segment and the semantic concepts associated with the query, reading on Applicant’s claimed limitation, as subject matter of the received request/query is considered when determining and scoring relevant segments.  Representative frames associated with the selected representative segments can be determined from the segment table); and
“displaying the condensed snippets that display the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user, wherein the portions that provide the context surrounding the particular subject are emphasized” (see pp. [0061-0062]; e.g., according to at least paragraphs [0061-0062], a “video summary module” is utilized for generating a video summary using the representative frames for the selected representative segments, where the video summary chronologically combines the representative frames and may present a series of the representative frames to the user in a “storyboard”, for example, where the user may determine whether or not to view the entire video.  A “video preview interface” is then utilized for providing to the client device representative frames of video for browsing, and determining whether to view the video in full based on the video preview.  Additionally, and according to at least paragraph [0064], another portion of the video preview interface, known as the “scene preview”, is displayed to the user, and is shown in addition to the relevant videos or may be a separate interface/display for displaying a thumbnail of the relevant search results.  The default thumbnail is replaced with a representative frame for each video having the highest relevance score.  The scene preview presents each video summarized by the representative frame that best summarizes the video relative to the search query entered by the user. As clearly stated within the cited paragraph [0064], “In addition, while the scene preview 630 is shown here as a portion of the video preview interface 600, in this embodiment an interface element 650 permits a user to view additional videos summarized by representative frames. This interface element 650 provides the user with additional search results that also have default thumbnails replaced with query- or user-specific representative frames”, thus, providing results with additional context pertaining to the request and its conceptual elements).
The combined Yeo and Shetty references are considered analogous art for being within the same field of endeavor, which is presenting representative video summaries to a user.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the segmentation and presentation of relevant video segments based on conceptual elements of the received request, as taught by Shetty, with the Yeo reference, because selection of a preview of longer form video fails to accurately represent the full content of the video and a user is not able to quickly distinguish whether a particular video has the desired content without watching the video itself. (Shetty; [0005])
The references of Yeo and Shetty do not explicitly recite the limitation of, “determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions”.
The reference of Miller recites the limitation of, “determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions” (see Fig. 12; see col. 21, lines 13-60; e.g., the cited reference teaches of utilizing a customer interface, which helps to determine information to be displayed to a user based on a plurality of factors pertaining to the particular request being received.  As stated previously, highlighting text, for example, is a function used in order to further “emphasize” areas derived from one or more media files and associated with synchronized derived content. Graphical representations of locations within the media file where portions of a clip may be found are presented to a customer through a customer interface, as illustrated within the cited Figure 12.  The customer interface provides a user interface to the customer displaying grouped media file information such as one or more media files, descriptive attributes of the one or more media files, annotations, semantic tagging, and advertising related to the one or more media files, amongst other potential groupings of information.  At least column 42, lines 58-67 through column 43, line 1 and column 44, lines 62-67 through column 45, lines 1-6 provide teachings for utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information.  One or more alignment procedures can be utilized, such as a “Partition-and-Place Procedure” and/or “Seed-and-Grow Procedure” amongst a plurality of procedures.  The “Partition-and-Place Procedure” causes the synchronization engine to divide the derived content template into a plurality of template elements having constant length {i.e. 100 video frames} or variable length, with length defined by a configurable parameter selected to balance execution speed with matching accuracy, as discussed within column 44, lines 21-33 for further elaboration, providing adjustable levels of matching accuracy for the presentation of results pertaining to the retrieval of content).
The combined Yeo, Shetty and Miller are considered analogous art for being within the same field of endeavor, which is finding relationships between multiple portions of content and the extraction of features from input requests.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the grouping and displaying of content by subject matter/topic information extracted from user input, as taught by Miller, with Shetty and Yeo in order to create synchronized content derived from reference content. (Miller; col. 2, lines 60-66)

Claims 18 & 19 amount to a computer program product comprising a non-transitory computer readable storage medium comprising instructions that, when executed by one or more processors, perform the method of Claims 2 & 3, respectively.  Accordingly, Claims 18 & 19 are rejected for substantially the same reasons as presented above for Claims 2 & 3 and based on the references’ disclosure of the necessary supporting hardware and software (Miller; see col. 7, lines 10-64; e.g., method for implementation integrating hardware and software components).


Response to Arguments
Applicant's arguments and amendments, with respect to the rejection of Claims 1-10 & 13-19 under 35 USC 103 have been fully considered and are persuasive in part, as the Tesch et al reference has been withdrawn from consideration.  The Yeo and Miller et al references have been maintained for their applicable teachings in view of Applicant’s claim language, with updated rationale provided within this communication above. 
Upon further consideration and in direct response to Applicant’s arguments, a new ground(s) of rejection for claims 1-10 & 13-19 is made in view of Shetty et al (USPG Pub No. 20160070962A1; Shetty hereinafter).

Applicant's arguments and amendments, with respect to the applied 35 USC 101 rejection have been fully considered, and continue to be non-persuasive. 
As reiterated from the previous communication, Examiner contends that the claimed invention continues to be directed to an abstract idea without significantly more and contains limitations that can practically be performed in the human mind, and as such, under the broadest reasonable interpretation they amount to a mental process. Applicant's proposed amended language, which modifies the claim to state “displaying the condensed snippets that display...”, fails to provide any technologically significant improvement. The “...displaying the one or more subdivisions of the one or more sections that include portions of the recording that reference the particular subject and portions of the recording that provide context surrounding the particular subject on a graphical user interface to the user” limitation merely provides further classification of data {i.e. “portions...that reference the particular subject”/“portions...that provide context”} for presentation, using an extra-solution activity for the displaying of content. The mere presentation of data does not integrate an abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Within Applicant’s “identifying one or more sections...” limitation of Independent Claim 1, there is a reference to “...the query utilizing...a natural language processing system or an image processing system”, which merely makes note of the general use of one or more technological processes without providing any level of specificity into the utilization of the one or more aforementioned components. It would be helpful to add further specificity to the use of the “natural language processing system” and/or “image processing system” in order to better teach the improvement in a technological process or its functionality. Further expounding of the “grouping the one or more sections into one or more subdivisions relating to the particular subject...” limitation, where disclosure of one or more utilized algorithms/heuristics is provided, may be helpful in the realization of the abstract idea into a practical application. At least Independent Claim 1 continues to merely provide for the utilization of one or more well-understood, routine and conventionally used techniques for condensing and grouping snippets of a recording using mere computer components. As stated above, Independent Claims 9 & 17 each continue to be considered insignificant extra-solution activity incorporating well-understood, routine and conventional techniques for the mere gathering of data, similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016). 
As stated within the previous communication, and further in view of Applicant’s rebuttal concerning Examiner’s 101 analysis under Steps 2A (i)/(ii), Examiner maintains that the provided claim language continues to be directed towards a mental process that can be performed in the human mind. The steps of identifying one or more sections of a recording relating to a particular subject of the query and determining which of the one or more subdivisions to display to the user, for example, are mental steps of identifying subjects from a query and determining a course of action or steps to be taken before displaying information to one or more users. In performing the steps of “identifying”, “grouping”, “determining” and “correlating”, a human can mentally identify subjects of a query and associate those subjects with content of a recording in order to perform the further steps of determining and identifying. Examiner contends that the incorporation of generic computer/hardware components conventionally used for the execution of these mental steps does not provide additional elements beyond the judicial exception, nor does it provide an improvement in the field of image and acoustical analysis, as insignificant extra-solution activity that utilizes conventional techniques does not provide significantly more than the judicial exception as a whole. The additional elements recited within Applicant’s amended claim language, which have been identified and evaluated individually and in combination, fail to integrate the judicial exception into a practical application, and the claims are therefore directed to ineligible subject matter.


With respect to Applicant’s argument that:

“...Miller is silent about determining which of the one or more subdivisions to display to the user based on a selected granularity of the partitions.”


Examiner is not persuaded, and has maintained that the Miller et al reference, in combination with Yeo et al and the now cited Shetty et al, continues to read on Applicant’s claimed limitations.  As stated within at least the previous communication, at least column 42, lines 58-67 through column 43, line 1 and column 44, lines 62-67 through column 45, lines 1-6 provide teachings for utilizing one or more similarity metrics such as correlation, coefficients, distance measure, etc., as alignment measures which align the elements of the derived content template with one or more reference templates to identify portions of the derived content template that match portions of the reference content and associate these matched portions into a map of alignment information.  One or more alignment procedures can be utilized, such as a “Partition-and-Place Procedure” and/or “Seed-and-Grow Procedure” amongst a plurality of procedures, where at least the “Seed-and-Grow Procedure” allows a “synchronization engine” to expand one or more template elements/“seeds” in response to finding a match for the template element in the reference template.  Matches are identified using the distance between one or more seeds and a subset of reference templates having a predetermined relationship, subject to a configurable threshold value; the matches of “seeds” to be returned are therefore determined according to a threshold-based correlation between one or more of a subset of reference templates having a plurality of “seeds”.  The “Partition-and-Place Procedure” causes the synchronization engine to divide the derived content template into a plurality of template elements having constant length {i.e. 100 video frames} or variable length, with the length defined by a configurable parameter selected to balance execution speed with matching accuracy, as further elaborated within column 44, lines 21-33.  
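For illustration only, the “Partition-and-Place Procedure” and “Seed-and-Grow Procedure” summarized above can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not code taken from Miller: the function names, the mean-absolute-difference distance, the representation of templates as lists of numbers, and the use of a single threshold for both placement and growth are hypothetical choices made here for clarity.

```python
from typing import List, Optional, Tuple


def partition(template: List[float], length: int) -> List[Tuple[int, List[float]]]:
    """Split a template into constant-length elements, keeping each start offset."""
    return [(i, template[i:i + length]) for i in range(0, len(template), length)]


def distance(a: List[float], b: List[float]) -> float:
    """Hypothetical distance measure: mean absolute difference of two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def place(element: List[float], reference: List[float], threshold: float) -> Optional[int]:
    """Return the reference offset with the smallest distance, if within the threshold."""
    n = len(element)
    best_offset, best_dist = None, None
    for offset in range(len(reference) - n + 1):
        d = distance(element, reference[offset:offset + n])
        if d <= threshold and (best_dist is None or d < best_dist):
            best_offset, best_dist = offset, d
    return best_offset


def grow(template: List[float], reference: List[float], t_start: int,
         r_start: int, length: int, threshold: float) -> Tuple[int, int, int]:
    """Expand a matched seed in both directions while neighbouring samples still match.

    Returns (template_start, template_end_exclusive, reference_start).
    """
    t_lo, r_lo = t_start, r_start
    t_hi, r_hi = t_start + length, r_start + length
    while t_lo > 0 and r_lo > 0 and abs(template[t_lo - 1] - reference[r_lo - 1]) <= threshold:
        t_lo, r_lo = t_lo - 1, r_lo - 1
    while (t_hi < len(template) and r_hi < len(reference)
           and abs(template[t_hi] - reference[r_hi]) <= threshold):
        t_hi, r_hi = t_hi + 1, r_hi + 1
    return (t_lo, t_hi, r_lo)


def align(template: List[float], reference: List[float],
          length: int, threshold: float) -> List[Tuple[int, int, int]]:
    """Partition the template, place each element as a seed, then grow each placed seed."""
    alignment: List[Tuple[int, int, int]] = []
    for t_start, element in partition(template, length):
        if len(element) < length:
            continue  # skip a short trailing element
        r_start = place(element, reference, threshold)
        if r_start is not None:
            match = grow(template, reference, t_start, r_start, length, threshold)
            if match not in alignment:  # seeds that grow into the same span are merged
                alignment.append(match)
    return alignment
```

In this sketch the `length` argument plays the role of the configurable element-length parameter: shorter elements produce more seeds to place (slower, finer matching), while longer elements produce fewer seeds (faster, coarser matching).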
According to earlier text of Miller which has been previously cited, column 21, lines 13-60 and corresponding Figure 12 teach that customers are given access to edit information associated with derived content, such as a matched “seed” element determined using alignment procedures such as the “Seed-and-Grow Procedure”, as Miller teaches utilizing a customer interface, which helps to determine information to be displayed to a user based on a plurality of factors pertaining to the particular request being received.  Graphical representations of the locations of clip portions within the media file may be determined and presented to a customer through the customer interface, as illustrated within the cited Figure 12.  The customer interface provides a user interface to the customer displaying grouped media file information such as one or more media files, descriptive attributes of the one or more media files, annotations, semantic tagging, and advertising related to the one or more media files, amongst other potential groupings of information.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure.
**Takahashi et al (US Patent No. 7627823B2) teaches a video information editing method and editing device.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHEEM HOFFLER whose telephone number is (571)270-1036. The examiner can normally be reached Monday-Friday: 10:00am-2:00pm; 6:00pm-10:00pm w/ flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached on (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TAMARA T KYLE/
Supervisory Patent Examiner, Art Unit 2156
/RAHEEM HOFFLER/
Examiner
Art Unit 2156

								5/2/2022