DETAILED ACTION
	This Office Action is in response to an original application filed 02/25/2020.
	Claims 1-20 are pending.
	Claims 1, 8 and 15 are independent claims.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The disclosure is objected to because of the following informalities:
p. 2, [0008], line 6: “generated” should be “generate”;
p. 18, [0042], line 11: “operating system 420” should be “operating system 426”.

Appropriate correction is required.

Drawings
The drawings are objected to because:
Figure 1 includes two items labeled 104A. The Examiner strongly recommends amending Figure 1 to better visually distinguish between the paper 102 and the transcript 104. Presently, the only visible distinctions between the paper 102 and the transcript 104 are a thin horizontal line and the fact that the paper 102 appears to have headings (e.g., Title: and Paper:) while the transcript 104 has a single heading (e.g., Talk transcript:).

Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended 

Allowable Subject Matter
Claims 4-5, 11-12, and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 6, 8-10, 13 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Malmaud, J. et al. (hereinafter Malmaud, “What’s Cookin’? Interpreting Cooking Videos Using Text, Speech and Vision,” © 03/13/2015, arXiv, 10 pages) in view of Conroy et al. (hereinafter Conroy, U.S. Patent Application Publication No. 2002/0174149 A1, filed 04/26/2002, published 11/21/2002), and in further view of Tsuchida et al. (hereinafter Tsuchida, U.S. Patent Application Publication No. 2012/0030157 A1, filed 10/06/2011, published 02/02/2012).
In regard to independent claim 1, Malmaud teaches:
A method implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising:
collecting, at the computer system, a plurality of video and audio recordings of presentations of documents (at least Section 2.1 → Malmaud teaches a process whereby cooking videos with associated text descriptions of the contents of the videos were obtained);
collecting, at the computer system, a plurality of documents corresponding to the video and audio recordings (at least Section 2.1 → Malmaud teaches a 
converting, at the computer system, the plurality of video and audio recordings of presentations of documents into transcripts of the plurality of presentations (at least Section 2.1 → Malmaud teaches a process whereby cooking videos having automatically produced English-language speech transcripts were obtained from YouTube. Here, the transcripts of the audio from the cooking videos were generated beforehand);
Malmaud fails to explicitly teach:
generating, at the computer system, a summary of each document by selecting a plurality of sentences from each document using the transcript of the that document; generating, at the computer system, a dataset comprising a plurality of the generated summaries.
Malmaud appears to focus on alignment rather than summarization.
However, Conroy teaches:
generating, at the computer system, a summary of each document by selecting a plurality of sentences from each document using the transcript of the that document (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary of a document using a Hidden Markov Model);
generating, at the computer system, a dataset comprising a plurality of the generated summaries (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary or summaries of a document(s) using a Hidden Markov Model).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Conroy with those of Malmaud as both inventions utilize Markov models to compute probabilities that particular sequences of texts will occur. Adding the teaching of Conroy provides a method to arrive at those sequences of text that are most likely to occur.
Malmaud and Conroy fail to explicitly teach:
training, at the computer system, a machine learning model using the generated dataset.
However, Tsuchida teaches:
training, at the computer system, a machine learning model using the generated dataset (at least Abstract; Figures 1-2, 6 → Tsuchida teaches a training data generation apparatus 2, which generates training data used for creating “characteristic expression” extraction rules that are used to subsequently identify and extract “characteristic expressions” from texts; the “characteristic expressions” are interpreted as a form of summary).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Tsuchida with those of Malmaud and Conroy as all three inventions are used to generate text summaries. Adding the teaching of Tsuchida provides a method to generate and apply training data to perform subsequent summary generation.

In regard to dependent claim 2, Malmaud fails to explicitly teach:
selecting a plurality of sentences comprises modeling the generative process using a hidden Markov model.
However, Conroy teaches:
selecting a plurality of sentences comprises modeling the generative process using a hidden Markov model (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary or summaries of a document(s) using a Hidden Markov Model).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Conroy with those of Malmaud as both inventions utilize Markov models to compute probabilities that particular sequences of texts will occur. Adding the teaching of Conroy provides a method to arrive at those sequences of text that are most likely to occur.

In regard to dependent claim 3, Malmaud fails to explicitly teach:
each hidden state of the hidden Markov model corresponds to a single sentence of the document and the sequence of spoken words from the transcripts correspond to the output sequence of the hidden Markov model.
However, Conroy teaches:
each hidden state of the hidden Markov model corresponds to a single sentence of the document and the sequence of spoken words from the transcripts correspond to the output sequence of the hidden Markov model (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary or summaries of a document(s) using a Hidden Markov Model. Document(s) are summarized by weighting the frequency of occurrence of each term in a sentence using the Hidden Markov Model and a Markov state space diagram having 2s+1 states, with s summary states and s+1 non-summary states).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Conroy with those of Malmaud as both inventions utilize Markov models to compute probabilities that particular sequences of texts will occur. Adding the teaching of Conroy provides a method to arrive at those sequences of text that are most likely to occur.

In regard to dependent claim 6, Malmaud fails to explicitly teach:
finding a most likely hidden state sequence using a Viterbi algorithm.
However, Conroy teaches:
finding a most likely hidden state sequence using a Viterbi algorithm (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary of a document using a Hidden Markov Model. In a fifth step in the process of generating summary sentence(s), Conroy teaches the computation of most likely set of states of, or path through, a Markov state space diagram. The Markov state space diagram may be traversed using known state space traversal methods such as forward-backward recursion and the Viterbi method (see col. 4, lines 56-61)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Conroy with those of Malmaud as both inventions utilize Markov models to compute probabilities that particular sequences of texts will occur. Adding the teaching of Conroy provides a method to arrive at those sequences of text that are most likely to occur.
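For illustration only, the Viterbi traversal referenced in the rejection of claim 6 can be sketched as follows. This is a generic, minimal sketch and is not taken from Conroy or from the application; the function name `viterbi`, the two-state toy model, and all probability values are hypothetical. It assumes a non-empty observation sequence and nonzero probabilities.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical toy model: two hidden states, two observable symbols.
states = ["s0", "s1"]
start_p = {"s0": 0.6, "s1": 0.4}
trans_p = {"s0": {"s0": 0.7, "s1": 0.3}, "s1": {"s0": 0.4, "s1": 0.6}}
emit_p = {"s0": {"a": 0.9, "b": 0.1}, "s1": {"a": 0.2, "b": 0.8}}
path = viterbi(["a", "a", "b"], states, start_p, trans_p, emit_p)
```

Under the claim 3 reading, each state would stand for one sentence of the document and each observation for one spoken word of the transcript; the toy symbols above only demonstrate the recursion.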

In regard to claims 8-10 and 13, claims 8-10 and 13 merely recite a system to carry out the method of claims 1-3 and 6, respectively. Thus, Malmaud, Conroy and Tsuchida teach every limitation of claims 8-10 and 13, and provide proper motivation, as indicated in the rejections of claims 1-3 and 6.

In regard to claims 15-17, claims 15-17 merely recite a computer program product storing instructions for the method of claims 1-3, respectively. Thus, Malmaud, Conroy and Tsuchida teach every limitation of claims 15-17, and provide proper motivation, as indicated in the rejections of claims 1-3.

Claims 7, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Malmaud in view of Conroy, and in further view of Tsuchida, and in further view of Pedersen et al. (hereinafter Pedersen, U.S. Patent No. 5,638,543, filed 06/03/1993, issued 06/10/1997).
In regard to dependent claim 7, Malmaud, Conroy and Tsuchida fail to explicitly teach:
each word in the transcript defines a time-step, and selecting a plurality of sentences further comprises scoring each sentence based on a number of time-steps in which each sentence appears and selecting top scoring sentences to appear in the summary up a predetermined summary length.
However, Pedersen teaches:
each word in the transcript defines a time-step, and selecting a plurality of sentences further comprises scoring each sentence based on a number of time-steps in which each sentence appears and selecting top scoring sentences to appear in the summary up a predetermined summary length (at least col. 3, line 21 through col. 4, line 25; Figures 2-3 → Pedersen teaches a method of automatically summarizing documents by extracting a first sentence in the document, scoring that sentence, then extracting the next sentence, scoring that sentence, and so on until all the sentences in the document have been processed. The scoring may be based on several criteria, but all manage to generate a summary of the document by selecting those sentences with the highest scores).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Pedersen with those of Malmaud, Conroy and Tsuchida as all of these inventions determine a summary from particular sequences of texts. Adding the teaching of Pedersen provides a method to arrive at a summary by scoring sequences of text (sentences) and picking those with the highest scores to be included in the summary.
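For illustration only, the limitation of claim 7 as quoted above (one time-step per transcript word, sentences scored by the number of time-steps in which they appear, top-scoring sentences selected up to a length limit) can be sketched as follows. This sketch is not taken from Pedersen or from the application. The function name `select_summary`, the word-count length budget, the tie-breaking by document position, and the sample data are all assumptions.

```python
from collections import Counter

def select_summary(sentences, state_path, max_words):
    """Pick top-scoring sentences without exceeding max_words in total.

    state_path holds one sentence index per transcript word (time-step).
    """
    # Score = number of time-steps aligned to each sentence.
    scores = Counter(state_path)
    # Highest score first; ties broken by earlier position in the document.
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    chosen, used = [], 0
    for i in ranked:
        n_words = len(sentences[i].split())
        if used + n_words > max_words:
            continue  # skip a sentence that would exceed the length budget
        chosen.append(i)
        used += n_words
    # Emit the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]

# Hypothetical data: three document sentences, six transcript words.
sentences = [
    "We propose a model.",
    "Prior work is limited.",
    "Results improve by a wide margin.",
]
state_path = [0, 0, 2, 2, 2, 1]
summary = select_summary(sentences, state_path, max_words=10)
```

Here sentence 2 scores 3, sentence 0 scores 2, and sentence 1 scores 1; with a ten-word budget the first two fit and the third is skipped.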

In regard to claim 14, claim 14 merely recites a system to carry out the method of claim 7. Thus, Malmaud, Conroy, Tsuchida and Pedersen teach every limitation of claim 14, and provide proper motivation, as indicated in the rejection of claim 7.

In regard to dependent claim 20, Malmaud fails to explicitly teach:
finding a most likely hidden state sequence using a Viterbi algorithm.
However, Conroy teaches:
finding a most likely hidden state sequence using a Viterbi algorithm (at least Abstract; col. 2, lines 29-61; col. 3, line 8 through col. 9, line 67; Figures 1-2 → Conroy teaches the generation of a summary of a document using a Hidden Markov Model. In a fifth step in the process of generating summary sentence(s), Conroy teaches the computation of most likely set of states of, or path through, a Markov state space diagram. The Markov state space diagram may be traversed using known state space traversal methods such as forward-backward recursion and the Viterbi method (see col. 4, lines 56-61)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Conroy with those of Malmaud as both inventions utilize Markov models to compute probabilities that particular sequences of texts will occur. Adding the teaching of Conroy provides a method to arrive at those sequences of text that are most likely to occur.
Malmaud, Conroy and Tsuchida fail to explicitly teach:
wherein each word in the transcript defines a time-step and selecting a plurality of sentences further comprises scoring each sentence based on a number of time-steps in which each sentence appears and selecting top scoring sentences to appear in the summary up a predetermined summary length.
However, Pedersen teaches:
wherein each word in the transcript defines a time-step and selecting a plurality of sentences further comprises scoring each sentence based on a number of time-steps in which each sentence appears and selecting top scoring sentences to appear in the summary up a predetermined summary length (at least col. 3, line 21 through col. 4, line 25; Figures 2-3 → Pedersen teaches a method of automatically summarizing documents by extracting a first sentence in the document, scoring that sentence, then extracting the next sentence, scoring that sentence, and so on until all the sentences in the document have been processed. The scoring may be based on several criteria, but all manage to generate a summary of the document by selecting those sentences with the highest scores).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Pedersen with those of Malmaud, Conroy and Tsuchida as all of these inventions determine a summary from particular sequences of texts. Adding the teaching of Pedersen provides a method to arrive at a summary by scoring sequences of text (sentences) and picking those with the highest scores to be included in the summary.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James H Blackwell whose telephone number is (571)272-4089.  The examiner can normally be reached on M-F 04:30AM - 12:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula, can be reached at 571-272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/James H. Blackwell/
09/10/2021

/CESAR B PAULA/
Supervisory Patent Examiner, Art Unit 2177