Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 02/03/2021 and 03/09/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Specification
The disclosure is objected to because of the following informalities:
In paragraph 0024, line 1, “used in in the” should read “used in the”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4, 5, 14, 15, 19 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 4, 14, and 19 recite the terms “the same size as the vocabulary” (Claim 4, lines 4-5; Claim 14, line 4; Claim 19, line 5), “the size of the corresponding vocabulary” (Claim 4, line 7; Claim 14, lines 6-7; Claim 
Claims 5, 15, and 20 recite the terms “the vocabulary of the second-layer LSTM structure” (Claim 5, lines 8 and 15; Claim 15, lines 9 and 15; Claim 20, lines 8 and 15) and “the vocabulary are detected” (Claim 5, lines 18-19; Claim 15, lines 18-19; Claim 20, lines 18-19). There is insufficient antecedent basis for these limitations in these claims, as there is no prior mention of the cited terms. To overcome this rejection, it is advised to change the initial mention of each term as follows within each mentioned claim: “a vocabulary of the second-layer LSTM structure” and “vocabulary are detected”.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 11, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma (Document ID: US-20190155877-A1) in view of Cheng (Document ID: “Neural Summarization by Extracting Sentences and Words.”).
Regarding Claims 1, 11 and 16, Sharma teaches an automatic text summarization (abstract and Paragraph 0025, lines 1-7) method, comprising: 
obtaining a character included in a target text sequentially (Fig 4 and Paragraph 0050, show text being inputted into the encoder), and decoding the character according to a first-layer long short-term memory (LSTM) structure inputted into a LSTM model sequentially to obtain a sequence composed of hidden states, wherein the LSTM model is a LSTM neural network (Fig 4 and Paragraph 0050, show first layer to be LSTM based encoder, whose hidden states are used by decoder for decoding); 
inputting the sequence composed of hidden states into a second-layer LSTM structure of the LSTM model and decoding the sequence composed of hidden states to obtain a word sequence of a summary (Fig 4; Paragraph 0047 and 0050, decoder is the second layer which is used to decode hidden states from encoder and select subsequent words for summary); 
Sharma also teaches obtaining a context vector and a probability distribution to get the final text summary (Fig 4 and Paragraphs 0052-0055). However, Sharma fails to teach feeding the word sequence back from the second LSTM layer to the first LSTM layer, and thus fails to specifically cover the following claimed limitations:
“inputting the word sequence of the summary into the first-layer LSTM structure of the LSTM model and decoding the word sequence of the summary to obtain an updated sequence composed of hidden states; obtaining a context vector corresponding to a contribution value of a hidden state of a decoder according to the contribution value of the hidden state of the decoder in the updated sequence composed of hidden states; and obtaining a probability distribution of a word in the updated sequence composed of hidden states according to the 
	Cheng does teach the claimed limitation of inputting the word sequence of the summary into the first-layer LSTM structure of the LSTM model (as seen in Figures 2 and 3, the output from the second layer, the sentence extractor, is fed back into the document encoder, which can be considered the first LSTM layer; also see Page 4, Col 2, Paragraph 5, lines 1-4) and decoding the word sequence of the summary to obtain an updated sequence composed of hidden states (Fig 3 and page 5, Section 4.2 “Sentence Extractor”, Paragraph 2, lines 6-13, show decoding being performed on the encoded document to get the hidden states); obtaining a context vector corresponding to a contribution value of a hidden state of a decoder according to the contribution value of the hidden state of the decoder in the updated sequence composed of hidden states (Fig 3; Page 1, col 2, paragraph 3, lines 1-10; Pages 5-6, section 4.3, paragraph 1; and eq 10-15 show the hierarchical attention architecture used to obtain a relevant content-based summary. Here, the context vector can be equated to Eq 14, which uses the hidden state to compute its corresponding values representing content/summary relevancy for summary word selection. The updating process of the presented hierarchical attention architecture, which includes the context vector, is evident considering the LSTM encoder and extractor layer feedback system shown in Figs 2 and 3. Also see page 5, column 2, paragraphs 1-2); and obtaining a probability distribution of a word in the updated sequence composed of hidden states according to the updated sequence composed of hidden states and the context vector, and outputting the most probable word in the probability distribution of the word as a summary of the target text (Fig 3; Page 1, col 2, paragraph 3, lines 1-10; Pages 5-6, section 4.3, paragraph 1; and eq 10-15 show the most probable word found using the probability value produced by the hierarchical attention architecture. 
The hidden state and context vector are the same as defined in the previous limitation. The use of updated sequence to find 
	Cheng is considered analogous to the claimed invention because it is also aimed toward text summarization using a Long Short-Term Memory (LSTM) unit. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sharma to incorporate the teaching of Cheng to include an updating sequence that feeds the output of the second LSTM layer back to the first LSTM layer. Using the summarization system presented by Cheng helps yield minimal information loss and provides the flexibility of applying neural attention for selecting salient sentences and words within a larger context (Cheng, Page 4, col 1, Paragraph 2, lines 11-14).
	As seen in the claim set, claims 1, 11, and 16 cover a similar scope of invention. However, claim 1 is a method claim, while claims 11 and 16 are a computer device claim and a computer-readable medium claim, respectively. The steps of method claim 1 correspond to the function of each claimed element in claims 11 and 16. Therefore, claims 11 and 16 are rejected under the same rationale as applied above to method claim 1. Furthermore, Sharma also teaches, regarding claim 11, a computer device comprising a memory, a processor (Paragraph 0019, fig 10) and a computer program stored in the memory and operated at the processor, characterized in that the processor executes the computer program (Paragraph 0087, fig 120). In addition, Sharma also teaches, regarding claim 16, a non-transitory computer-readable storage medium, .
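For illustration only, the context vector and word probability distribution discussed in the rejection of claims 1, 11, and 16 can be sketched as follows. This is a minimal sketch, not the method of the application or of either cited reference: dot-product scoring, the helper names, the toy vocabulary, and all numeric values are assumptions introduced here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def context_vector(encoder_states, decoder_state):
    # Assumption: the "contribution value" of each encoder hidden state is a
    # softmax-normalized dot product with the current decoder hidden state.
    weights = softmax([dot(h, decoder_state) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

def word_distribution(decoder_state, context, output_weights):
    # Project [decoder_state; context] onto the vocabulary, then normalize.
    features = decoder_state + context
    return softmax([dot(row, features) for row in output_weights])

# Hypothetical toy values.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, 2.0]
vocabulary = ["argentina", "wins", "<end>"]
output_weights = [
    [0.2, 0.1, 0.4, 0.3],
    [0.1, 0.0, 0.1, 0.1],
    [0.0, 0.1, 0.0, 0.2],
]

weights, context = context_vector(encoder_states, decoder_state)
probs = word_distribution(decoder_state, context, output_weights)
best_word = vocabulary[max(range(len(probs)), key=probs.__getitem__)]
```

The most probable entry of `probs` is selected as the output word, which mirrors the "outputting the most probable word" limitation at a toy scale.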
Claims 2, 5, 12, 15, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma (Document ID: US-20190155877-A1) in view of Cheng (Document ID: “Neural Summarization by Extracting Sentences and Words.”) in view of Maksak (Document ID: US 20180300295 A1).
	Regarding claims 2, 12, and 17, Sharma in view of Cheng teaches the automatic text summarization method as claimed in claim 1, the computer device as claimed in claim 11, and the non-transitory computer-readable storage medium as claimed in claim 16, carrying out the steps of sequentially obtaining the character included in the target text and decoding the character according to the LSTM structure inputted into the LSTM model sequentially to obtain the sequence composed of hidden states (Sharma, Fig 4 and Paragraph 0050, show text inputted into the first layer, an LSTM-based encoder, whose hidden states are used by the decoder to generate an output hidden state with a text sequence). However, Sharma in view of Cheng fails to specifically mention the following claimed limitations:
“ further comprising a step of putting a plurality of historical texts of a corpus into the first-layer LSTM structure and putting a text summary corresponding to the historical text into the second-layer LSTM structure for training to obtain the LSTM model, before carrying out the steps of sequentially obtaining the character included in the target text and decoding the character according to the LSTM structure inputted into the LSTM model sequentially to obtain the sequence composed of hidden states.”
Maksak does teach the claimed limitation of a step of putting a plurality of historical texts of a corpus (Paragraph 0070, lines 1-9) into the first-layer LSTM structure and putting a text summary corresponding to the historical text into the second-layer LSTM structure (Paragraph 0070; Paragraph 0054 and Fig 4 show the processing of input text, which is the same process used for historical text according to Paragraph 0070. Here, the encoder can be considered the first layer while the decoder can be considered the second layer) for training to obtain the LSTM model, before carrying out the steps of sequentially obtaining the character included in the target text and decoding the character according to the LSTM structure inputted into the LSTM model sequentially to obtain the sequence composed of hidden states (Paragraph 0069 mentions a training phase that uses historical text and occurs before the encoding and decoding process of the input text; also see Paragraphs 0025 and 0047).
Maksak is considered analogous to the claimed invention because it is also aimed toward text summarization using a neural network (Paragraph 0037). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sharma in view of Cheng to incorporate the teaching of Maksak to include a training step using historical text prior to the input text summarization step. The training step from Maksak can help improve the accuracy of the system (Paragraph 0070, lines 9-11).
As seen in the claim set, claims 2, 12, and 17 cover a similar scope of invention. However, claim 2 is a method claim, while claims 12 and 17 are a computer device claim and a computer-readable medium claim, respectively. The steps of method claim 2 correspond to the function of each claimed element in claims 12 and 17. Therefore, claims 12 and 17 are rejected under the same rationale as applied above to method claim 2.
	Regarding claims 5, 15, and 20, Sharma in view of Cheng in view of Maksak teaches the automatic text summarization method as claimed in claim 2, the computer device as claimed in claim 12, and the non-transitory computer-readable storage medium as claimed in claim 17, wherein the step of inputting the sequence composed of hidden states into the second-layer LSTM structure of the LSTM model for decoding to obtain the word sequence of summary (Sharma, Fig 4 and Paragraph 0047) further comprises the steps of: 
obtaining the most probable word in the sequence composed of hidden states (Sharma, Paragraphs 0043-0044, show a selection probability being used to select the most preferred word), using the most probable word in the sequence composed of hidden states as an initial word in the word sequence of the summary (Sharma, Fig 4 and Paragraphs 0044 and 0046-0047, mention the most preferred word being added to the target summary; also see Paragraph 0050); 
inputting each word in the initial word into the second-layer LSTM structure (Sharma, as seen in Figure 4, the initially generated partial summary is inputted into the second layer, which is the decoder), and combining each word in the initial word with each word in the vocabulary of the second-layer LSTM structure to obtain a combined sequence (Sharma, Fig 4, Paragraphs 0044-0047, and Paragraphs 0052-0053, the vocabulary is considered, as seen in Fig 4, alongside the second layer, the decoder, to obtain the summary), and then obtaining and using the most probable word in the combined sequence as the sequence composed of hidden states (Sharma, Figs 4 and 9; Paragraphs 0044-0047 and Paragraphs 0050-0055, show the selected probable word being used to generate the target summary); and 
repeating the steps of inputting each word in the sequence composed of hidden states into the second-layer LSTM structure, and combining each word in the sequence composed of hidden states with each word in the vocabulary of the second-layer LSTM structure to obtain the combined sequence, and then obtaining and using the most probable word in the combined sequence as the sequence composed of hidden states (the steps being repeated are the same steps mentioned earlier within the claim; thus they are rejected under the same rationale. Sharma, Fig 4 and Fig 9; Paragraph 0047 mentions feedback of the selected preferred word to the generation model to repeat the steps of getting the most probable subsequent word for the target summary; also see paragraphs 0050-0055), until each word in the sequence composed of hidden states and an end character in the vocabulary are detected, and using the sequence composed of hidden states as the word sequence of the summary (Sharma, Fig 4, shows the partial summary being inputted as feedback and the final word “Argentina” being the final hidden state used by the decoder to get the target word mentioned in paragraph 0050, marking the end of the process with the summarized text).
As seen in the claim set, claims 5, 15, and 20 cover a similar scope of invention. However, claim 5 is a method claim, while claims 15 and 20 are a computer device claim and a computer-readable medium claim, respectively. The steps of method claim 5 correspond to the function of each claimed element in claims 15 and 20. Therefore, claims 15 and 20 are rejected under the same rationale as applied above to method claim 5.
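The repeated selection of the most probable word until an end character is detected, as discussed for claims 5, 15, and 20, can be sketched as a greedy decoding loop. This is an illustrative sketch under stated assumptions: `greedy_decode`, `toy_next_word_probs`, the `<end>` token, and the toy vocabulary are hypothetical names introduced here, and the toy scoring function merely stands in for a trained decoder.

```python
def greedy_decode(next_word_probs, vocabulary, end_token, max_steps=20):
    # Repeatedly feed the partial summary back to the decoder and keep the
    # most probable next word, stopping once the end token is selected.
    summary = []
    for _ in range(max_steps):
        probs = next_word_probs(summary)
        best = max(range(len(vocabulary)), key=lambda k: probs[k])
        word = vocabulary[best]
        if word == end_token:
            break
        summary.append(word)
    return summary

VOCAB = ["argentina", "wins", "match", "<end>"]

def toy_next_word_probs(partial_summary):
    # Hypothetical stand-in for the second-layer decoder: it favors the
    # vocabulary entry whose index equals the current summary length.
    target = min(len(partial_summary), len(VOCAB) - 1)
    return [0.7 if k == target else 0.1 for k in range(len(VOCAB))]

result = greedy_decode(toy_next_word_probs, VOCAB, "<end>")
```

The `max_steps` bound is a design choice of this sketch so that decoding terminates even if the end token is never the most probable word.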
Claims 3, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma (Document ID: US-20190155877-A1) in view of Cheng (Document ID: “Neural Summarization by Extracting Sentences and Words.”) in view of Zhou (Document ID: “Selective Encoding for Abstractive Sentence Summarization”)
Regarding claims 3, 13, and 18, Sharma in view of Cheng teaches the automatic text summarization method as claimed in claim 1, the computer device as claimed in claim 11, and the non-
wherein the LSTM model is a gated recurrent unit, and the gated recurrent unit has a model with the conditions of:

[Equation image: media_image1.png]

wherein, Wz, Wr, and W are trained weight parameter values, xt is an input, ht-1 is a hidden state, zt is an updated state, rt is a reset signal, h̃t is a new memory corresponding to the hidden state ht-1, ht is an output, σ() is a sigmoid function, and tanh() is a hyperbolic tangent function.
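The claimed conditions appear in this text only as an image (media_image1.png) and are not reproduced here. For orientation only, one commonly used formulation of a gated recurrent unit that is consistent with the symbols listed above is sketched below. The exact form, in particular the bracket concatenation and which of the two terms in the last line is weighted by z_t, is an assumption and may differ from the claimed equations.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \\
r_t &= \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \\
\tilde{h}_t &= \tanh\left(W \cdot [r_t \odot h_{t-1}, x_t]\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Here \odot denotes element-wise multiplication and [a, b] denotes concatenation of vectors a and b.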
	Zhou does teach the claimed limitation of a gated recurrent unit, and the gated recurrent unit has a model with the conditions (Page 3, Col 2, paragraph 1-3 and Eq 1-4)

[Equation image: media_image1.png]

As seen in Equations 1-4 in Zhou, the gated recurrent unit conditions are identical to those presented above in the claimed limitation. Zhou is considered analogous to the claimed invention because it is also aimed toward text summarization. Therefore, it would have been obvious to one skilled in the art to have each term mentioned in equations 1-4 by Zhou be equated to the terminology mentioned in the claim; wherein, Wz, Wr, and W are trained weight parameter values, xt is an input, ht-1 is a hidden state, zt is an updated state, rt is a reset signal, h̃t is a new memory corresponding to the hidden state ht-1, ht is an output, σ() is a sigmoid function, and tanh() is a hyperbolic tangent function. Furthermore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sharma in view of Cheng to incorporate the teaching of Zhou to use a gated recurrent unit for text summarization. Implementation of the gated network shown in Zhou can help improve the encoding effectiveness and release the burden of the decoder (Page 2, Column 1, Paragraph 2, lines 11-16).
As seen in the claim set, claims 3, 13, and 18 cover a similar scope of invention. However, claim 3 is a method claim, while claims 13 and 18 are a computer device claim and a computer-readable medium claim, respectively. The steps of method claim 3 correspond to the function of each claimed element in claims 13 and 18. Therefore, claims 13 and 18 are rejected under the same rationale as applied above to method claim 3.
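As a rough illustration of how a gated recurrent unit of the kind discussed for claims 3, 13, and 18 updates its hidden state, a single step is sketched below in plain Python. This is not taken from the application or from Zhou: it assumes the commonly used update in which the output interpolates between the previous hidden state and the new memory, weighted by the update gate, and the function name `gru_step` and all weight values are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def gru_step(x_t, h_prev, W_z, W_r, W):
    # Gates are computed from the concatenation [h_prev, x_t].
    concat = h_prev + x_t
    z_t = [sigmoid(a) for a in matvec(W_z, concat)]   # update gate
    r_t = [sigmoid(a) for a in matvec(W_r, concat)]   # reset gate
    # New memory uses the reset-gated previous hidden state.
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t
    h_tilde = [math.tanh(a) for a in matvec(W, gated)]
    # Assumed convention: h_t = (1 - z_t) * h_prev + z_t * h_tilde.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_tilde)]

# Hypothetical toy weights: hidden size 2, input size 1.
W_z = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
W_r = [[0.2, 0.1, -0.3], [-0.4, 0.0, 0.2]]
W = [[0.3, -0.1, 0.5], [0.1, 0.2, -0.2]]
h_next = gru_step([1.0], [0.4, -0.2], W_z, W_r, W)
```

With all weights set to zero, both gates evaluate to 0.5 and the new memory to 0, so the step simply halves the previous hidden state, which gives a quick sanity check of the update rule.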
Claims 4, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma (Document ID: US-20190155877-A1) in view of Cheng (Document ID: “Neural Summarization by Extracting Sentences and Words.”) in view of Zhou (Document ID: “Selective Encoding for Abstractive Sentence Summarization”) in view of Ma (Document ID: “Word Embedding Attention Network: Generating Words by Querying Distributed Word Representations for Paraphrase Generation”) in view of Maksak (Document ID: US 20180300295 A1).
Regarding claims 4, 14, and 19, Sharma in view of Cheng in view of Zhou teaches the automatic text summarization method as claimed in claim 3, the computer device as claimed in claim 13, and the non-transitory computer-readable storage medium as claimed in claim 18, wherein the sequence composed of hidden states is inputted into the second-layer LSTM structure of the LSTM model for decoding to obtain the word sequence of the summary (Sharma, Fig 4; Paragraph 0047 and 0050, 
“the word sequence of the summary is a polynomial distribution layer having the same size as the vocabulary, and a vector yt ∈ R^K is outputted; wherein the nth dimension of yt represents the probability of generating the kth word, t is a value of a positive integer, and K is the size of the corresponding vocabulary of the historical text.”
Ma does teach the claimed limitation of the word sequence of the summary is a polynomial distribution layer having the same size as the vocabulary (Figure 1 shows the word sequence for the summary generated in the “Query” block, which are the candidate words. These candidate words can be associated with the term “vocabulary” mentioned in the claim, which thus fulfills the claimed limitation of “polynomial distribution layer having the same size as the vocabulary”. An embodiment of the polynomial distribution can be seen in Figure 1, where the decoded hidden state St is associated with multiple candidate words’ “key” and “value”; see Section 2.3: “Word Generation by Querying Word Embedding” and equations 5 and 6 for more on the polynomial distribution structure; also see Page 2, Column 2, paragraph 3, lines 3-5 and Page 2), and a vector yt ∈ R^K is outputted (equation 6); wherein the nth dimension of yt represents the probability of generating the kth word (as seen in Section 2.3, a score function is used to get the most probable word as output. Figure 1 can be seen as an example where the ith word is chosen as output; also see Equations 5 and 6), t is a value of a positive integer (as seen in Fig 1, the corresponding hidden state, St, of the output yt has t beginning from 1. Therefore, it is implied that t, which is the time step, is a positive integer; also see Section 2.2 “Encoder and Decoder”, paragraphs 2-3 and eq 4-6), and K is the size of the corresponding vocabulary (Page 3, Column 1, Paragraph 3, here it is seen that “i” is the size of the candidate words used as vocabulary). Ma is considered analogous to the claimed invention because it is also aimed toward text summarization using a decoder and encoder network. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing 
Ma fails to teach that the vocabulary corresponds to historical texts. 
Maksak does teach a text summarization model (Paragraph 0037) that incorporates keyword probabilities from the historical text for prediction and training of the encoder and decoder (Paragraphs 0073-0077; figs 4-6). Maksak is considered analogous to the claimed invention because it is also aimed toward text summarization using a decoder and encoder network (Paragraph 0037). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sharma in view of Cheng in view of Zhou in view of Ma to incorporate the teaching of Maksak to use keywords from historical text to predict possible words. The use of historical text from Maksak can help improve the accuracy of the system (Paragraph 0070, lines 9-11).
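To make the dimensions in the limitation of claims 4, 14, and 19 concrete, the sketch below builds a vocabulary of size K from a set of historical texts and produces an output vector yt of length K whose kth entry is the probability of the kth word. It is illustrative only: the helper names, the `<end>` token, the sample texts, and the logits are hypothetical, and softmax normalization is an assumption rather than something stated in the claims or the cited references.

```python
import math

def build_vocabulary(historical_texts):
    # Collect the distinct lowercase words of the historical texts and
    # append a hypothetical end-of-summary token.
    words = sorted({w for text in historical_texts for w in text.lower().split()})
    return words + ["<end>"]

def output_distribution(logits):
    # Softmax: maps K raw scores to K probabilities that sum to 1.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

historical_texts = ["Argentina wins the match", "the match ends"]
vocabulary = build_vocabulary(historical_texts)
K = len(vocabulary)

# Hypothetical decoder scores for one time step t, one score per word.
logits = [0.1 * k for k in range(K)]
y_t = output_distribution(logits)
```

Because `y_t` has exactly one entry per vocabulary word, its length equals K by construction.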
As seen in the claim set, claims 4, 14, and 19 cover a similar scope of invention. However, claim 4 is a method claim, while claims 14 and 19 are a computer device claim and a computer-readable medium claim, respectively. The steps of method claim 4 correspond to the function of each claimed element in claims 
Conclusion
The analogous prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Mir (Document ID: “METHODS OF SENTENCE EXTRACTION, ABSTRACTION AND ORDERING FOR AUTOMATIC TEXT SUMMARIZATION”) teaches text summarization using LSTM model.
See (Document ID: “Get To The Point: Summarization with Pointer-Generator Networks”) teaches text summarization using vocabulary and probability system.
Filippova (Document ID: US-10229111-B1) teaches sentence compression using LSTM decoder and encoder.
Paulus (Document ID: US-20180300400-A1) teaches abstractive summarization using context vector alongside encoder and decoder in LSTM model.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEEL P. KARELIA whose telephone number is (571)272-4377. The examiner can normally be reached Monday-Friday 6:30 am - 4:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571)272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/NEEL PIYUSHKUMAR KARELIA/Examiner, Art Unit 2659

/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659