Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 21-24 are new. Claims 1, 3-5, 8-10, 12-14, and 16-19 are amended. Claims 1-24 are pending.
The following rejections are withdrawn in view of new grounds of rejection necessitated by applicant’s amendments:
Claims 1-20 rejected under 35 U.S.C. 103 as being unpatentable over Lee et al (US Patent: 10713439, issued: Jul. 14, 2020, filed: Mar. 28, 2017, foreign priority: Oct. 31, 2016) in view of Paulus (US Application: US 2018/0300400, published: Oct. 18, 2018, filed: Nov. 16, 2017, EEFD: Apr. 14, 2017).

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “communications interface configured to obtain …” in claim 10.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-24 are newly rejected under 35 U.S.C. 103 as being unpatentable over Lee et al (US Patent: 10713439, issued: Jul. 14, 2020, filed: Mar. 28, 2017, foreign priority: Oct. 31, 2016) in view of Paulus (US Application: US 2018/0300400, published: Oct. 18, 2018, filed: Nov. 16, 2017, EEFD: Apr. 14, 2017) and further in view of Socher et al (“CS 224D: Deep Learning for NLP”, publisher: Stanford.edu, published: 2015, pages 1-12).



obtaining a first sentence (column 5, lines 35-37: an input sentence is obtained); 

generating m second sentences based on the first sentence and a paraphrase generation model, wherein each of the m second sentences has a paraphrase relationship with the first sentence, and wherein m is an integer greater than zero (column 8, lines 49-67 and column 9, lines 1-7: a plurality of verification sentences (interpreted as ‘m’ sentences) are generated based upon the first sentence using the paraphrase model having encoder and decoder components. The verification sentences have a semantic relationship based upon the first sentence and its corresponding embedding vector); 

determining a matching degree between each of the m second sentences and the first sentence based on a paraphrase matching model, wherein a higher matching degree between a second sentence and the first sentence indicates a higher probability that the second sentence and the first sentence are paraphrases of each other (column 9, lines 32-43: the variant sentences are matched against the first sentence when the sentences are within a degree/range of the first sentence’s vector); and 

determining n second sentences from the m second sentences based on matching degrees between the m second sentences and the first sentence, wherein the n second sentences are paraphrase sentences of the first sentence, and wherein n is an integer greater than zero and less than or equal to m (column 9, lines 30-43: a subset (interpreted as ‘n’) of the verification sentences (the total verification sentences interpreted as ‘m’) that fall within the N-best (interpreted as a threshold matching degree) are identified/determined), wherein the paraphrase generation model and the paraphrase matching model are both constructed by a deep neural network (column 5, lines 40-42 and 50-59: a neural network is used to generate and match paraphrases), wherein the paraphrase generation model is obtained through … learning-based training … (column 5, lines 35-59: the paraphrase generation model is based upon training data).
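For illustration only, the generate, match, and select steps recited above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Lee et al or from the application: `select_paraphrases`, `toy_generate`, and `toy_match` are hypothetical names, the generator returns a fixed candidate list, and a word-overlap (Jaccard) score stands in for a trained paraphrase matching model.

```python
from typing import Callable, List, Tuple


def select_paraphrases(
    first_sentence: str,
    generate: Callable[[str, int], List[str]],  # hypothetical stand-in for a generation model
    match: Callable[[str, str], float],         # hypothetical stand-in for a matching model
    m: int,
    n: int,
) -> List[Tuple[str, float]]:
    """Generate m candidates, score each against the input, keep the n best."""
    if not (0 < n <= m):
        raise ValueError("require 0 < n <= m")
    candidates = generate(first_sentence, m)
    scored = [(cand, match(first_sentence, cand)) for cand in candidates]
    # A higher matching degree is treated as a higher paraphrase probability.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def toy_generate(sentence: str, m: int) -> List[str]:
    """Assumption: a fixed candidate list instead of a neural generator."""
    variants = [
        "i would like to eat an apple",
        "i want an apple",
        "i am going to work",
    ]
    return variants[:m]


def toy_match(a: str, b: str) -> float:
    """Assumption: Jaccard word overlap instead of a trained matcher."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)
```

Calling `select_paraphrases("i want to eat an apple", toy_generate, toy_match, m=3, n=2)` keeps the two highest-scoring candidates and drops the unrelated sentence.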

However, Lee et al does not expressly teach wherein the paraphrase generation model is obtained through reinforcement learning-based training, and wherein the reinforcement learning-based training is based on a reward from the paraphrase matching model; calculating a hidden state variable of each word in the first sentence; determining the second sentence based on hidden state variables of J words, wherein J represents a quantity of words in the second sentence; continuously generating new words when each hidden state variable of each word in the second sentence is determined.

Yet Paulus teaches wherein the paraphrase generation model is obtained through reinforcement learning-based training, and wherein the reinforcement learning-based training is based on a reward from the [verification] … matching model (paragraphs 0037, 0038 and 0050: summaries having phrases are generated using models with at least encoders and decoders, and a reward/reinforcement is generated and fed back into the model based upon a degree of the verification score to generate the summary/phrases (with respect to the input text/sentence fed into the model and the output text/sentence/phrase from the model). It is interpreted that the input sentence, output sentence and the reward discussed by Paulus above are collectively a type of the claimed ‘policy gradient algorithm’).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Lee et al’s ability to use a paraphrase generation model that is based upon use of training data and paraphrase generation (of at least a first output paraphrase sentence) using encoder, decoder and matching verification steps, such that the result of the verification step is further modified to be fed back into the paraphrase generation model to encourage/reward generation of the generated phrase(s) based upon a degree/score of the verification, as taught by Paulus. The combination would have allowed Lee et al to have implemented a model that addresses issues of coherence, flow, and readability while resolving unnatural summaries (Paulus, paragraph 0012).
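For illustration only, the feedback of a verification score as a reward can be sketched with a generic REINFORCE-style policy gradient update. This is a simplified sketch under stated assumptions and is not the method of Paulus or of the application: each whole candidate sentence is treated as a single action, the rewards are a fixed list standing in for matching-model scores, and the names `reinforce` and `softmax` and the running-mean baseline are hypothetical choices made for the example.

```python
import math
import random
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards: List[float], steps: int = 3000,
              lr: float = 0.05, seed: int = 0) -> List[float]:
    """Train a softmax policy over candidates using sampled rewards.

    rewards[i] is an assumed matching score for candidate i; in a real
    system it would be produced by a matching model, not a fixed list.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # running mean of observed rewards, reduces variance
    for step in range(1, steps + 1):
        probs = softmax(logits)
        # Sample one candidate (action) from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit i is 1[i == action] - probs[i].
        for i in range(len(logits)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_prob
        baseline += (reward - baseline) / step
    return softmax(logits)
```

With assumed rewards such as `[0.9, 0.5, 0.1]`, the returned probabilities shift toward the candidate with the highest reward, which is the sense in which the score encourages generation of better-matching outputs.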

However the combination does not expressly teach calculating a hidden state variable of each word in the first sentence; determining the second sentence based on hidden state variables of J words, wherein J represents a quantity of words in the second sentence; continuously generating new words when each hidden state variable of each word in the second sentence is determined.

Yet Socher et al teaches calculating a hidden state variable of each word in the first sentence; determining the second sentence based on hidden state variables of J words, wherein J represents a quantity of words in the second sentence; continuously generating new words when each hidden state variable of each word in the second sentence is determined (pages 2-3: a hidden state variable ht is generated for each word calculated over time, where there are T words (mapped to the claimed ‘J’ words), wherein T represents a quantity of words in the output sentence. Continuous word sequence(s) are generated by looping of neurons over time based upon hidden state variables of the T words (‘J’ words). As also explained, each generated word is based on a weight (attention weight) of each word in an earlier first node/sentence (prior time step) and also weight vector(s) of current time step xt (attention vector), such that the attention vector(s) are mapped to the correspondence of the instant neural node at time step ‘t’ and generate an output probability distribution value set ‘yt’ for the word in the second sentence at time ‘t’).
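For illustration only, the per-word hidden state described above can be sketched with a standard recurrent formulation in which the hidden state at time step t is computed from the previous hidden state and the current input, and an output distribution is computed from the hidden state. This is a minimal sketch under stated assumptions: the weight names `W_hh`, `W_hx`, and `W_s`, the sigmoid nonlinearity, and the helper names are choices made for the example and are not quoted from the reference.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(M: Matrix, v: Vector) -> Vector:
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(W_hh: Matrix, W_hx: Matrix, h_prev: Vector, x_t: Vector) -> Vector:
    """h_t = sigmoid(W_hh * h_{t-1} + W_hx * x_t)."""
    pre = [a + b for a, b in zip(matvec(W_hh, h_prev), matvec(W_hx, x_t))]
    return [1.0 / (1.0 + math.exp(-z)) for z in pre]


def run_rnn(W_hh: Matrix, W_hx: Matrix, W_s: Matrix,
            inputs: List[Vector], h0: Vector) -> Tuple[List[Vector], List[Vector]]:
    """Return one hidden state and one output distribution per time step."""
    h = h0
    hidden, outputs = [], []
    for x_t in inputs:  # one iteration per word, T words in total
        h = rnn_step(W_hh, W_hx, h, x_t)
        hidden.append(h)
        outputs.append(softmax(matvec(W_s, h)))  # y_t = softmax(W_s * h_t)
    return hidden, outputs
```

Each of the T input vectors yields one hidden state and one probability distribution, so a sequence of T words corresponds to T hidden state variables.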

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Lee et al and Paulus’ ability to use a reinforcement learning model upon words, such that the reinforcement learning model is RNN-based and references hidden states for each word to yield a second sentence, as taught by Socher et al. The combination would have allowed Lee et al and Paulus to 

2. The paraphrase sentence generation method of claim 1, the combination of Lee et al and Paulus teaches further comprising: generating, using the paraphrase generation model and a first input sentence, a first output sentence; and determining, using the paraphrase matching model, a second matching degree between the first input sentence and the first output sentence, wherein the first input sentence has a second paraphrase relationship with the first output sentence, and wherein the second matching degree is the reward, as similarly explained in the rejection of claim 1 (as explained in the combination of Lee et al and Paulus, Lee et al first teaches using a paraphrase generation model that takes in an input sentence and generates at least one or more output sentences while using matching verification result(s). The result of the verification step is then further modified (by Paulus) to be fed back into the paraphrase generation model to encourage/reward generation of the generated phrase(s) based upon a degree/score of the verification (as taught by Paulus)), and is rejected under similar rationale.

3. The paraphrase sentence generation method of claim 2, the combination of Lee et al and Paulus teaches further comprising wherein the paraphrase generation model is obtained through reinforcement learning and according to a policy gradient algorithm, and wherein inputs of the policy gradient algorithm comprise the first input sentence, the first output sentence, and the reward, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

4. The paraphrase sentence generation method of claim 1, Lee et al teaches wherein the paraphrase generation model comprises a primary generation model, wherein the primary generation model is obtained through training that is based on parallel data for paraphrasing, wherein the parallel data for paraphrasing comprises a paraphrase sentence pair, and wherein the paraphrase sentence pair comprises two sentences that are paraphrases of each other (a paraphrase sentence pair (the claimed ‘parallel data for paraphrasing’) is implemented from the training, as explained in column 5, lines 42-44 of Lee et al).

5. The paraphrase sentence generation method of claim 4, Lee et al teaches wherein the paraphrase generation model further comprises a secondary generation model  based on a paraphrase generation rule set, wherein the paraphrase generation rule set comprises a paraphrase generation rule, and wherein the paraphrase generation model is a machine learning model obtained by integrating the primary generation model and the secondary generation model (column 5, lines 35-59, column 6, lines 20-27, column 9, lines 3-44: the sentence embedding vector is created (based upon a rule/condition to identify corresponding paraphrase pair data). The sentence embedding vector defines an adjustment of the phrase/words by implementing a node layer and a hidden node layer that is characterized by weights (interpreted as the claimed ‘attention degree’ to each word/phrase). The sentence embedding vector is used/applied to generate the 

6. The paraphrase sentence generation method of claim 5, Lee et al teaches wherein the primary generation model and the secondary generation model are integrated using an attention mechanism (column 6, lines 20-27: a sentence embedding vector is interpreted as the claimed ‘attention mechanism’), wherein the attention mechanism dynamically adjusts words in the first sentence and an attention degree of the paraphrase generation rule in a process of generating the m second sentences by the paraphrase generation model, and wherein the attention degree is represented by an attention vector set (column 5, lines 35-59, column 6, lines 20-27 and column 9, lines 3-44: a rule/condition to identify corresponding paraphrase pair data to create/define a sentence embedding vector is implemented. The sentence embedding vector defines an adjustment of the phrase/words by implementing a node layer and a hidden node layer that is characterized by weights (interpreted as the claimed ‘attention degree’ to each word/phrase). The sentence embedding vector is used to generate the claimed (‘m’) second verification sentences).

7. The paraphrase sentence generation method of claim 6, Lee et al teaches wherein the attention vector set comprises an attention vector corresponding to the paraphrase generation rule in a one-to-one manner, wherein the paraphrase sentence generation method further comprises obtaining the attention vector through calculation that is based on the paraphrase sentence pair, and wherein the paraphrase sentence pair 

8. The paraphrase sentence generation method of claim 1, Lee et al teaches wherein the paraphrase matching model comprises a primary matching model, wherein the primary matching model is obtained through training based on paraphrase matching data (column 5, lines 35-57: paraphrase pairs in a trained model are matched to obtain a sentence embedding vector), wherein the paraphrase matching data comprises a matching sentence pair, and wherein the matching sentence pair comprises two sentences that are not paraphrases of each other (column 5, lines 35-57, column 7, lines 16-30 and 49-55: a matching paraphrase sentence pair is recognized to define a sentence embedding vector that resides within the vector space of the model falling within the noise vector 340. The model also includes potential pairings to be matched that are not paraphrases of each other, as shown in ref 330 of Fig. 3 for a phrase ‘I want to eat an apple’ that is different from a vector similar to ‘I am going to work’).

9. The paraphrase sentence generation method of claim 8, Lee et al teaches wherein the paraphrase matching model further comprises a secondary matching model based on a paraphrase matching rule, wherein the paraphrase matching model is obtained by integrating the primary matching model and the secondary matching model (column 9, 

10. The combination of Lee et al,  Paulus and Socher et al teaches a paraphrase sentence generation apparatus, comprising: a communications interface configured to obtain a first sentence; and a processor coupled to the communications interface and configured to: generate m second sentences based on the first sentence and a paraphrase generation model, wherein each of the m second sentences has a paraphrase relationship with the first sentence, and wherein m is an integer greater than zero; determine a matching degree between each of the m second sentences and the first sentence based on a paraphrase matching model, wherein a higher matching degree between a second sentence and the first sentence indicates a higher probability that the second sentence and the first sentence are paraphrases of each other; and determine n second sentences from the m second sentences based on matching degrees between the m second sentences and the first sentence, wherein the n second sentences are paraphrase sentences of the first sentence and wherein n is an integer greater than zero and less than or equal to m, wherein the paraphrase generation model and the paraphrase matching model are both constructed by a deep neural network, wherein the paraphrase generation model is obtained through reinforcement learning-based training, and wherein the reinforcement learning-based training is based on a reward from the paraphrase matching model; calculate a hidden state variable of each word in the first sentence; determine the second sentence based on hidden state 

11. The paraphrase sentence generation apparatus of claim 10, the combination of Lee et al,  Paulus and Socher et al teaches wherein the processor is further configured to: generate, using the paraphrase generation model and a first input sentence, a first output sentence; and determine, using the paraphrase matching model, a second matching degree between the first input sentence and the first output sentence, wherein the first input sentence has a second paraphrase relationship with the first output sentence, and wherein the second matching degree is the reward, as similarly explained in the rejection for claim 2, and is rejected under similar rationale.

12. The paraphrase sentence generation apparatus of claim 11, the combination of Lee et al, Paulus and Socher et al teaches wherein the paraphrase generation model is obtained through reinforcement learning performed on the paraphrase generation model and according to a policy gradient algorithm, and wherein inputs of the policy gradient algorithm comprise the first input sentence, the first output sentence, and the reward, as similarly explained in the rejection of claim 3, and is rejected under similar rationale.



14. The paraphrase sentence generation apparatus of claim 13, the combination of Lee et al, Paulus and Socher et al teaches wherein the paraphrase generation model further comprises a secondary generation model based on a paraphrase generation rule set, wherein the paraphrase generation rule set comprises a paraphrase generation rule, and wherein the paraphrase generation model is obtained by integrating the primary generation model and the secondary generation model, as similarly explained in the rejection of claim 5, and is rejected under similar rationale.

15. The paraphrase sentence generation apparatus of claim 14, the combination of Lee et al,  Paulus and Socher et al teaches wherein the primary generation model and the secondary generation model are integrated using an attention mechanism, wherein the attention mechanism dynamically adjusts words in the first sentence and an attention degree of the paraphrase generation rule in a process of generating the m second sentences by the paraphrase generation model, and wherein the attention degree is 

16. The paraphrase sentence generation apparatus of claim 15, the combination of Lee et al,  Paulus and Socher et al teaches wherein the attention vector set comprises an attention vector corresponding to the paraphrase generation rule in a one-to-one manner, wherein the processor is further configured to obtain the attention vector through calculation  based on the paraphrase sentence pair, and wherein the paraphrase sentence pair meets the paraphrase generation rule, as similarly explained in the rejection of claim 7, and is rejected under similar rationale.

17. The paraphrase sentence generation apparatus of claim 10, the combination of Lee et al, Paulus and Socher et al teaches wherein the paraphrase matching model comprises a primary matching model, wherein the primary matching model is obtained through training based on paraphrase matching data, wherein the paraphrase matching data comprises a matching sentence pair, and wherein the matching sentence pair comprises two sentences that are not paraphrases of each other, as similarly explained in the rejection for claim 8, and is rejected under similar rationale.

18. The paraphrase sentence generation apparatus of claim 17, the combination of Lee et al,  Paulus and Socher et al teaches wherein the paraphrase matching model further comprises a secondary matching model, wherein the paraphrase matching model is  

19. The combination of Lee et al, Paulus and Socher et al teaches a computer program product comprising computer-executable instructions for storage on a non-transitory computer-readable medium that, when executed by a processor, cause a paraphrase sentence generation apparatus to: obtain a first sentence; generate m second sentences based on the first sentence and the paraphrase generation model, wherein each of the m second sentences has a paraphrase relationship with the first sentence, and wherein m is an integer greater than zero; determine a matching degree between each of the m second sentences and the first sentence based on the paraphrase matching model, wherein a higher matching degree between a second sentence and the first sentence indicates a higher probability that the second sentence and the first sentence are paraphrases of each other; and determine n second sentences from the m second sentences based on matching degrees between the m second sentences and the first sentence, wherein the n second sentences are paraphrase sentences of the first sentence, and wherein n is an integer greater than zero and less than or equal to m, wherein the paraphrase generation model and the paraphrase matching model are both constructed by a deep neural network, wherein the paraphrase generation model is obtained through reinforcement learning-based training, and wherein the reinforcement learning-based training is based on a reward from the paraphrase matching model, calculate a hidden state variable of each word in the first sentence; determine the second sentence based on hidden state variables of J words, wherein J represents a 

20. The computer program product of claim 19, the combination of Lee et al and Paulus teaches wherein the computer-executable instructions further cause the paraphrase sentence generation apparatus to: generate, using the paraphrase generation model and a first input sentence, a first output sentence; and determine, using the paraphrase matching model, a second matching degree between the first input sentence and the first output sentence, wherein the first input sentence has a second paraphrase relationship with the first output sentence, and wherein the second matching degree is the reward, as similarly explained in the rejection of claim 2, and is rejected under similar rationale.

21. The paraphrase sentence generation apparatus of claim 14, wherein the paraphrase generation rule set is represented as $r = \{r_k : p_k \to p'_k\}_{k=1}^{K}$, wherein $r = \{r_k : p_k \to p'_k\}_{k=1}^{K}$ represents a total of K paraphrase generation rules, wherein K is an integer greater than 0, wherein $r_k$ represents a kth paraphrase generation rule, wherein k is an integer greater than 0 but less than or equal to K, wherein $p_k \to p'_k$ represents a paraphrase generation rule to rewrite $p_k$ as $p'_k$, wherein $p_k$ and $p'_k$ are paraphrases of each other, wherein $p_k$ represents a condition of the paraphrase generation rule, and wherein $p'_k$ represents a result of the paraphrase generation rule, as similarly explained in the 

22. The paraphrase sentence generation method of claim 1, further comprising calculating, during generation of a jth word in the second sentence, a generation probability of the jth word in the second sentence based on a calculated attention weight of each word in the first sentence and an attention vector corresponding to the jth word, as similarly explained in the rejection of claim 1 (Socher et al, pages 2-3: a hidden state variable ht is generated for each word calculated over time, where there are T words (mapped to the claimed ‘J’ words), wherein T represents a quantity of words in the output sentence. Continuous words are generated by looping of neurons over time based upon hidden state variables of the T words. As also explained, each generated word is based on a weight of each word in an earlier first node/sentence (prior time step) and also weight vector(s) of current time step xt), and is rejected under similar rationale.

23. The paraphrase sentence generation method of claim 5, further comprising: calculating an attention vector corresponding to a jth word in the second sentence, wherein the attention vector has a one-to-one correspondence with the paraphrase generation rule in the paraphrase generation rule set; and calculating a generation probability of the jth word in the second sentence based on the attention vector, a (j-1)th word in the second sentence, as similarly explained in the rejection of claim 1 (Socher et al, pages 2-3: a hidden state variable ht is generated for each word calculated over time, where there are T words (mapped to the claimed ‘J’ words), wherein T represents a quantity of words in the output sentence. Continuous words are generated by looping of neurons over time based upon hidden state variables of the T words (‘J’ words). As also explained, each generated word is based on a weight (attention weight) of each word in an earlier first node/sentence (prior time step) and also weight vector(s) of current time step xt (attention vector), such that the attention vector(s) are mapped to the correspondence of the instant neural node at time step ‘t’ and generate an output probability distribution value set ‘yt’ for the word in the second sentence at time ‘t’), and is rejected under similar rationale.

24. The paraphrase sentence generation method of claim 1, further comprising: determining the second sentence based on the hidden state variables of the J words; and determining hidden state variables of a jth word in the second sentence based on an attention vector of the jth word in the second sentence, as similarly explained in the rejection of claim 1 (Socher et al, pages 2-3: a hidden state variable ht is generated for each word calculated over time, where there are T words (mapped to the claimed ‘J’ words), wherein T represents a quantity of words in the output sentence. Continuous words are generated by looping of neurons over time based upon hidden state variables of the T words (‘J’ words). As also explained, each generated word is based on a weight (attention weight) of each word in an earlier first node/sentence (prior time step) and also weight vector(s) of current time step xt (attention vector), such that the attention vector(s) are mapped to the correspondence of the instant neural node at time step ‘t’ and generate an output probability distribution value set ‘yt’ for the word in the second sentence at time ‘t’), and is rejected under similar rationale.

Response to Arguments
Applicant's arguments filed 09/07/2021 have been fully considered but they are not persuasive. 
With regard to claims 1-20, the applicant argues that the previously cited references (Lee and Paulus) do not teach the newly amended limitations. The examiner respectfully notes that the amendments necessitated new grounds of rejection, and thus directs attention to the rejection above, which further applies Socher et al to address the new amendments.
The applicant argues that new claims 21-24 are allowable in view of the cited prior art; however, the examiner respectfully directs attention to how the new claims are rejected above (through the application of Socher et al).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILSON W TSUI whose telephone number is (571)272-7596. The examiner can normally be reached Monday - Friday 9 am - 6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong can be reached on 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For 





/WILSON W TSUI/Primary Examiner, Art Unit 2178