DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 21, 31, and 38 are objected to because of the following informalities:
As per Claim 21 (and similarly claims 31 and 38): Line 8 of claim 21 recites “a a” (i.e. one “a” should be deleted).
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 22-23, 32-33, and 39 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

As per Claim 22 (and similarly claim 32):
The original Specification (i.e. the original Specification of Parent Application 15/974,118, hereafter original Specification, where this application is a continuation and not a continuation-in-part) does not have written description for “generating, using a switch, the set of attention weights between the first distribution over the plurality of words from the vocabulary and the second distribution over the context-based words”.
	In claim 1 of the parent application, a weighting is generated (see 3rd to last limitation) but the weighting is not necessarily the set of attention weights.

As per Claim 23 (and similarly claims 33 and 39):
The original Specification does not have written description for “generating, using a switch, a composite distribution based on the set of attention weights”.
	In claim 1 of the parent application, the composite distribution is based on the weighting, which is not necessarily the set of attention weights.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 21-40 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.


As per Claim 21 (and similarly claims 31 and 38):
“a second output of the LSTM” in the 7th to last line of claim 21 is unclear because it can refer to either:
1. an output of the LSTM that is “second” relative to the “first output of the biLSTM” (in which case it can be the context-adjusted hidden state); or
2. an output of the LSTM that is “second” relative to the context-adjusted hidden state generated using the LSTM (in which case it cannot be the context-adjusted hidden state), where the context-adjusted hidden state can itself be interpreted as an output of the LSTM because the LSTM generates it.
	“the context-based words” in the 4th to last line of claim 21 lacks antecedent basis (line 6 of claim 21 recites only a single context-based word, and the 6th to last line of claim 21 recites a plurality of words [not necessarily context-based]).

	As per Claim 22 (and similarly claim 32):
	“the context-based words” in line 3 of claim 22 lacks antecedent basis (as a result of “the context-based words” in the 4th to last line of claim 21 lacking antecedent basis).
	It is also not clear how, or in what sense, a set of attention weights is generated between two distributions.

	As per Claim 25 (and similarly claims 34 and 40):
	It is not clear how, or in what sense, an affinity matrix is generated between two representations (an affinity can exist between two representations, but it is not clear how an affinity matrix is generated between them).

	As per Claim 27:
	“the decoder” lacks antecedent basis.

	As per Claim 28:
	“the transformer” in line 1 of claim 28 lacks antecedent basis (claim 37 depends on claim 36, which recites a self-attention-based transformer, whereas claim 28 does not depend on a claim reciting a self-attention-based transformer).

	The dependent claims include the issues of their respective parent claims.

Allowable Subject Matter
The following is a statement of reasons for the indication of allowable subject matter:  
	As per Claim(s) 21 (and similarly claim[s] 31 and 38, and consequently claim[s] 22-30, 32-37, and 39-40 which depend on claim[s] 21, 31, and 38), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 21, including (i.e. in combination with the remaining limitations in claim[s] 21) receiving, at an input layer, a natural language input of a question; performing a first encoding of a context-based word and a question-based word from the question into a context-based representation and a question-based representation; performing, using a bi-directional long-term short-term memory (biLSTM), a a second encoding of the context-based representation and the question-based representation; generating, using a long-term short-term memory (LSTM), a context-adjusted hidden state based at least in part from the context-based representation and the question-based representation; generating, by an attention network, a set of attention weights based on a first output of the biLSTM and a second output of the LSTM; generating, by a vocabulary layer, a first distribution over a plurality of words in a vocabulary based on the set of attention weights; generating, by a context layer, a second distribution over the context-based words based on the set of attention weights; and selecting a set of words for an answer to the question based on the first distribution and the second distribution (selecting words for an answer based on the first distribution and the second distribution, where both distributions are based on the same set of attention weights, where the set of attention weights is based on an output of the biLSTM [which is used to perform a second encoding of the context-based representation and the question representation] and the LSTM [which is used to generate a context-adjusted hidden state based at least in part from the context-based representation and the question-based representation])
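For illustration only, and not as a construction of the claims or a description of applicant's implementation, the data flow recited above (one set of attention weights feeding both a vocabulary distribution and a context distribution) can be sketched as follows. The dot-product scoring, the projection matrix `vocab_proj`, and all numeric values are hypothetical assumptions; the encoder and decoder vectors merely stand in for a biLSTM output per context word and an LSTM output.

```python
import math

def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_weights(encoder_outputs, decoder_state):
    """One weight per context position (assumed dot-product scoring)."""
    return softmax([dot(h, decoder_state) for h in encoder_outputs])

def context_distribution(weights, context_words):
    """Second distribution: attention mass summed per distinct context word."""
    dist = {}
    for weight, word in zip(weights, context_words):
        dist[word] = dist.get(word, 0.0) + weight
    return dist

def vocabulary_distribution(weights, encoder_outputs, vocab, vocab_proj):
    """First distribution: softmax over a projection of the attention-weighted summary."""
    dim = len(encoder_outputs[0])
    summary = [
        sum(w * h[d] for w, h in zip(weights, encoder_outputs))
        for d in range(dim)
    ]
    probs = softmax([dot(row, summary) for row in vocab_proj])
    return dict(zip(vocab, probs))

# Hypothetical toy inputs (assumptions, not taken from the application).
context_words = ["paris", "is", "in", "france"]
encoder_outputs = [[1.0, 0.0], [0.0, 0.2], [0.1, 0.1], [0.9, 0.3]]
decoder_state = [2.0, 0.5]
vocab = ["yes", "no", "paris"]
vocab_proj = [[0.1, 0.4], [0.2, 0.1], [1.5, 0.0]]

# The same weights are reused for both distributions.
weights = attention_weights(encoder_outputs, decoder_state)
p_context = context_distribution(weights, context_words)
p_vocab = vocabulary_distribution(weights, encoder_outputs, vocab, vocab_proj)
```

In this sketch both `p_context` and `p_vocab` are computed from the single `weights` list, which is the feature the reasons for allowance emphasize.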
2017/0024645 (salesforce.com assignee, published more than one year before effective filing date) teaches “FIG. 1 is a block diagram of one embodiment of a DMN 100 in accordance with the present invention. In this embodiment, DMN 100 structures data as information, questions and answers. Input module 130 is responsible for computing representations of raw inputs, such as training and data input, such that they can be retrieved when later needed. Input module 130 these representations to semantic memory 110 and episodic memory 120. In one embodiment, inputs are structured as a temporal sequence indexable by a time stamp. For example, for video input, this would be an image at each time step. As another example, for written language, there would be a sequence of T.sub.w words v.sub.1, . . . , v.sub.T.sub.w. In one embodiment, DMN uses both unsupervised and supervised learning in order to compute a useful input representation. In one embodiment, DMN computes both context-independent and context dependent hidden states. Input module 130 processes raw inputs and maps them into a representation that is useful for asking questions about this input. The raw inputs may be in any form, for example, visual input, speech input or text input. In one embodiment, input module 130 converts arbitrary natural language input sequences into useful computer understandable retrievable representations. In another embodiment, input module 130 converts audio input into useful computer understandable retrievable representations” (paragraph 32), where DMN refers to “Dynamic Memory Network” (paragraph 26).  This reference appears to describe using an input module (which can be interpreted as an input layer) to generate representations of words (some of which can be interpreted as context of a particular word in a word sequence), and computing context dependent hidden states (suggested to be hidden states that are “adjusted” based on context).  
This reference does not appear to describe where a set of attention weights is used to generate two distributions that are used for selecting words for an answer to a question.
2018/0121799 (salesforce.com assignee, 2 shared inventors, published less than one year before filing of parent application) teaches using a bi-directional LSTM to process word embedding vectors to produce additional vectors and embeddings (paragraphs 41-43).
2019/0278835 teaches “Word attention block 204 then receives word-level LSTM 214 hidden states h.sub.i,1.sup.e,w-h.sub.i,N.sup.e,w to generate word attention weights α.sub.i for i={1, . . . N} for all words in a section, which are provided to linear superposition block 244. Linear superposition block 244 generates context vector c.sub.t as a weighted sum of the word-level encoder hidden states h.sub.i,1.sup.e,w-h.sub.i,N.sup.e,w using word attention weights α.sub.i” (paragraph 39).  This reference describes generating word attention weights based on LSTM hidden states, but does not appear to teach generating word attention weights based on an output of an LSTM and an output of a biLSTM.
2021/0004605 (cited subject matter supported by provisional 62/646834) teaches “an attention layer hierarchical temporal memory (HTM) coupled with the first RNN and the second RNN and configured to obtain a temporal attention weight of each video temporal segment in the video based on the temporal information and the hidden representation-based output” (claim 17).  This reference suggests obtaining plural attention weights (“each video temporal segment” suggests multiple segments that each have a corresponding temporal attention weight) based on two RNNs, but does not appear to describe where the attention weights are obtained based on an output of an LSTM and an output of a biLSTM.
	10102844 teaches “As another example, a dialect or accent for a particular region associated with where the voice or manually activated electronic device is located may be determined, and the dialect or accent may be used to select words or a set of words for a response to the question” (col. 3, lines 5-29).  This reference describes selecting words for a response to a question, but does not appear to select the words based on the claimed distributions.
	Fenglong Ma, Radha Chitta, Saurabh Kataria, Jing Zhou, Palghat Ramesh, Tong Sun, Jing Gao, “Long-Term Memory Networks for Question Answering”, 2017, arXiv:1707.01961 teaches using both an external memory module and an LSTM to comprehend input data and generate multi-word answers (Abstract; see also Figure 2).  This reference does not appear to describe generating a context-based representation of a context-based word.  This reference also does not appear to describe selecting words for an answer to the question based on multiple distributions that were generated based on the same set of attention weights that is generated based on a biLSTM output and an LSTM output.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 21, 24-31, 34-38, and 40 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-9 and 14 of U.S. Patent No. 10,776,581, hereafter Parent Patent. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of this application are rendered obvious by the claims of the Parent Patent.

As per Claim 21 (and similarly claims 31 and 38), Claim 5 of Parent Patent (interpreted as including the limitations of claim 1 of the Parent Patent) suggests (along with its method and medium equivalents) A system for natural language processing, the system comprising: one or more processors; and a memory storing computer-executable instructions, which when executed by the one or more processors, cause the system to perform operations comprising: (lines 1-6 of claim 1 of the Parent Patent)
receiving, at an input layer, a natural language input of a question; performing a first encoding of a context-based word and a question-based word from the question into a context-based representation and a question-based representation; (lines 7-8 of claim 1 of the Parent Patent; encoding words from a question using an input layer at least suggests where the input layer receives the question words [where the question words can be interpreted as “a natural language input of a question”], and the encoding result can be interpreted as a combination of a context-based representation and a question-based representation)
performing, using a bi-directional long-term short-term memory (biLSTM), a a second encoding of the context-based representation and the question-based representation; (lines 7-8 of claim 1 of the Parent Patent and lines 11-13 of claim 1 of the Parent Patent; further encoding an output of the encoding [interpreted as the encoding, using the input layer, first words from a context and second words from a question] is suggested to further/second encode the context-based representation and the question-based representation [suggested to be the output of the input layer encoding])
generating, using a long-term short-term memory (LSTM), a context-adjusted hidden state based at least in part from the context-based representation and the question-based representation; (lines 7-10 and lines 14-17 of claim 1 of the Parent Patent; the context-adjusted hidden state is generated from an output of “the decoding” which is suggested to refer to the decoding, using a self-attention based transformer, an output of the input layer, where the output of the input layer is suggested to be the context-based representation and the question-based representation)
generating, by an attention network, a set of attention weights based on a first output of the biLSTM and a second output of the LSTM; (6th to last limitation of claim 1 of the Parent Patent)
generating, by a vocabulary layer, a first distribution over a plurality of words in a vocabulary based on the set of attention weights; (5th to last limitation of claim 1 of the Parent Patent)
generating, by a context layer, a second distribution over the context-based words based on the set of attention weights; (4th to last limitation of claim 1 of the Parent Patent)
and selecting a set of words for an answer to the question based on the first distribution and the second distribution (last 3 limitations of claim 1 of the Parent Patent and Claim 5 of the Parent Patent; the selecting of a word for inclusion in an answer uses the composite distribution which is based on a weighting over the two distributions, and claim 5 of the Parent Patent describes selecting “each word for the answer” [which suggests that the answer has multiple selected words])
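For illustration only, the selection step mapped above (a composite distribution based on a weighting over the two distributions, followed by selection of a word) can be sketched as follows. The scalar `switch` in [0, 1], the function names, and the probability values are hypothetical assumptions; the Parent Patent claims are not limited to, and may not use, this particular weighting.

```python
def composite_distribution(p_vocab, p_context, switch):
    """Mix two word distributions; `switch` is the weight on the vocabulary side."""
    if not 0.0 <= switch <= 1.0:
        raise ValueError("switch must lie in [0, 1]")
    words = set(p_vocab) | set(p_context)
    return {
        word: switch * p_vocab.get(word, 0.0)
        + (1.0 - switch) * p_context.get(word, 0.0)
        for word in words
    }

def select_word(composite):
    """Pick the single most probable word from the composite distribution."""
    return max(composite, key=composite.get)

# Hypothetical toy distributions (assumptions, not taken from the record).
p_vocab = {"yes": 0.6, "no": 0.3, "paris": 0.1}
p_context = {"paris": 0.7, "france": 0.3}

composite = composite_distribution(p_vocab, p_context, switch=0.25)
chosen = select_word(composite)
```

Repeating the selection once per answer position would yield a set of words, consistent with the “each word for the answer” language noted above.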

	Claim 24 corresponds to Claim 2 of the Parent Patent.
	Claims 25, 34 and 40 correspond to Claim 3 of the Parent Patent.
	Claims 26 and 35 correspond to Claim 4 of the Parent Patent.
	Claim 27 corresponds to Claim 5 of the Parent Patent (the self-attention based transformer performs decoding and so can be interpreted as a “decoder”).
	Claim 28 corresponds to Claim 6 of the Parent Patent.
	Claim 29 corresponds to Claim 7 of the Parent Patent.
	Claim 30 corresponds to Claim 8 of the Parent Patent.
	Claim 1 of the Parent Patent suggests claim 36 (lines 9-10 of claim 1 of the Parent Patent).
	Claim 37 corresponds to Claim 6 of the Parent Patent.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249. The examiner can normally be reached M-F 12:00 PM - 8:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached on (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





EY 7/25/2022
/ERIC YEN/Primary Examiner, Art Unit 2658