DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
The action is responsive to the Applicant’s Amendment filed on 4/08/2022. Claims 1, 3-5, 13, 15-17, 19 and 20 are pending in the application. Claims 1, 3, 4, 13, 15, 19 and 20 are amended. Claim 8 has been canceled.
Applicant’s amendments to the claims have overcome each and every objection previously set forth in the Non-Final Office Action mailed 11/09/2021.

Response to Arguments
Applicant’s arguments with respect to the rejections of claims 1, 3-5, 13, 15-17, 19 and 20 have been fully considered. In view of the claim amendments filed, the previous rejections have been withdrawn. However, upon further consideration, new grounds of rejection are made.
Further, regarding the new limitations recited in claims 1, 3, 4, 13, 15, 19 and 20, it is submitted that they are properly addressed by the new ground of rejection.
Furthermore, it is also submitted that all limitations in pending claims, including those not specifically argued, are properly addressed. The reason is set forth in the rejections. See claim analysis below for detail.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-4, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yoon et al. (US 20200372025 A1, hereinafter Yoon) in view of Orlin et al. (US 20130117012, hereinafter Orlin).

Regarding Claim 1, Yoon discloses a method of obtaining a response to a query inputted by a user, the method comprising: 
receiving a user inputted query (Fig. 1; [0023]: Generally, a user may input a question as natural language query on target corpus 150 via an interface component of client device 105); 
encoding, using a trained model, the user inputted query to produce a context vector (Fig. 2; [0028]: The embedding vectors can be processed to generate context representation(s) 220, which encode contextual information for a particular question and candidate answer; In this embodiment, answer selection component 200 includes a language model 210 which generates embedding vectors for the question and candidate answer, which are processed to capture context representation 220), 
the encoding comprising the following: segmenting the user inputted query into words ([0027]: Generally, language model 210 generates embedding vectors from a particular natural language question Q… The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.) [Q corresponds to the query]); and 
matching units from a vocabulary (Fig. 1, target corpus 150) to parts of the words (Fig. 1; [0023]: Generally, environment 100 is suitable for identifying textual similarity between units of textual information; [0026]: comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation)), the vocabulary comprising a plurality of units, wherein each of the units comprises a given number of characters and has a unit length corresponding to the given number of characters ([0027]: The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.)… Q and A are lengths of the sequence in Q and A, respectively. See also para [0015], [0021], [0024]); 
retrieving responses with associated response vectors ([0016]: The textual-similarity computing model can be used to select an answer (or a top number of answers) from a set of candidate answers of a target corpus; Fig. 2; [0026]: …comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation) [the matched representation for the answer corresponds to the response]); 
scoring response vectors against the context vector, wherein the scoring is a measure of the similarity between the context vector and a response vector (Fig. 2, [0026]: The matched representations with latent clustering information can be aggregated 290 and optionally normalized 295 to generate a matching score quantifying textual similarity between the question and candidate answer); and 
outputting the responses with the closest response vectors ([0038]: Fig. 3; As a user enters a question, the candidate answers from the selected target document can be evaluated for textual similarity with the question, and the top answer can be presented or otherwise identified in answer field 320), 
wherein the model has been trained using corresponding queries and responses such that an encoding is used that maximizes the similarity between the response vector and context vector for a corresponding query and response ([0016]: The textual-similarity computing model can be used to select an answer (or a top number of answers) from a set of candidate answers of a target corpus. For example, a query with a natural language question can be encoded, paired with each of a plurality of candidate answers from the target corpus, and fed into the textual-similarity computing model to compute a matching score for each question and answer pair. The candidate answer with the best matching score (or the candidate answers with the top scores) can be selected and presented as an answer(s) to the question).
However, Yoon does not explicitly teach “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
On the other hand, in the same field of endeavor, Orlin teaches the matching of units from the vocabulary to parts of the words comprising: for each word: 
(a) identifying an unmatched part of the word (Fig. 11; [0006]: systems and methods are provided for identifying a set of word-grams based on a set of unmatched words included in the term; [0070]: At reference numeral 1104, a word-gram length, S, is set to the number of unmatched words, W, in the term. At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero [a word-gram of length S > 0 identifies an unmatched part of the word]); 
(b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length (Fig. 2; [0043]: The manager component 208 can be further configured to determine if there is more than one word-gram within the predetermined threshold of the known domain, and associate the word-gram having the longest length (S), with the known domain. [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero… then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-grams of size S correspond to a unit having a longest unit length. See also Table 1]); 
(c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word (Fig. 11; [0070]: At reference numeral 1110, the word-grams included in the set of word-grams generated at reference numeral 1108 are compared to a set of known values for the unmatched domains, D); 
(d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched ([0071]-[0072]: If a match has been found (Y at reference numeral 1112), then the methodology advances to reference numeral 1202 (See FIG. 12)... Referring now to FIG. 12, at reference numeral 1202, in response to finding a match, at reference numeral 1112, the word-gram is matched to the domain, and the domain is marked as matched); or 
(ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word (Fig. 11; [0071]: At reference numeral 1112... If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106. [Setting S as S minus 1 identifies the unit as having been compared. See also para [0072]]); Page 2 of 13 DM-#8158189Applicant: Steedman Henderson Application Serial No.: 16/709,529 Atty Docket No.: 01635JB.020194 
(e) determining whether any unmatched part of the word remains (Fig. 11; [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero. If the word-gram length, S, is not equal to zero (N at reference numeral 1106), then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-gram of size S > 0 indicates an unmatched part of the word remains]); and 
(f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words ([0071]: If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yoon with the teachings of Orlin to include “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
The motivation for doing so would be to compare unmatched words to known values to identify the most relevant words in a sentence, as recognized by Orlin ([0006] of Orlin: In accordance therewith, a method is provided that includes inspecting a term, determining a set of domains related to the term, identifying a set of word-grams based on a set of unmatched words included in the term).
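For illustration only, the matching steps (a) through (f) recited in claim 1 can be read as a greedy longest-match segmentation of each word. The following is a minimal sketch under stated assumptions, not the Applicant's or either reference's implementation: the function name `match_units`, the left-to-right prefix comparison, and the `<unk>` fallback for characters that no unit covers are all hypothetical choices that the claim language does not specify.

```python
def match_units(word, vocabulary, unknown="<unk>"):
    """Greedy longest-match segmentation of one word into vocabulary units.

    Assumptions (not taken from the claims): matching proceeds left to right,
    and a character covered by no unit is emitted as `unknown`.
    """
    # Units ordered longest first, so the next unit "not yet compared"
    # is always the longest remaining candidate (step (b)).
    candidates = sorted((u for u in vocabulary if u), key=len, reverse=True)
    matched = []   # units associated with parts of the word, in order
    start = 0      # index at which the unmatched part begins
    while start < len(word):            # steps (e)/(f): unmatched part remains
        remaining = word[start:]        # step (a): the unmatched part
        for unit in candidates:
            # step (c): compare the unit to the unmatched part
            if remaining.startswith(unit):
                matched.append(unit)    # step (d)(i): associate and mark matched
                start += len(unit)
                break
            # step (d)(ii): no match, so this unit counts as compared;
            # the loop moves on to the next-longest unit
        else:
            # Hypothetical fallback so the loop always terminates.
            matched.append(unknown)
            start += 1
    return matched


units = match_units("unhappiness", {"un", "u", "h", "happi", "happy", "ness"})
print(units)
```

In this sketch the context vector would then be computed from the returned units, which is how the "wherein" clause of step (f) ties the segmentation back to the encoding.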

Regarding Claim 3, the combined teachings of Yoon and Orlin disclose a method according to claim 1, wherein the model is configured to represent each unit in the sequence as an embedding vector, wherein at least one of the units in the vocabulary is an incomplete word (Fig. 2; [0027]: The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.), and language model 210 may be configured to generate an embedding vector for each of a plurality of sub-divisions (e.g., phoneme, word, sentence, paragraph, some other suitable sub-division, some combination thereof, etc.) [phoneme corresponds to an incomplete word]).

Regarding Claim 4, the combined teachings of Yoon and Orlin disclose a method according to claim 3, wherein the model is configured to combine the embedding vectors to produce the context vector (Fig. 2; [0002]: The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) in the textual-similarity computing model; [0028]: The embedding vectors can be processed to generate context representation(s) 220, which encode contextual information for a particular question and candidate answer).
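
As an illustrative aid, the relationship between the per-unit embedding vectors of claims 3 and 4 and the scoring and output steps of claim 1 can be sketched as follows. This is a hedged example only: averaging the embeddings (`mean_pool`) and using cosine similarity (`cosine`) are assumptions chosen for simplicity, since neither the claims nor the cited passages of Yoon name a specific combination or similarity function, and `rank_responses` is a hypothetical helper.

```python
import math


def mean_pool(vectors):
    """Combine per-unit embedding vectors into one context vector by averaging.

    Averaging is an assumption; the claims only require that the embedding
    vectors be combined.
    """
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, used here as one possible similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_responses(context, responses, top_k=1):
    """Score every response vector against the context vector; return the closest."""
    scored = [(cosine(context, vector), text) for text, vector in responses.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy two-dimensional embeddings for two matched units (made-up values).
context = mean_pool([[1.0, 0.0], [0.0, 1.0]])
candidates = {"response A": [1.0, 1.0], "response B": [1.0, -1.0]}
best = rank_responses(context, candidates)
print(best)
```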

Regarding Claim 19, Yoon discloses a dialogue system for obtaining a response to a query inputted by a user ([0005]: FIG. 1 is a block diagram of an example computing system for automatic answer selection), the system comprising: 
an input for receiving a user inputted query (Fig. 1; [0023]: Generally, a user may input a question as natural language query on target corpus 150 via an interface component of client device 105); 
a processor ([0045]: With reference to FIG. 6, computing device 600 includes bus 610 that directly or indirectly couples the following devices: memory 612, one or more processors 614), configured to:  
encode, using a trained model, the user inputted query to produce a context vector (Fig. 2; [0028]: The embedding vectors can be processed to generate context representation(s) 220, which encode contextual information for a particular question and candidate answer; In this embodiment, answer selection component 200 includes a language model 210 which generates embedding vectors for the question and candidate answer, which are processed to capture context representation 220), 
wherein the model has been trained using corresponding queries and responses such that an encoding is used that maximises the similarity between the response vector and the context vector for a corresponding query and response ([0016]: The textual-similarity computing model can be used to select an answer (or a top number of answers) from a set of candidate answers of a target corpus. For example, a query with a natural language question can be encoded, paired with each of a plurality of candidate answers from the target corpus, and fed into the textual-similarity computing model to compute a matching score for each question and answer pair. The candidate answer with the best matching score (or the candidate answers with the top scores) can be selected and presented as an answer(s) to the question); 
retrieve responses with associated response vectors ([0026]: …comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation) [the matched representation for the answer corresponds to the response]); 
score response vectors against the context vector wherein the scoring is a measure of the similarity between the context vector and a response vector ([0026]: The matched representations with latent clustering information can be aggregated 290 and optionally normalized 295 to generate a matching score quantifying textual similarity between the question and candidate answer);  and 
select the responses with the closest response vectors, and an output, configured to output speech or text corresponding to the selected responses ([0038]: Fig. 3; As a user enters a question, the candidate answers from the selected target document can be evaluated for textual similarity with the question, and the top answer can be presented or otherwise identified in answer field 320).
However, Yoon does not explicitly teach “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
On the other hand, in the same field of endeavor, Orlin teaches the matching of units from the vocabulary to parts of the words comprising: for each word: 
(a) identifying an unmatched part of the word (Fig. 11; [0006]: systems and methods are provided for identifying a set of word-grams based on a set of unmatched words included in the term; [0070]: At reference numeral 1104, a word-gram length, S, is set to the number of unmatched words, W, in the term. At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero [a word-gram of length S > 0 identifies an unmatched part of the word]); 
(b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length (Fig. 2; [0043]: The manager component 208 can be further configured to determine if there is more than one word-gram within the predetermined threshold of the known domain, and associate the word-gram having the longest length (S), with the known domain. [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero… then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-grams of size S correspond to a unit having a longest unit length. See also Table 1]); 
(c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word (Fig. 11; [0070]: At reference numeral 1110, the word-grams included in the set of word-grams generated at reference numeral 1108 are compared to a set of known values for the unmatched domains, D); 
(d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched ([0071]-[0072]: If a match has been found (Y at reference numeral 1112), then the methodology advances to reference numeral 1202 (See FIG. 12)... Referring now to FIG. 12, at reference numeral 1202, in response to finding a match, at reference numeral 1112, the word-gram is matched to the domain, and the domain is marked as matched); or 
(ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word (Fig. 11; [0071]: At reference numeral 1112... If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106. [Setting S as S minus 1 identifies the unit as having been compared. See also para [0072]]); 
(e) determining whether any unmatched part of the word remains (Fig. 11; [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero. If the word-gram length, S, is not equal to zero (N at reference numeral 1106), then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-gram of size S > 0 indicates an unmatched part of the word remains]); and 
(f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words ([0071]: If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yoon with the teachings of Orlin to include “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
The motivation for doing so would be to compare unmatched words to known values to identify the most relevant words in a sentence, as recognized by Orlin ([0006] of Orlin: In accordance therewith, a method is provided that includes inspecting a term, determining a set of domains related to the term, identifying a set of word-grams based on a set of unmatched words included in the term).
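For reference, the Orlin flow relied upon above (reference numerals 1104 through 1114 of Fig. 11 and 1202 of Fig. 12, as quoted from [0070]-[0072]) can be approximated by the loop below. This is a rough sketch based only on the passages cited in this action. The data layout (`known_values` as a mapping from domain name to a set of known phrases), the function names, and the handling after a match (removing the matched words and resetting S to the number of words still unmatched) are assumptions and are not confirmed by the quoted text.

```python
def word_grams(words, size):
    """Split a list of words into contiguous word-grams of the given size."""
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]


def match_term(term, known_values):
    """Match word-grams of a term to domains, longest word-gram first.

    `known_values` maps a domain name to a set of known phrases
    (a hypothetical layout used only for this sketch).
    """
    unmatched_words = term.split()
    unmatched_domains = dict(known_values)
    matches = {}
    size = len(unmatched_words)                # 1104: S = number of unmatched words W
    while size > 0:                            # 1106: is S equal to zero?
        found = None
        for gram in word_grams(unmatched_words, size):        # 1108: grams of size S
            phrase = " ".join(gram)
            for domain, values in unmatched_domains.items():  # 1110: compare to known values
                if phrase in values:
                    found = (domain, gram)
                    break
            if found:
                break
        if found is None:                      # 1112: no match found
            size -= 1                          # 1114: S = S - 1, return to 1106
            continue
        domain, gram = found                   # 1202: match word-gram to the domain
        matches[domain] = " ".join(gram)
        del unmatched_domains[domain]          # domain marked as matched
        # Assumption: drop the matched words and restart with the remaining ones.
        index = word_grams(unmatched_words, size).index(gram)
        unmatched_words = unmatched_words[:index] + unmatched_words[index + size:]
        size = len(unmatched_words)
    return matches


result = match_term(
    "red leather jacket",
    {"color": {"red"}, "material": {"leather"}, "product": {"leather jacket", "jacket"}},
)
print(result)
```

Note that this loop operates on whole words within a term, whereas the claimed steps operate on parts of a single word; the sketch is meant only to make the decrementing-length comparison concrete.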

Regarding Claim 20, Yoon discloses a carrier medium comprising computer readable code ([0046]: Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data) configured to cause a computer to perform the following for obtaining a response to a query inputted by a user: 
receiving a user inputted query (Fig. 1; [0023]: Generally, a user may input a question as natural language query on target corpus 150 via an interface component of client device 105); 
encoding, using a trained model, the user inputted query to produce a context vector (Fig. 2; [0028]: The embedding vectors can be processed to generate context representation(s) 220, which encode contextual information for a particular question and candidate answer; In this embodiment, answer selection component 200 includes a language model 210 which generates embedding vectors for the question and candidate answer, which are processed to capture context representation 220);
the encoding comprising the following: segmenting the user inputted query into words ([0027]: Generally, language model 210 generates embedding vectors from a particular natural language question Q… The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.) [Q corresponds to the query]); and 
matching units from a vocabulary (Fig. 1, target corpus 150) to parts of the words (Fig. 1; [0023]: Generally, environment 100 is suitable for identifying textual similarity between units of textual information; [0026]: comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation)), the vocabulary comprising a plurality of units, wherein each of the units comprises a given number of characters and has a unit length corresponding to the given number of characters ([0027]: The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.)… Q and A are lengths of the sequence in Q and A, respectively. See also para [0015], [0021], [0024]); 
retrieving responses with associated response vector ([0026]: …comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation) [the matched representation for the answer corresponds to the response]);  
scoring response vectors against the context vector, wherein the scoring is a measure of the similarity between the context vector and a response vector ([0026]: The matched representations with latent clustering information can be aggregated 290 and optionally normalized 295 to generate a matching score quantifying textual similarity between the question and candidate answer); and 
outputting the responses with the closest response vectors ([0038]: Fig. 3; As a user enters a question, the candidate answers from the selected target document can be evaluated for textual similarity with the question, and the top answer can be presented or otherwise identified in answer field 320), wherein the model has been trained using corresponding queries and responses such that an encoding is used that maximises the similarity between the response vector and context vector for a corresponding query and response ([0016]: The textual-similarity computing model can be used to select an answer (or a top number of answers) from a set of candidate answers of a target corpus. For example, a query with a natural language question can be encoded, paired with each of a plurality of candidate answers from the target corpus, and fed into the textual-similarity computing model to compute a matching score for each question and answer pair. The candidate answer with the best matching score (or the candidate answers with the top scores) can be selected and presented as an answer(s) to the question).
However, Yoon does not explicitly teach “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
On the other hand, in the same field of endeavor, Orlin teaches the matching of units from the vocabulary to parts of the words comprising: for each word: 
(a) identifying an unmatched part of the word (Fig. 11; [0006]: systems and methods are provided for identifying a set of word-grams based on a set of unmatched words included in the term; [0070]: At reference numeral 1104, a word-gram length, S, is set to the number of unmatched words, W, in the term. At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero [a word-gram of length S > 0 identifies an unmatched part of the word]); 
(b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length (Fig. 2; [0043]: The manager component 208 can be further configured to determine if there is more than one word-gram within the predetermined threshold of the known domain, and associate the word-gram having the longest length (S), with the known domain. [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero… then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-grams of size S correspond to a unit having a longest unit length. See also Table 1]); 
(c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word (Fig. 11; [0070]: At reference numeral 1110, the word-grams included in the set of word-grams generated at reference numeral 1108 are compared to a set of known values for the unmatched domains, D); 
(d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched ([0071]-[0072]: If a match has been found (Y at reference numeral 1112), then the methodology advances to reference numeral 1202 (See FIG. 12)... Referring now to FIG. 12, at reference numeral 1202, in response to finding a match, at reference numeral 1112, the word-gram is matched to the domain, and the domain is marked as matched); or 
(ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word (Fig. 11; [0071]: At reference numeral 1112... If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106. [Setting S as S minus 1 identifies the unit as having been compared. See also para [0072]]); 
(e) determining whether any unmatched part of the word remains (Fig. 11; [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero. If the word-gram length, S, is not equal to zero (N at reference numeral 1106), then at reference numeral 1108, a set of word-grams can be generated by splitting the term is into word-grams of size S [word-gram of size S>0 indicates an unmatched part of the word remains]); and 
(f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words ([0071]: If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yoon with the teachings of Orlin to include “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words.”
The motivation for doing so would be to compare unmatched words to known values to identify the most relevant words in a sentence, as recognized by Orlin ([0006] of Orlin: In accordance therewith, a method is provided that includes inspecting a term, determining a set of domains related to the term, identifying a set of word-grams based on a set of unmatched words included in the term).
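For illustration only, and without altering the claim mapping above, steps (a)-(f) as recited may be sketched as a greedy, longest-unit-first matching loop. The function name match_units, the choice to compare each unit against the start of the unmatched part, and the single-character fallback when no unit matches are assumptions made for this sketch; they are not taken from the claims, Yoon, or Orlin.

```python
def match_units(word, vocabulary):
    """Greedy longest-unit-first matching of vocabulary units to a word.

    Returns the matched units in order. A character that no unit matches is
    emitted as its own unit (a fallback assumed for this sketch only).
    """
    matched = []
    start = 0
    # (e)/(f): keep going while an unmatched part of the word remains.
    while start < len(word):
        remaining = word[start:]  # (a) the current unmatched part
        # (b) candidate units, longest first; each is tried at most once
        # per unmatched part.
        candidates = sorted(
            (u for u in vocabulary if 0 < len(u) <= len(remaining)),
            key=len,
            reverse=True,
        )
        for unit in candidates:
            # (c) compare the unit to the unmatched part (prefix comparison
            # is an assumption of this sketch).
            if remaining.startswith(unit):
                # (d)(i) associate the unit and mark that part as matched.
                matched.append(unit)
                start += len(unit)
                break
            # (d)(ii) no match: the unit counts as compared; try the next.
        else:
            # No unit matched: fall back to a single character.
            matched.append(remaining[0])
            start += 1
    return matched


vocab = {"un", "match", "matched", "ed", "u", "n"}
units = match_units("unmatched", vocab)
```

With this hypothetical vocabulary, the word "unmatched" is split into "un" followed by "matched", because the longer unit "matched" is tried before "match".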

Claims 13, 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Yoon et al. (US 20200372025 A1, hereinafter Yoon) in view of Orlin et al. (US 20130117012, hereinafter Orlin), and in further view of Vargas et al. (US Patent No. 10482183 B1, hereinafter Vargas).

Regarding Claim 13, Yoon discloses a method of training a response retrieval system to provide a response to a query inputted by a user ([0012]: The conventional compare-aggregate model is trained using a list-wise approach in which a question Q is paired with a set of valid answers A), the method comprising: 
providing a set of training data, wherein the training data set comprises queries and corresponding responses ([0012]: … and a training set comprising question-answer sets and corresponding labels are used to train the model using KL-divergence loss); 
encoding, using a first model, each user inputted query to produce a context vector (Fig. 2; [0028]: The embedding vectors can be processed to generate context representation(s) 220, which encode contextual information for a particular question and candidate answer);
the encoding comprising the following: segmenting the user inputted query into words ([0027]: Generally, language model 210 generates embedding vectors from a particular natural language question Q… The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.) [Q corresponds to the query]); and 
matching units from a vocabulary (Fig. 1, target corpus 150) to parts of the words (Fig. 1; [0023]: Generally, environment 100 is suitable for identifying textual similarity between units of textual information [0026]: comparison 240 matches words (or some other sub-division) in the context representation 220 to the corresponding attention-applied vector representation to generate a matched representation (e.g., separate representations for the question and answer, a combined representation)), the vocabulary comprising a plurality of units, wherein each of the units comprises a given number of characters and has a unit length corresponding to the given number of characters ([0027]: The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.)… Q and A are lengths of the sequence in Q and A, respectively. See also para [0015], [0021], [0024]); 
However, Yoon does not explicitly teach “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words; encoding each response to produce a response vector using a second model; and training the first and second models using the condition that the similarity between the context vector and the response vector is higher for a corresponding response and query and that the similarity between the context vector and the response vector is lower for a random response and query.”
On the other hand, in the same field of endeavor, Orlin teaches the matching of units from the vocabulary to parts of the words comprising: for each word: 
(a) identifying an unmatched part of the word (Fig. 11; [0006]: systems and methods are provided for identifying a set of word-grams based on a set of unmatched words included in the term; [0070]: At reference numeral 1104, a word-gram length, S, is set to the number of unmatched words, W, in the term. At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero [a word-gram of length S>0 identifies an unmatched part of the word]); 
(b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length (Fig. 2; [0043]: The manager component 208 can be further configured to determine if there is more than one word-gram within the predetermined threshold of the known domain, and associate the word-gram having the longest length (S), with the known domain. [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero… then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-grams of size S correspond to a unit having a longest unit length. See also Table 1]); 
(c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word (Fig. 11; [0070]: At reference numeral 1110, the word-grams included in the set of word-grams generated at reference numeral 1108 are compared to a set of known values for the unmatched domains, D); 
(d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched ([0071]-[0072]: If a match has been found (Y at reference numeral 1112), then the methodology advances to reference numeral 1202 (See FIG. 12)... Referring now to FIG. 12, at reference numeral 1202, in response to finding a match, at reference numeral 1112, the word-gram is matched to the domain, and the domain is marked as matched); or 
(ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word (Fig. 11; [0071]: At reference numeral 1112... If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106. [Setting S as S minus 1 identifies the unit as having been compared. See also para [0072]]); 
(e) determining whether any unmatched part of the word remains (Fig. 11; [0070]: At reference numeral 1106, a determination is made whether the word-gram length, S, is equal to zero. If the word-gram length, S, is not equal to zero (N at reference numeral 1106), then at reference numeral 1108, a set of word-grams can be generated by splitting the term into word-grams of size S [word-gram of size S>0 indicates an unmatched part of the word remains]); and 
(f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words ([0071]: If a match has not been found (N at reference numeral 1112), then at reference numeral 1114 S is set as S minus 1, and the methodology returns to reference numeral 1106).
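For illustration only, the Orlin flow at reference numerals 1104-1114 and 1202, as quoted above, may be sketched as follows. The function names, the dictionary layout of the known values, and in particular the handling after a match (dropping the matched words and restarting on the remainder) are assumptions made for this sketch; the quoted passages do not describe the steps of FIG. 12 beyond marking the domain as matched.

```python
def word_grams(words, size):
    """All contiguous runs of `size` words, paired with their start index."""
    return [(i, " ".join(words[i:i + size]))
            for i in range(len(words) - size + 1)]


def match_term(term, known_values):
    """Match word-grams of a term to domains, longest word-gram first.

    known_values maps a domain name to a set of known values (a hypothetical
    data layout). Returns {domain: matched word-gram}.
    """
    words = term.split()              # unmatched words, W
    unmatched = dict(known_values)    # unmatched domains, D
    matches = {}
    size = len(words)                 # 1104: S = number of unmatched words
    while size > 0:                   # 1106: stop once S equals zero
        hit = None
        for start, gram in word_grams(words, size):     # 1108
            for domain, values in unmatched.items():    # 1110
                if gram in values:
                    hit = (start, gram, domain)
                    break
            if hit is not None:
                break
        if hit is None:               # 1112 (N)
            size -= 1                 # 1114: S = S - 1
            continue
        start, gram, domain = hit     # 1202: match and mark the domain
        matches[domain] = gram
        del unmatched[domain]
        # Assumption: drop the matched words and restart on the remainder.
        words = words[:start] + words[start + size:]
        size = len(words)
    return matches


known = {"city": {"new york"}, "color": {"red"}}
result = match_term("red new york taxi", known)
```

In this hypothetical example the two-word gram "new york" is matched before the one-word gram "red", since longer word-grams are compared first.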
Additionally, Vargas teaches encoding each response to produce a response vector ([Col. 2, lines 38-42]: By operating in word embedding space, embodiments may become more efficient and effective by making use of the semantic meaning encoded within the embeddings; [Col. 14, lines 11-15]: The system may then be able to determine the most similar predefined phrase and then respond with a predefined response that is associated with that predefined phrase. The predefined phrases may be stored as sets of embedding vectors) using a second model ([Abstract]: A computer-implemented method comprising:… calculating a second likelihood-based measure representing how well a second model can be fit to the first and second sets of words); and 
training the first and second models ([Col. 6, lines 66-67]: The embedding service 9 generates vector representations based on machine learning models that have been trained on training data) using the condition that the similarity between the context vector and the response vector is higher for a corresponding response and query and that the similarity between the context vector and the response vector is lower for a random response and query ([Col. 7, lines 42-49]: It can be seen from the above description that the effectiveness of retrieval of responses to queries depends strongly on the ability to determine the similarity between the queries and predetermined queries (that have predetermined responses) stored in the content database 13; [Col. 8, lines 37-58]: FIG. 3 shows a flow chart of a method 300 of determining the similarity between two sets of words in accordance with an embodiment… These probability values are then used to determine the similarity score [The similarity score will be higher for a corresponding response and query and lower for a random response]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Yoon with the teachings of Orlin and Vargas to include “the matching of units from the vocabulary to parts of the words comprising: for each word: (a) identifying an unmatched part of the word; (b) identifying, from units that have not yet been compared to the part of the word, a unit having a longest unit length; (c) comparing the unit identified to the unmatched part of the word to determine whether the unit matches the part of the word; (d) (i) in response to determining that the unit matches the part of the word, associating the unit with the part of the word and identifying the part of the word as having been matched; or (ii) in response to determining that the unit does not match the part of the word, identifying the unit as having been compared to the part of the word; (e) determining whether any unmatched part of the word remains; and (f) in response to determining that an unmatched part of the word remains, returning to step (a), wherein the context vector is based on the units matched with the one or more parts of the words; encoding each response to produce a response vector using a second model; and training the first and second models using the condition that the similarity between the context vector and the response vector is higher for a corresponding response and query and that the similarity between the context vector and the response vector is lower for a random response and query.”
The motivation for doing so would be to compare unmatched words to known values to identify the most relevant words in a sentence, as recognized by Orlin ([0006] of Orlin: In accordance therewith, a method is provided that includes inspecting a term, determining a set of domains related to the term, identifying a set of word-grams based on a set of unmatched words included in the term), and to represent the similarity between the query and response, as recognized by Vargas ([Abstract] of Vargas: A computer-implemented method comprising:… calculating a similarity score based on a ratio of the first likelihood measure to the second likelihood measure, the similarity score being representative of the similarity between the first and second sets of words).
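For illustration only, the recited training condition (higher similarity for a corresponding query and response, lower similarity for a random pairing) may be sketched as a softmax ranking loss over a batch, in which the other responses in the batch serve as the random responses. The dot-product similarity, the loss form, and all names below are assumptions made for this sketch; neither the claim nor the cited passages of Vargas specify a particular loss function.

```python
import math


def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ranking_loss(context_vecs, response_vecs):
    """Mean negative log-probability of each query's own response.

    context_vecs[i] and response_vecs[i] form a corresponding pair; every
    other response in the batch is treated as a random (negative) response.
    """
    total = 0.0
    for i, context in enumerate(context_vecs):
        scores = [dot(context, response) for response in response_vecs]
        # Numerically stable log-sum-exp over all candidate responses.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_norm - scores[i]
    return total / len(context_vecs)


contexts = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each response matches its query
shuffled = [[0.0, 1.0], [1.0, 0.0]]     # responses paired at random

loss_aligned = ranking_loss(contexts, aligned)
loss_shuffled = ranking_loss(contexts, shuffled)
```

Minimizing such a loss pushes the similarity of corresponding pairs above that of random pairs; here the aligned pairing yields the lower loss.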

Regarding Claim 15, the combined teachings of Yoon, Orlin, and Vargas disclose a method of training according to claim 13, and 
Yoon further teaches wherein the first model is configured to segment an inputted query into a sequence of units from a vocabulary of units ([0015]: As such, the target corpus can be split into a desired set of candidate answers (e.g., sentences, phrases, paragraphs, sections, sub-divisions, etc.), the candidate answers can be encoded into corresponding vector representations, and the vector representations can be clustered into a designated number of latent memory vectors)
and represent each unit in the sequence as an embedding vector, wherein at least one of the units in the vocabulary is an incomplete word (See Yoon, Fig.2; [0027]: The sequence in Q and/or A may comprise units of any suitable sub-division of text (e.g., phoneme, word, phrase, sentence, etc.), and language model 210 may be configured to generate an embedding vector for each of a plurality of sub-divisions (e.g., phoneme, word, sentence, paragraph, some other suitable sub-division, some combination thereof, etc.) [phoneme corresponds to an incomplete word]).

Regarding Claim 16, the combined teachings of Yoon, Orlin, and Vargas disclose a method of training according to claim 13, and
Vargas further teaches wherein the second model uses at least some of the parameters of the first model (See Vargas, [Col. 3, lines 6-11]: According to one embodiment the first model comprises a shared set of parameters that describe the shared parametric distribution and the second model comprises first and second sets of parameters, the first set of parameters describing the first parametric distribution and the second set of parameters describing the second parametric distribution).

Regarding Claim 17, the combined teachings of Yoon, Orlin, and Vargas disclose a method of training according to claim 13, and 
Vargas further teaches wherein the second model is configured to segment an inputted response into a sequence of units from the vocabulary of units and represent each unit in the sequence as an embedding vector (See Vargas, [Col. 6, lines 66-68]: The embedding service 9 generates vector representations based on machine learning models that have been trained on training data; Fig. 5; [Col. 13, lines 48-52]: The embedding module 513 is operable to retrieve the sets of words for comparison and determine embedding vectors for the words (for instance, by multiplying word vectors for the words with an embedding matrix)), 
wherein the first model uses at least some of the parameters of the second model (See Vargas, [Abstract]: A computer-implemented method comprising: receiving the first set of words and the second set of words… the first model comprising a shared parametric distribution representing both the first and second sets of words).
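For illustration only, the limitations of claims 15-17 (each model represents a sequence of units as embedding vectors, and each model uses at least some parameters of the other) may be sketched with two encoders that hold a reference to one embedding table. The Encoder class, the averaging of unit embeddings, and the per-model scale parameter are assumptions made for this sketch and are not drawn from Yoon or Vargas.

```python
class Encoder:
    """Averages unit embeddings, then applies a model-specific scale."""

    def __init__(self, embeddings, scale):
        self.embeddings = embeddings  # shared table (same object, not a copy)
        self.scale = scale            # parameter private to this model

    def encode(self, units):
        dim = len(next(iter(self.embeddings.values())))
        total = [0.0] * dim
        for unit in units:
            for k, x in enumerate(self.embeddings[unit]):
                total[k] += x
        return [self.scale * t / len(units) for t in total]


# One embedding per unit in a hypothetical two-unit vocabulary.
shared = {"un": [1.0, 0.0], "matched": [0.0, 1.0]}
query_encoder = Encoder(shared, scale=1.0)      # first model
response_encoder = Encoder(shared, scale=2.0)   # second model

query_vec = query_encoder.encode(["un", "matched"])
response_vec = response_encoder.encode(["un", "matched"])
```

Because both encoders reference the same table, an update to a unit embedding changes the output of both models, while the scale remains specific to each model.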

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Yoon et al. (US 20200372025 A1, hereinafter Yoon) in view of Orlin et al. (US 20130117012, hereinafter Orlin), and in further view of Mugan et al. (US Patent No. 10445356 B1, hereinafter Mugan).

Regarding Claim 5, the combined teachings of Yoon and Orlin disclose a method according to claim 1.
However, the combined teachings of Yoon and Orlin do not explicitly teach “wherein there are 30 000 to 50 000 units in the vocabulary”.
On the other hand, in the same field of endeavor, Mugan teaches wherein there are 30 000 to 50 000 units in the vocabulary ([Col. 12, lines 56-58]: In typical embodiments, the vocabulary may include 30,000 to 100,000 words (w), and the vector length m may be between 100 to 1,000).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the method of Yoon and Orlin with the teachings of Mugan to include “wherein there are 30 000 to 50 000 units in the vocabulary”.
The motivation for doing so would be to predict the probability distribution over the next word in a sentence using the number of words in the vocabulary, as recognized by Mugan ([Col. 6, lines 36-43] of Mugan: FIG. 2 illustrates an exemplary basic RNN 200 for predicting the probability distribution over the next word in a sentence…  the number of words in the vocabulary is given by w).



Examiner Note
Examiner has cited particular columns/paragraphs and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
In the case of amending the claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution. MPEP 714.02 recites: "Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714." Amendments not pointing to specific support in the disclosure may be deemed as not complying with provisions of 37 CFR 1.121(b), (c), (d), and (h) and therefore held not fully responsive. Generic statements such as "Applicants believe no new matter has been introduced" may be deemed insufficient.
 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY D. HICKS whose telephone number is (571)272-3304.  The examiner can normally be reached on Mon - Fri 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya can be reached on (571) 272-4034.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/S D H/Examiner, Art Unit 2168
/IRETE F EHICHIOYA/Supervisory Patent Examiner, Art Unit 2168