DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Method and System Using Contextual Display Data for an Utterance in Natural Language Understanding.
The disclosure is objected to because of the following informalities:
In ¶[0001], U.S. Patent Application No. 15/858,174 should be updated as “now U.S. Patent No. 10,515,625 issued 24 December 2019”.   
Appropriate correction is required.

Claim Objections
Claims 2 to 16 and 21 to 25 are objected to because of the following informalities:
Independent claims 2 and 12 set forth a limitation of “third input data comprising a plurality of elements, wherein a first element of the plurality of elements represents a first content item”, but “a first content item” should be “the first content item” because “a first content item” has a prior recitation in a step of “receiving contextual data . . . wherein a first content item of the plurality of content items is associated. . . .”

Claim 10 sets forth “wherein the second NLU output data represents label”, which should be “wherein the second NLU output data represents a label”.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2 to 5, 9 to 15, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Sarikaya et al. (U.S. Patent No. 9,767,091) in view of Han et al. (U.S. Patent Publication 2011/0029301).
Concerning independent claims 2 and 12, Sarikaya et al. discloses a method and system for natural language understanding, comprising:
“as performed by a computing system comprising one or more computer processors configured to execute specific instructions” – components of computing device 500 with which aspects may be practiced include at least one processing unit 502 (column 14, Lines 1 to 12: Figure 5);

“receiving contextual data, wherein the contextual data represents a plurality of content items displayed by the computing device when the utterance occurred, and wherein a first content item of the plurality of content items is associated with domain data representing a first domain of a plurality of domains” – domain set predictor 135 receives a set of possible domains (“a plurality of domains”) identified by natural language analysis component 130, and adds, modifies, or changes the possible domains based on other contextual information (“receiving contextual data”), which may include information previously received, turn-based information, and display information (“the contextual data represents a plurality of content items displayed by the computing device when the utterance occurred”) may be used to refine the set of possible domains (column 6, lines 40 to 51: Figure 1); contextual information may include information extracted from each turn in a turn; contextual information may include a response to a previous turn by dynamic system 100, where the response to a previous turn may include items located on the display of the client computing device (“the contextual data represents a plurality of content items displayed by the computing device when the utterance occurred”) (column 10, lines 29 to 42: Figure 1); implicitly, what is currently being displayed is generally associated with one of the domains, e.g., a map for a mapping application domain or a music player for a music playing application domain 
“generating natural language understanding (‘NLU’) input data for an NLU subsystem, wherein the NLU input data comprises: first input data representing at least a portion of the utterance” – input is sent to extraction component 120 that has a character n-gram extractor 122 (column 5, lines 35 to 49: Figure 1); word n-gram extractor component 124 may parse a natural language expression (column 5, lines 60 to 63: Figure 1); extraction component 120 sends extracted n-grams to natural language analysis component 130 (“an NLU subsystem”) (column 6, lines 3 to 6: Figure 1); here, “first input data” is n-grams representing text of spoken input, where these n-grams represent “at least a portion of the utterance”, which are sent to natural language analysis component 130 for “generating natural language understanding (‘NLU’) input data”; 
“second input data that indicates that content associated with the first domain was displayed when the utterance occurred” – once the set of possible domains is identified, the identified possible domains along with relevant n-grams and contextual information are passed to hypothesis generation component 140; hypothesis generation component 140 analyzes the information and assigns a confidence number to one or more user intents associated with each possible domain predicted by domain set predictor 135; a confidence score is a ranking, e.g., a number between 0 and 1, which represents the likelihood that a particular predicted intent is the actual intent of the user (column 6, lines 52 to 62: Figure 1); contextual information may include information previously received, turn-based information, and display information may be used to 
“generating NLU output data using the NLU subsystem, the first input data, and the second input data, [and the third input data,] wherein the NLU output data represents a correspondence of the utterance to intent data associated with the first domain” – a set of possible domains is identified and a confidence score is given to each predicted intent for each possible domain; information may be sent to domain component 150; the information may include the predicted intent of the user, one or more n-grams, and any contextual information; domain component 150 is associated with one or more domain applications including a first domain application 160, a second domain application 170, and a third domain application 180 (column 8, line 63 to column 9, line 6: Figure 1); here, “NLU output data” is a predicted intent of a domain, where the predicted intent is based on a score for that intent in the domain; moreover, an intent and a domain are based on n-grams of the input speech (“the first input data”) and scores for the intent and domain (“the second input data”); a score for an intent and a domain are “a correspondence of the utterance to intent data associated with the first domain”;
“sending the intent data to the first domain” – upon receiving a predicted intent from hypothesis generation component 140, domain component 150 opens an application, e.g., first domain application 160, that corresponds to the predicted intent; information relevant to the domain applications that is stored by or accessible to domain component 150 may be sent by domain component 150 to any of the domain applications associated with domain component 150; if first domain application 160 is a 
Concerning independent claims 2 and 12, Sarikaya et al. discloses that a domain is predicted based on contextual information that includes display information, where this contextual information may include items located on a display of a client computing device.  (Column 6, Lines 40 to 49; Column 10, Lines 29 to 42)  Sarikaya et al., then, discloses contextual data representing “a plurality of content items displayed by the computing device” because it is stated that contextual information may include items located on a display of a client computing device.  Similarly, this contextual information provides “second input data that indicates content associated with the first domain was displayed” due to domain prediction being based on contextual information of items on the display.  Moreover, Sarikaya et al. discloses that an input component receives spoken input from a user, so that contextual information is implicitly being used to predict a domain at the same time that a spoken input from a user is received.  Here, Sarikaya et al. discloses receiving “first input data representing at least a portion of the utterance” and “second input data that indicates content associated with the first domain was displayed when the utterance occurred” because domain prediction is based on speech received and contextual information of what is currently being displayed.  However, Sarikaya et al. does not expressly disclose “third input data corresponding to a plurality of elements, wherein a first element of the plurality of elements represents a first content item of the plurality of content items, and wherein a second element of the plurality of elements represents a second content item of the plurality of content items”.  Sarikaya et al. does not expressly disclose this data structure, but appears to presuppose that there can be a plurality of content items currently displayed and that the algorithm is aware of which ones are currently displayed.
Concerning independent claims 2 and 12, Han et al. teaches whatever limitations directed to “third input data” that may be omitted by Sarikaya et al.  Generally, Han et al. teaches recognizing speech according to dynamic display.  (Abstract)  Words displayed as text on a screen may become objects having weights, and a domain may include a group of words that can be recognized as associated with each other.  A domain may be a broad geographic region in a map system, and a specific region may be defined as a domain, e.g., a Rome domain, a Colosseum domain, and a Gladiator domain.  A domain associated with a current screen may be created, and word information and domain information may be acquired for recognized objects.  Speech recognizer 120 may adjust a word weight for at least one word associated with the current screen and a domain weight for at least one domain included in the current screen, so that a language model assigns greater weight to words and domains related to a current screen than weights assigned to words and domains not related to the current screen.  (¶[0050] - ¶[0054]: Figure 2)  Words and domains included in a current screen may be transferred to controller 110 and display information manager 210, where word information may include word IDs, word coordinates, and domain information may include domain IDs and domain area coordinates.  (¶[0065]: Figure 2)  Word weight e.g., Gyeongbokgung, Gyotaejeon, etc., which are displayed on the screen at time t.  
(¶[0107]: Figures 7A to 7B)  Figures 7A to 7B and 11A to 11B, then, illustrate that each of displayed location objects on a map, e.g., Gyeongbokgung, Royal Museum, Galleria Hyundai, National Folk Museum, and Kogsuji, is “a first content item of a plurality of content items”, and “a first element” is a weight assigned to “a first content item”, e.g., a weight value of 0.5 for Gyeongbokgung and “a second element” is a weight assigned to “a second content item”, e.g., a weight value of 0.4 for Royal Museum.  Han et al.’s weights for each displayed object, then, can be construed to be “third input data” corresponding to ‘elements’ for ‘content items’ of Applicants.  An objective is to improve a speech recognition rate and speed by reflecting information for a dynamic display.  (Abstract)  It would have been obvious to one having ordinary skill in the art to provide third input data of first and second elements for first and second content items as taught by Han et al. for natural language understanding using contextual information of Sarikaya et al. for a purpose of improving speech recognition rate and speed.
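For illustration only, the display-based weighting discussed above might be sketched as follows. This is a minimal sketch under stated assumptions and is not represented to be the implementation of Han et al.: the function name rescore, the default weight of 0.1 for words that are not displayed, the competing candidate ‘Gyeongju’, and the base recognizer scores are all hypothetical; only the weights of 0.5 for Gyeongbokgung and 0.4 for Royal Museum are taken from the discussion above.

```python
# Hypothetical sketch: favor recognition candidates that are currently displayed.
# Weights 0.5 and 0.4 come from the discussion above; everything else is assumed.

def rescore(candidates, displayed_weights, default_weight=0.1):
    """Multiply each base score by the display weight of the word.

    candidates: dict mapping candidate word -> base recognizer score (assumed).
    displayed_weights: dict mapping displayed word -> display weight.
    default_weight: assumed weight for words not on the current screen.
    Returns (best_word, rescored_dict).
    """
    rescored = {
        word: score * displayed_weights.get(word, default_weight)
        for word, score in candidates.items()
    }
    best_word = max(rescored, key=rescored.get)
    return best_word, rescored


# One weight ("element") per displayed content item.
displayed = {"Gyeongbokgung": 0.5, "Royal Museum": 0.4}

# Assumed base scores: the non-displayed word scores higher acoustically.
candidates = {"Gyeongbokgung": 0.30, "Gyeongju": 0.45}

best, rescored = rescore(candidates, displayed)
print(best)
```

In this sketch the displayed word is selected even though its assumed base score is lower, which is the general effect attributed to the weighting of displayed words.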
Sarikaya et al. discloses that a set of possible domains is identified, and hypothesis generation component 140 analyzes the information and assigns a confidence number to one or more user intents associated with each possible domain.  The confidence score is a ranking, e.g., a number between 0 and 1, which represents the likelihood that a particular predicted intent is the actual intent of the user.  (Column 6, Lines 52 to 67: Figure 1)  A set of possible domains is identified and a confidence score is given to each predicted intent for each possible domain; information may be sent to domain component 150.  The information may include the predicted intent of the user, one or more n-grams, and any contextual information; domain component 150 is associated with one or more domain applications including a first domain application 160, a second domain application 170, and a third domain application 180.  (Column 8, Line 63 to Column 9, Line 6: Figure 1)  Here, “first NLU output data” is a predicted intent of a domain, where the predicted intent is based on a score for that intent in the domain, and “second NLU output data” is a predicted intent of a different domain.  Determining a predicted intent with the highest score is “selecting the intent based at least partly on the analysis of the first NLU output data with respect to the second NLU output data.”  
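As an illustration of the construction above, selecting the predicted intent with the highest confidence score across domains might be sketched as follows. This is a sketch under stated assumptions only: the type Hypothesis, the function select_intent, the domain and intent names, and the score values are hypothetical and are not drawn from Sarikaya et al., which is relied upon only for confidence scores between 0 and 1 assigned to predicted intents per domain.

```python
from typing import NamedTuple


class Hypothesis(NamedTuple):
    """One predicted intent for one possible domain (hypothetical structure)."""
    domain: str
    intent: str
    confidence: float  # assumed to lie between 0 and 1


def select_intent(hypotheses):
    """Return the hypothesis with the highest confidence score."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    return max(hypotheses, key=lambda h: h.confidence)


# "First NLU output data" and "second NLU output data" for two different domains.
hypotheses = [
    Hypothesis("calendar", "create_meeting", 0.82),
    Hypothesis("music", "play_song", 0.35),
]

selected = select_intent(hypotheses)
print(selected.domain, selected.intent)
```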
Concerning claims 4 and 14, Sarikaya et al. discloses that domain set predictor 135 receives a set of possible domains identified by natural language analysis component 130, and adds, modifies, or changes the possible domains based on other contextual information, which may include information previously received, turn-based information, and display information may be used to refine the set of possible domains (column 6, lines 40 to 51: Figure 1); contextual information may include information 
Concerning claims 5, 9, and 15, Sarikaya et al. discloses that contextual information may include information extracted from each turn, a response to a previous turn of how the system responded to the previous request from a user, e.g., what the dynamic system provided to the user, GPS information, e.g., a location of the client computing device 104, a current time, e.g., morning, night, time zone (column 10, lines 29 to 50: Figure 1); contextual information may include a contact list on the client computing device, GPS information, and current time (column 12, lines 45 to 51: Figure 1); broadly, “an internal state of the computing device” may represent information extracted from a previous turn or information about a current time; similarly, GPS information or time information may be construed as “a background process of the computing device” (“wherein receiving the contextual data comprises receiving data representing at least one of: a background process of the computing device, an internal state of the computing device, or a capability of the computing device”).
Sarikaya et al. discloses that a set of possible domains is identified and a confidence score is given to each predicted intent for each possible domain; information may be sent to domain component 150; the information may include the predicted intent of the user, one or more n-grams, and any contextual information; domain component 150 is associated with one or more domain applications including a first domain application 160, a second domain application 170, and a third domain application 180 (column 8, line 63 to column 9, line 6: Figure 1); here, “second NLU output data” is a predicted intent of a domain.  Han et al. teaches that a word may be a name of a place on a map or a name of an object that is displayed (¶[0050]); word IDs can be represented by names of the words themselves, e.g., words belonging to current screens are ‘National Folk Museum’ corresponding to number 25 and ‘Kogsuji’ corresponding to number 26.  (¶[0135]: Figure 11A to 11B)  Han et al., then, teaches that displayed objects are “a named entity”, i.e., a name of a place or object on a map, for a name that is spoken for speech recognition.  Implicitly, text of a name corresponding to a spoken name represents a “label of the portion of the utterance is a named entity.”   
Concerning claims 11 and 21, Sarikaya et al. discloses that information used by a domain application to perform a function is known as a slot; domain component 150 may identify information of multiple slots for various domain applications including any of a first domain application 160, a second domain application 170, and a third domain application 180; slot prediction is performed using an incomplete natural language expression including n-gram analysis (column 9, lines 30 to 52: Figure 1); domain component 150 may infer that a character tri-gram is representative of a user’s intent to .

Claims 6 to 8 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Sarikaya et al. (U.S. Patent No. 9,767,091) in view of Han et al. (U.S. Patent Publication 2011/0029301) as applied to claims 2 and 12 above, and further in view of Hakkani-Tur et al. (U.S. Patent No. 10,181,322).
Concerning claims 6 and 16, Sarikaya et al. does not disclose that the NLU input data comprises “generating the second input data as vector data representing a vector comprising a first vector comprising a first vector element representing the first domain and a second vector element representing a second domain of the plurality of domains.”  Here, “the second input data . . . comprising a first vector element and a second vector element” is being construed as a first score for a first domain and a second score for a second domain.  
Hakkani-Tur et al. teaches a multi-domain dialog system that uses multi-human conversational context to improve domain detection to reduce a domain detection error rate, and to enable better interactions with users when turns are not recognized or are ambiguous.  (Abstract; Column 2, Lines 1 to 40)  Language understanding module 204 disassembles and parses text.  Domain detection estimates a domain c’ for a given conversational input u, and is defined as c’ = argmax_{c ∈ C} P(c|u), where C is a set of all domains.  (Column 4, Line 66 to Column 5, Line 17: Equation (1))  Feature extraction component extracts lexical and contextual features from conversational input and inputs occurring prior to a current conversational input for use in domain detection.  Lexical features extracted include word n-grams 230.  Contextual features extracted include top context (C_TOP) 232 and average context (C_AVG) 234.  The top context and the average context include a label 236 and a feature vector 238 derived from the collection.  The feature vector for the top context is the topic distribution score of the conversational input and the feature vector for the average context is the average of the topic distribution scores for all context conversational inputs within the collection.  The domain classification component uses the extracted features to select the domain of the conversational input, and the domain is the one determined from the lexical or contextual feature with the highest topic distribution score.  (Column 5, Line 33 to Column 6, Line 31: Figure 2)  For each conversational input ui, the output of the prior knowledge operation is a vector of scores <scorei1, scorei2, . . . , scoreiT>, where pik is the probability of a domain k for each conversational input ui.  The prior domain distribution is used as topic distribution in context and as classification features.  (Column 8, Lines 1 to 36)  Hakkani-Tur et al., then, teaches a vector of scores in which “a first vector element” is scorei1 for a first domain and “a second vector element” is scorei2 for a second domain.  It would have been obvious to one having ordinary skill in the art to represent scores for domains as vector data as taught by Hakkani-Tur et al. in natural language understanding of multiple domains in Sarikaya et al. for a purpose of improving domain detection and reducing a domain detection error rate.
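For illustration, a vector of per-domain scores and the selection of the highest-scoring domain, in the manner of the argmax definition discussed above, might be sketched as follows. This is a sketch under stated assumptions: the domain names, the score values, and the function name detect_domain are hypothetical and are not taken from Hakkani-Tur et al.

```python
def detect_domain(score_vector, domains):
    """Return the domain whose score is highest (argmax over the set of domains).

    score_vector[k] is the assumed score of domains[k] for one conversational input.
    """
    if len(score_vector) != len(domains):
        raise ValueError("one score is required per domain")
    best_index = max(range(len(domains)), key=lambda k: score_vector[k])
    return domains[best_index]


# Hypothetical set of T = 3 domains and one score vector for a single input.
DOMAINS = ["maps", "music", "calendar"]
score_vector = [0.2, 0.7, 0.1]  # first element: first domain, second element: second domain

detected = detect_domain(score_vector, DOMAINS)
print(detected)
```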

Concerning claim 7, Hakkani-Tur et al. teaches that conversational inputs that are categorized as dialog commands or interjections do not contain any topic information and are skipped for purposes of determining context; moreover, when the domain confidence is lower than a threshold, conversational input ui is considered ambiguous and discarded.  (Column 8, Lines 15 to 31)  Here, <scorei1, scorei2, . . . , scoreiT> is a vector of scores for a set of T domains for each conversational input ui.  Generally, scorei1 has some value “representing an association of the utterance with the first domain” and scorei2 has some value “representing an association of the utterance with the second domain”.  However, if conversational input does not include any topic information associated with a domain, or if a domain confidence is lower than a threshold, then a score for conversational input can be set to skipped or set to zero, i.e., “generating second value data for the second element, the second value data representing lack of an association of the utterance with the second domain.”
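The construction above, in which a score below a threshold is treated as representing a lack of association, might be sketched as follows. This is an illustrative sketch only: the threshold of 0.3, the score values, and the function name mask_low_confidence are assumptions, and setting a low score to zero reflects the construction stated above rather than a quoted implementation of Hakkani-Tur et al.

```python
def mask_low_confidence(score_vector, threshold=0.3):
    """Replace any domain score below the threshold with 0.0.

    A value of 0.0 is used here to represent lack of an association between
    the utterance and the corresponding domain (an assumed convention).
    """
    return [score if score >= threshold else 0.0 for score in score_vector]


# Hypothetical scores for a first domain and a second domain.
masked = mask_low_confidence([0.6, 0.1])
print(masked)
```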
Hakkani-Tur et al. teaches that for each conversational input ui, the output of the prior knowledge operation is a vector of scores <scorei1, scorei2, . . . , scoreiT>, where pik is the probability of a domain k for each conversational input ui.  The prior domain distribution is used as topic distribution in context and as classification features.  (Column 8, Lines 1 to 36)  Here, <scorei1, scorei2, . . . , scoreiT> is a vector of scores for a set of T domains for each conversational input ui.  Generally, scorei1 has some value “representing an association of the utterance with the first domain” and scorei2 has some value “representing an association of the utterance with the second domain”.  That is, scorei1 and scorei2 represent values that an utterance in a conversational input is associated with a first domain or a second domain.

Claims 22 to 25 are rejected under 35 U.S.C. 103 as being unpatentable over Sarikaya et al. (U.S. Patent No. 9,767,091) in view of Han et al. (U.S. Patent Publication 2011/0029301) as applied to claim 12 above, and further in view of Cao et al. (U.S. Patent Publication 2018/0032897).
Concerning claim 22, Han et al. teaches “wherein the third input data represents . . . the plurality of elements, wherein the first element comprises a first identifier of the first content item, and wherein the second element comprises a second identifier of the second content item.”  Here, Han et al. teaches that each word on a screen may be a name of a place or a name of an object, where each object has a weight.  (¶[0050]: Figure 2)  Additionally, word information may include word IDs and word coordinates.  (¶[0065]: Figure 2)  Word IDs may be represented by the names of the words themselves, or word IDs may be represented by numbers and coordinates.  National Han et al., then, teaches “a first identifier of the first content item”, i.e., number 24 or coordinates (3.35, 5.75) and “a second identifier of the second content item”, i.e., number 25 or coordinates (3.46, 5.62).  The only limitation omitted by Han et al. is that the third input data “represents a vector”.  Still, a vector representation is commonly used whenever a plurality of components are grouped together, and x-y coordinates could conceivably be construed as a “vector”.  Moreover, Han et al. teaches representing a name of an object by words, and it is known in the art of machine learning to represent words as vectors.
Concerning claim 22, even if “a vector comprising the plurality of elements” is omitted by Han et al., this is taught by Cao et al.  Generally, Cao et al. teaches an embedding representation based on word clustering for a machine learning algorithm.  (Abstract)  An embedding representation is a low dimensional and real-valued vector.  (¶[0015])  A word is represented by a vector, which is called a word embedding.  (¶[0017]: Figure 1)  Here, Han et al. teaches a plurality of words corresponding to displayed names of places or names of objects on a map, and Cao et al. teaches that words may be represented by vectors for clustering in machine learning.  It would have been obvious to one having ordinary skill in the art to represent words corresponding to displayed names of places or names of objects of Han et al. as vectors as taught by Cao et al. for a purpose of clustering words in a machine learning algorithm.
Concerning claim 23, Han et al. teaches that displayed names of places or objects are identified by word IDs, and are ‘ordered’ in a table according to an x-Cao et al. teaches that a word is represented by a vector, which is called a word embedding.  (¶[0017]: Figure 1)  If words corresponding to word IDs are arranged in an order corresponding to how they are displayed as taught by Han et al., and these words are represented by a vector as taught by Cao et al., then “an order in which the plurality of elements are arranged corresponds to an order in which the plurality of content items were displayed by the computing device when the utterance occurred.”  Generally, ordering the elements of a vector according to the order the corresponding words were displayed could be considered by one skilled in the art to be an obvious ordering structure for Han et al. 
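For illustration, arranging elements in an order that follows display position might be sketched as follows. This is a sketch under stated assumptions: the tuple layout and the function name order_by_display are hypothetical, and sorting by x-coordinate is used only as one example of a display order; the word IDs 24 and 25 and their coordinates are those noted in the discussion of claim 22.

```python
def order_by_display(items):
    """Return word IDs sorted by the x-coordinate at which each word is displayed.

    items: list of (word_id, (x, y)) tuples (assumed layout).
    """
    return [word_id for word_id, (x, _y) in sorted(items, key=lambda item: item[1][0])]


# Word IDs and coordinates as noted for claim 22, given here in arbitrary order.
items = [(25, (3.46, 5.62)), (24, (3.35, 5.75))]

ordered_ids = order_by_display(items)
print(ordered_ids)
```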
Concerning claim 24, Cao et al. teaches predicting a label for data using clusters of words, where a document may have a title, and clustering is based on word embeddings of words in the title.  (¶[0003])  Responsive to determining that a document has a title, clusters are ranked based on cosine similarity of word embeddings of words in the title.  (¶[0004])  A title is represented by summing word embeddings of the words in the title.  (¶[0026])  If a document has a title, clusters are ranked or ordered based on cosine similarity of words in the title.  (¶[0038])  Cao et al., then, teaches “the first identifier comprises a first word embedding representing a title of the first content item” and “the second identifier comprises a second word embedding representing a title of the second content item.”  That is, Han et al. teaches that content items can be words Cao et al. teaches representing words in a title by word embeddings for machine learning.
Concerning claim 25, Cao et al. teaches embeddings of representations of titles, where a title is represented by summing words’ embeddings of the words in the title for machine learning.  (¶[0026])  Cao et al., then, teaches generating “the first identifier as a composite of a plurality of word embeddings, wherein a title of the first content item comprises a plurality of words, and wherein individual word embeddings of the plurality of word embeddings correspond to individual words of the plurality of words.”  Han et al. teaches that an object being displayed may be identified by a plurality of words, e.g., Daelim Art Museum or National Folk Museum.  (Figure 11A)  Broadly, Han et al.’s names of places or buildings are analogous to ‘a title’ of an object being displayed. 
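For illustration, representing a multi-word title as a composite of individual word embeddings by summation might be sketched as follows. This is a sketch under stated assumptions: the three-dimensional embedding values in EMBEDDINGS and the function name title_embedding are hypothetical and are not taken from Cao et al.; only the summing of word embeddings of the words in a title, and the example name National Folk Museum, come from the discussion above.

```python
# Hypothetical 3-dimensional word embeddings; real embeddings would be learned.
EMBEDDINGS = {
    "national": [1.0, 0.0, 0.5],
    "folk": [0.0, 1.0, 0.5],
    "museum": [0.5, 0.5, 1.0],
}


def title_embedding(title, table):
    """Sum the embeddings of the individual words of a title, element by element."""
    dimension = len(next(iter(table.values())))
    total = [0.0] * dimension
    for word in title.lower().split():
        vector = table[word]  # raises KeyError for a word with no embedding
        total = [t + v for t, v in zip(total, vector)]
    return total


composite = title_embedding("National Folk Museum", EMBEDDINGS)
print(composite)
```

Each word of the title contributes one embedding, so the composite corresponds to the individual words of the title.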

Response to Arguments
Applicants’ arguments filed 09 February 2021 have been considered but are moot in view of new grounds of rejection as necessitated by amendment.
Applicants amend independent claims 2 and 12, cancel claims 17 to 20, and add new claims 22 to 25.  Applicants present some brief arguments directed against the prior rejection of independent claims 2 and 12 as being obvious under 35 U.S.C. §103 over Sarikaya et al. (U.S. Patent No. 9,767,091) in view of Mathias et al. (U.S. Patent Publication 2015/0302002).  Generally, Applicants merely state that they disagree with the prior rejection, but have amended these independent claims to expedite prosecution, and do not provide any additional arguments besides simply quoting the language of the independent claims as amended, then requesting allowance.

New claim objections are noted in this Office Action.  
Applicants have amended the independent claims to set forth new limitations directed to “third input data comprising a plurality of elements, wherein a first element of the plurality of elements represents a first content item of the plurality of content items, and wherein a second element of the plurality of elements represents a second content item of the plurality of content items”, and delete limitations directed to “wherein the first element corresponds to the first domain and the second element corresponds to a second domain of the plurality of domains”.  Applicants cite support for these new limitations as being given by ¶[0141] - ¶[0143] of the Specification.  Applicants’ amendments, then, narrow these independent claims in some respects but broaden them in some respects, too.  These amendments necessitate new grounds of rejection.
Specifically, independent claims 2 and 12 are now rejected as being obvious under 35 U.S.C. §103 over Sarikaya et al. (U.S. Patent No. 9,767,091) in view of Han et al. (U.S. Patent Publication 2011/0029301).  The rejection no longer relies upon Mathias et al., but Han et al. is being substituted for that reference to address the new limitations.  Additionally, Cao et al. (U.S. Patent Publication 2018/0032897) is cited to render new claims 22 to 25 as being obvious under 35 U.S.C. §103.  The rejection of claims 6 to 8 and 16 continues to rely upon Hakkani-Tur et al. (U.S. Patent No. 10,181,322).
Han et al. teaches whatever limitations that might be omitted by Sarikaya et al. as directed to “third input data comprising a plurality of elements, wherein a first element of the plurality of elements represents a first content item of the plurality of content items, and wherein a second element of the plurality of elements represents a second content item of the plurality of content items”.  Even if Sarikaya et al. only briefly mentions that contextual information can represent display information of items located on a display device to predict natural language understanding, this use of display information improving recognition of speech is taught in more detail by Han et al.  Mainly, Han et al. teaches an embodiment of recognizing speech according to names of places and objects on a map so that words are weighted in a language model according to their being displayed.  Han et al. teaches various embodiments within this overall idea, where words can be weighted according to their being displayed during a recent time period, so that if a user zooms in or out so that the words are no longer visible on the map, then those words will still have a weighting, but words currently displayed on the map can have a higher weighting.  These weightings, then, can be construed to correspond to contextual information of the display information for a plurality of items in Sarikaya et al.  
However, a significant point to address Applicants’ new claim limitations directed to “third input data” is that a plurality of content items are clearly illustrated by Han et al., e.g., names of a plurality of places and buildings for Gyeongbokgung, Royal Museum, National Folk Museum, and Kogsuji in Figures 7A to 7B and 11A to 11B.  Broadly, “a first element” of “a first content item” and “a second element” of “a second content item” can be construed as a first weight of a first place name and a second weight of a Han et al.  (Alternatively, “a first element” and “a second element” could be broadly construed to correspond to word IDs or coordinates in Figure 11B of Han et al., too; here, “a first element” and “a second element” are simply used by the claim language in some unspecified way during processing to represent “a first content item” and “a second content item”.)  The point is that a plurality of content items that are currently being displayed are represented by first and second elements of weightings or word IDs for first and second content items as taught by Han et al., and this can be applied to similar contextual information that can be display information of items located on a screen to improve speech recognition in Sarikaya et al.   
Additionally, Cao et al. teaches details of new dependent claims 22 to 25 as directed to representing words as vectors as a common tool of machine learning.  (Claim 22)  Han et al. provides a teaching that elements corresponding to word IDs are ordered in a table according to an x-coordinate as illustrated in Figures 11A to 11B, and words are represented by vectors in Cao et al., so that it would be obvious to arrange elements in a vector corresponding to an order in which they are displayed when representing words in a table of Figures 11A to 11B.  (Claim 23)  Moreover, Cao et al. teaches word embeddings for titles, and that an embedding of a title that comprises a plurality of words is “a composite of a plurality of word embeddings”, so that an embedding of a title corresponds to word embeddings of a plurality of words of the title.  (Claims 24 to 25)    
All of these new grounds of rejection are necessitated by amendment.  Accordingly, this rejection is properly FINAL.


Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Biadsy et al. and Goussard et al. disclosed related prior art.
Applicants’ amendment necessitated the new grounds of rejection presented in this Office Action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP §706.07(a).  Applicants are reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached on Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571) 272-5551.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MARTIN LERNER/Primary Examiner
Art Unit 2657
February 18, 2021