DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
	Applicant filed an RCE and an IDS on 6/23/2022. (The Corrected Notice of Allowability mailed 6/27/2022 was posted for processing on 6/22/2022.)
EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with John Kacvinsky on 5/11/2022.

The application has been amended as follows: 

Make the following amendment to the claims filed 4/27/2022:

Amend “the natural language understanding model configured to operate on byte-level embeddings” in lines 9-10 of claim 15 to recite --the natural language understanding model, the natural language understanding model configured to operate on byte-level embeddings--

Allowable Subject Matter
Claims 1-20 are allowed.
The following is an examiner’s statement of reasons for allowance:

As per Claim(s) 1 (and similarly claim[s] 8 and 15, and consequently claim[s] 2-7, 9-14, and 16-20 which depend on claim[s] 1, 8, and 15), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 1, including (i.e. in combination with the remaining limitations in claim[s] 1) A method comprising: receiving an input at a first end-user device, the input comprising natural language from a communication associated with a second end-user device and transmitted over a communications service; converting the input into a byte-level embedding; providing the byte-level embedding to a natural language understanding model located on the first end-user device, the natural language understanding model configured to operate on byte-level embeddings; generating an output from the natural language understanding model; selecting a recommendation based on the output; presenting the recommendation on an interface of the first end-user device; receiving a selection of the recommendation; and transmitting a message incorporating the recommendation to the second end-user device.
	The prior art suggests:
Displaying suggested response message content and sending a response message with the suggested content in response to user commands.
2016/0308794 teaches displaying a recommended reply message based on analyzing a received message to determine a sending intention (paragraphs 65-67).
2014/0184544 teaches “A corresponding recommended answer 510 may be displayed on a bottom of a chat window. In doing so, if a user selects `x`, the recommended answer 510 may disappear from the chat window. If a region except `x` is selected from the recommended answer, the recommended answer can be transmitted as a message. In this situation, referring to FIG. 26 (2), if a second message 307 requiring a user's answer is received, the controller 180 searches for a recommended answer 520 to the second messages and can then display the recommended answer 520 on the chat window. In doing so, if the user selects the first recommended answer 510, referring to FIG. 26 (3), a message 390 corresponding to the recommended answer can be displayed between the first message 306 and the second message 307” (paragraph 277).  This reference describes presenting a recommendation on a display, receiving a selection of a recommendation, and transmitting a message incorporating the recommendation.  This reference does not appear to describe determining recommendations based on an output generated by a natural language understanding model, and does not appear to describe providing a byte-level embedding of natural language to a natural language understanding model.
2014/0237057 teaches “According to an embodiment, for messages that do warrant a response, the agent may be able to reply directly from his/her desktop 104. These replies may be public (anyone can see them) or private (only the sender and recipient can see them). In one example, the agent may be able to select a "pre-canned" response from the standard response library and edit it as may be necessary before submitting it to the customer. The interaction workspace may also be configured to display a suggested response to the agent based on analysis of the message content, according to an embodiment of the invention. In such a case, the agent may be able to edit the suggested response before replying to the customer” (paragraph 149).  Paragraph 151 also describes a character limit (i.e. 140 characters for Twitter).  This reference describes displaying a suggested response based on analysis of message content but does not appear to describe where analysis of message content includes providing a byte-level embedding of the message content to an NLU model and generating a recommendation based on an output of the NLU model.
	2005/0228790 teaches “The foregoing methods and computer program products may also involve various modifications. For example, executing one of the first and second modules may involve displaying suggested response message content on a display device, the suggested content being included in the linked response information. The suggested content may include at least one document or at least one response template. The method may further involve receiving user commands to send a response message with the suggested content. In such case, a message routing instruction may route the received electronic message to a user's incoming electronic message account, the message routing instruction being included in the linked response information” (paragraph 13).
2010/0070908 teaches “In general, in one aspect, the invention relates to a method for accepting or rejecting suggested text corrections, comprising receiving a first input from a keyboard associated with a computing device, wherein the first input corresponds to a composition of a message in a text input area on the computing device, displaying a first suggested text correction, in response to the first input, during the composition of the message, receiving a first keypress on the keyboard, wherein the first keypress corresponds to a rejection of the suggested text correction, when a rejection of the suggested text correction is desired, receiving a second keypress on the keyboard, wherein the first keypress corresponds to an acceptance of the suggested text correction, when an acceptance of the suggested text correction is desired, and automatically inserting a space in the message when one selected from a group consisting of the first keypress and the second keypress is received” (paragraph 3).
Embeddings used in natural language understanding.
2018/0336183 teaches “Word embedding is used for semantic parsing to extract meaning from text to enable natural language understanding. That is, for a natural language processing model to be able to predict the meaning of text, the model needs to be aware of the contextual similarity of words or sentences. For example, through contextual similarity, it may be determined that plat words, such as flower, tree, etc., are found in sentences that reference concepts such as “grown,” “eaten,” “cut,” and “picked”, whereas such concepts are not generally associated with words such as “airplane.” The vector representations created by word embedding techniques preserve these similarities such that words that regularly occur nearby in text will also be in close proximity in vector space” (paragraph 19).  This reference suggests where natural language understanding is based on word embeddings.
2019/0130285 teaches “The utterances may be mapped to respective embedding vectors using a natural language understanding (NLU) or natural language processing (NLP) machine learning model, in effect capturing at least some characteristics of the verbal feedback” (paragraph 79).  This reference appears to describe where NLU is used to generate vectors, not where vectors are processed by NLU.
Embeddings using bits/bytes.
2021/0099307 teaches “In one embodiment, the embedding can be performed on a bit level in addition or as an alternate to the digit level described in the embodiments above. For example, the speed 42 km/h is encoded in binary as “00110100 00110010,” which has a least significant bit of “0”. If the corresponding bit to embed is “1”, the calculation module 305 modulates the bit to embed “1” into the least significant bit “0” of the speed data. The embedding/extracting module 307 replaces the least significant bit “0” of the speed data with the at least one bit “1”, such that the speed data is converted into 43 m/h which in binary as “00110100 00110011.”” (paragraph 56).  This reference describes where speed can be encoded on a bit level.
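The bit-level arithmetic in the quoted passage can be checked with a short sketch. It assumes, as the quoted bit strings imply, that the speed value is stored as ASCII digit characters; the function names `to_bits` and `embed_lsb` are hypothetical and do not come from the reference.

```python
def to_bits(text: str) -> str:
    """Return the ASCII bytes of `text` as space-separated 8-bit strings."""
    return " ".join(format(b, "08b") for b in text.encode("ascii"))


def embed_lsb(text: str, bit: int) -> str:
    """Overwrite the least significant bit of the last byte with `bit`."""
    data = bytearray(text.encode("ascii"))
    data[-1] = (data[-1] & 0b11111110) | (bit & 1)
    return data.decode("ascii")


original = "42"  # assumed ASCII storage of the speed digits
modified = embed_lsb(original, 1)
print(to_bits(original))  # 00110100 00110010
print(to_bits(modified))  # 00110100 00110011
print(modified)           # 43
```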
2021/0150606 teaches “Referring now to FIG. 4, operation of the encoder module 214 is now described. The encoder module 214 receives the mapping 314 output by the name identifier module 212 and, for each product name included in the mapping 314, employs word embedding to encode a product name into a respective encoded product name. Each encoded product name output by the encoder module 214 is of a predefined length, such as 128 bytes. The encoder module 214 outputs a mapping 402 that maps the product names in the mappings 314 to corresponding encoded product names generated by way of word embedding. Thus, the mapping 402 includes four encoded product names that respectively correspond to the four product names from the mapping 314” (paragraph 49).  This reference describes where an encoded product name which is encoded by word embedding has a predefined length, such as 128 bytes, which suggests where word embeddings can be made of bytes.
10015194 teaches “Shellcode in an infected or malicious file is typically encoded or embedded in byte level data—a basic data unit of information for the file” (col. 1, lines 22-34).
2020/0201697 teaches “FIG. 9 shows exemplary mapping of embedded memory to parameters of AI networks at a bit level” (paragraph 31; Figure 9).
2021/0049443 teaches “In some variations, the method may further include: preprocessing the text prior to applying the dense convolutional neural network, the preprocessing includes tokenizing the text to form a plurality of tokens, the text being tokenized by at least applying a byte-pair-encoding such that each of the plurality of tokens correspond to a partial word or a full word from the text; and embedding each of the plurality of tokens by at least transforming each of the plurality of tokens to form a corresponding vector representation” (paragraph 12).  The “byte-pair-encoding” appears to refer to tokenizing and not generating the vector representations of tokens.
11151106 teaches “By way of further definition, an “embedding” or “embedding value” corresponds to, or is descriptive of, some particular aspect of an item of content. Typically, though not exclusively, embedding information (a set of embedding values of an item of content) is determined as a result of convolutions of a deep neural network. Typically, embedding information for an item of content is output by a deep neural network in the form of an “embedding vector.” Due to the inclusion of plural aspects of an item, an embedding vector is viewed as being multi-dimensional with each aspect corresponding to a particular dimension. In this regard, embedding vectors may include any number of aspects, each corresponding to a dimension. In various embodiments, each aspect of an embedding vector is represented by a 16-bit word or a 32-bit float, which results in embedding vectors being large in size, e.g., 512 bytes long for example” (col. 2, lines 24-40).  This reference describes where the aspects of an embedding vector are represented by bits and where an embedding vector has a size measured in bytes, which suggests where embedding vectors are “byte-level” embeddings (i.e. vectors defined by “bytes” of data).
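The size arithmetic behind the 512-byte figure can be sketched as follows. The reference gives only the per-aspect widths (a 16-bit word or a 32-bit float) and the example total of 512 bytes; the aspect counts of 256 and 128 below are assumptions chosen so that the totals match, and `vector_size_bytes` is a hypothetical helper.

```python
import struct


def vector_size_bytes(num_aspects: int, fmt: str) -> int:
    """Size in bytes of a packed vector of `num_aspects` values of format `fmt`."""
    # "<" selects standard sizes with no padding between values.
    return struct.calcsize(f"<{num_aspects}{fmt}")


# "H" is a 16-bit unsigned word and "f" is a 32-bit float (struct format codes).
print(vector_size_bytes(256, "H"))  # 512 (assumed 256 aspects x 2 bytes)
print(vector_size_bytes(128, "f"))  # 512 (assumed 128 aspects x 4 bytes)
```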
2019/0065486 teaches “The compressed word embeddings can be binary numbers that may be stored compactly as bits instead of bytes (float number) in electronic devices that have limited amounts of storage. Additionally, an electronic device may operate a NLPS independent of any other computing devices when the compressed word embeddings are stored in the electronic device” (paragraphs 21-22) which teaches, but also appears to teach away from, word embeddings that are made of bytes.
2021/0182612 teaches “FIG. 4 is an application example diagram of an embodiment of domain name conversion according to the present invention. As shown in FIG. 4, in the embodiment of the present invention, a DGA generated domain name zzzzanerraticallyqozaw.com is taken as an example. First, the domain name string is converted into an image byte matrix of [224×224×3] by word embedding. Since the maximum length of the domain name string usually does not exceed 25, we can further reduce the size of the image byte matrix of [224×224×3] to [25×25×3], and finally it is input into a AlexNet deep learning model pre-trained based on an ImageNet data set to generate a domain name characteristic. Thus, the size of the converted image byte matrix is reduced to a predetermined size, which can significantly reduce the memory space occupation” (paragraph 108) and “In the embodiment of the present invention, by normalizing the multi-dimensional image byte matrix after the word embedding conversion, the vector representation of the domain name data is more standard and standardized, and the classification accuracy of the domain name is further improved” (paragraph 107).

Upon further search (in response to the amendment filed 4/27/2022):
20170161279 teaches “A method and apparatus are provided for recommending concepts from a first concept set in response to user selection of a first concept Ci by performing a natural language processing (NLP) analysis comparison of vector representations of user concepts contained in written content authored by the user and candidate concepts in a first concept set to determine a similarity measure for each candidate concept, and to select therefrom one or more of the candidate concepts for display as recommended concepts which are related to the user concepts contained in written content authored by the user based on the similarity measure between each candidate concept and each user concept” (Abstract).  Paragraph 51 describes “a vector processing application 14 may be configured to provide immediate hints identifying concepts of potential interest to the user by analyzing user's written or viewed content to potentially enrich their discourse by pointing user to one or more data sources that illustrate additional information related to the user's written/viewed content. In an example embodiment where a user's word processor (or slide editor) is being used to write or view content as pages are scrolled up and down, the vector processing application 14 may be hooked up to the word processor/slide editor to generate content recommendations which are dynamically adjusted as the visible content in the word processor/slide editor changes. In other embodiments, additional data sources may be generated from user-explored concepts (e.g., Wikipedia concepts or more generally, the concepts in a Knowledge Graph which connects concepts by edges of one or more types) when the user selects a concept Ci, such as by placing a mouse over the concept Ci. 
In response, the vector processing application 14 may process the extracted concept vectors 13A to identify and display the top U concepts whose vectors having a high cosine distance to a vector constructed from Vi and vectors of concepts occurring in close vicinity to the concept Ci in the Wikipedia page (e.g., 3 preceding and 3 following it), where U and the vicinity parameters may be programmable. The constructed vector can be such that the weight of Ci is higher than that of its neighbors and the average is a weighted average. Based on the computation results, the vector processing application 14 may be configured to automatically display the top U concepts to the user when the cursor passes over the concept Ci. In other embodiments, the cosine distance metric values may be used to control the subject matter proximity of the content recommendations to range from an “exploratory” domain (where the user is provided with a fairly diverse set of concepts and passages that are similar, but not too similar, to the concepts and passages in the user's written content) to an “exploitative” domain (where the user is provided with content concepts and passages that are more specific and similar to the concepts and passages in the user's written content). Between these extremes on the exploratory and exploitative domains, the user may be provided with the option of controlling how far to go between the domains”.  Paragraph 77 describes vector embedding applied to concepts and discusses word2vec.  The input to the natural language processing in this reference does not appear to be a communication associated with a second end-user device and transmitted over a communications service (the user’s written/viewed content appears to be input of the same user, as opposed to written/viewed content received from somebody else, see paragraph 75).
Paragraph 78 describes where the recommendations are a list of concepts that are related to a user-selected concept based on vector similarity.  Paragraph 85 describes where related concepts are used to browse the displayed concepts (which appears to be the recommended concepts) and their links, and where a user provides suggestions for adding/removing links.  The recommendations do not appear to be incorporated into a message that is transmitted to the second user device.
2017/0161619 similarly describes what was discussed in the previous paragraph (see paragraph 90).
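The ranking step quoted from 20170161279 (a weighted average of the selected concept's vector and its neighbors, followed by selection of the top U concepts by a cosine measure) can be sketched as below. The vectors, concept names, weights, and function names are illustrative assumptions and are not taken from the reference; the sketch ranks by cosine similarity, so a higher value means a closer match.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def weighted_average(vectors, weights):
    """Component-wise weighted average of equal-length vectors."""
    total = sum(weights)
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(len(vectors[0]))]


def top_u(query, candidates, u):
    """Return the names of the `u` candidates most similar to `query`."""
    ranked = sorted(candidates,
                    key=lambda name: cosine_similarity(query, candidates[name]),
                    reverse=True)
    return ranked[:u]


# Hypothetical data: selected concept Ci (weight 2.0) and two neighbors (weight 1.0 each).
selected = [1.0, 0.0, 0.0]
neighbors = [[0.8, 0.2, 0.0], [0.6, 0.4, 0.0]]
query = weighted_average([selected] + neighbors, [2.0, 1.0, 1.0])

candidates = {
    "concept_a": [0.9, 0.1, 0.0],
    "concept_b": [0.0, 1.0, 0.0],
    "concept_c": [0.0, 0.0, 1.0],
}
print(top_u(query, candidates, u=2))  # ['concept_a', 'concept_b']
```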
2009/0306962 teaches “In the present exemplary embodiment, content verification system 100 includes a reference detection module 110, a natural language model 120, a content detection module 130, a learning module 140, and an interface prompt module 150. Reference detection module 110 includes instructions for detecting the language used in an active document 162 in an active edit window of document preparation application 160. Natural language model 120 is implemented to store associations with phrasal forms corresponding to keywords and text sequences that suggest a particular content item should be included within a document. Natural language model 120 can comprise any suitable data repository for storing, managing, and retrieving data, and which may be implemented using any suitable database technology, such as relational or object-oriented database technology. The associations maintained by natural language model 120 may be retrieved by a processor for use by reference detection module 110 in determining whether the user may have intended to include a particular content item in active document 162. Content detection module 130 includes instructions for detecting whether a referenced content item is contained within active document 162. Learning module 140 includes instructions for adapting natural language model 120 based on reference language patterns adopted by the user of the computer system. Interface prompt module 150 includes instructions for prompting a user of document preparation application 160 regarding whether a content item should be included in active document 162” (paragraph 27).  This reference appears to use a natural language model to provide a suggestion/recommendation based on something a user is typing in an active document, and not based on a byte-level embedding of an input that comprises natural language from a communication associated with a second end-user device and transmitted over a communications service.

Upon further search (in response to the RCE filed 6/23/2022):
Recommendations based on NLP models.
2020/0120163 teaches “deriving a plurality of data patterns wherein a plurality of machine learning models and natural language processing (NLP) models enable generation of data patterns using an artificial intelligence engine thereby providing recommendation to a user based on the data patterns” (paragraph 7).  This reference describes where data generated using a natural language processing model is used to provide a recommendation to a user.  This reference does not appear to describe where the NLP models operate on byte-level embeddings.
2021/0134279 teaches “While some example NLP models are disclosed, any suitable NLP model may be used to process the speech text during the conversation between the user and the agent (or other text). Also, any number of NLP models may be used to generate any number of data sets or inputs (such as features) for a product solution recommendation engine 340. In some implementations, the product solution recommendation engine 340 is configured to identify one or more product solutions for recommendation to the user based on the inputs (such as the one or more words and the collocation information from the speech text and any other suitable inputs). For example, the apparatus 300 uses the product solution recommendation engine 340 to identify multiple product solutions and generate a ranked list of the product solutions for possible recommendation to the user. If the ranked list includes multiple product solutions, the highest ranked product solution may be based on a generated probability of the user to engage with the product solution after a recommendation” (paragraph 50).  This reference describes where data generated using a natural language processing model is used to provide a recommendation to a user.  This reference does not appear to describe where the NLP models operate on byte-level embeddings.
Generating an output (particularly intent information) by a natural language model which receives vectors/embeddings as input.
2018/0137855 teaches “The intent information may be generated by a natural language processing model receiving the sentence vector” (paragraph 11).  This reference describes where an NLP model receives a sentence vector which is generated based on word vectors and character vectors, and where the NLP model generates intent information (paragraphs 5 and 11).  This reference thus suggests where an NLP model “understands” intent based on an input sentence vector (a vector can be interpreted as an “embedding” of a sentence).  Intent in this reference appears to be an intention of a command (see paragraphs 53 and 55), and this reference does not appear to specifically describe where intent information is used to select a recommendation.  Paragraph 63 describes an example of intent information which is searching for a French restaurant, which suggests where a response is at least one restaurant (which may be considered a “recommendation” of a French restaurant).
Recommendations based on intent.
2015/0178371 teaches “The disclosure is related to mining of text to derive information from the text that is useful for a variety of purposes. The text mining process can be implemented in a service oriented industry such as a call center, where a customer and an agent engage in a dialog, e.g., to discuss product/service related issues. The messages in dialogues between the customers and the agents are tagged with features that describe an aspect of the conversation. The text mining process can mine various dialogues and identify a set of features and messages based on prediction algorithms. The identified set of features and messages can be used to infer an intent of a particular customer for contacting the agent, and to generate a recommendation based on the determined intent” (Abstract).  Paragraph 37 describes where a recommendation can be for an agent or a customer.  Paragraph 5 describes where one example of using mined features is making suggestions for auto-completing a word/phrase/sentence being input by an agent (but not specifically where the suggestion is based on the intent).
2012/0303452 teaches “By parsing textual data of, for example, an SMS message, into structured information with a NER engine, it can enable automatic intelligent recommendation based on a user's intentions from the SMS message” (paragraph 71).

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249. The examiner can normally be reached M-F 12:00PM-8:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached at (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

EY 7/5/2022
/ERIC YEN/Primary Examiner, Art Unit 2658