DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The disclosure is objected to because of the following informalities:
In ¶[0011], “One embodiment . . . comprising.” is not a complete grammatical sentence.  
In ¶[0045], “The t text” should be “The text”.
In ¶[0058], “(before or a after” should be “(before or after”.
In ¶[0070], “interfaces 103” are not illustrated in Figure 1, but could be “interfaces 150”.
In ¶[0070], “an application programming interfaces (API)” should be “an application programming interface (API)”.
In ¶[0103], “based on an error analysis based on an error analysis” should be “based on an error analysis”. 
In ¶[0132], “relatedness of the of the selected segment” should be “relatedness of the selected segment”.
In ¶[0140], “annotated content 175” does not appear to be illustrated in Figure 1.  
In ¶[0143], there is a comma at the end of the paragraph, and the paragraph does not appear to be complete.
In ¶[0165], “an of the steps” should be “any of the steps”.
In ¶[0165], “may be used” appears to be redundant in the sentence beginning “The invention may be implemented . . . .”
Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 8, and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by MacAvaney et al. (U.S. Patent Publication 2021/0027141).
Regarding independent claims 1, 8, and 15, MacAvaney et al. discloses a method, system, and computer program product for classifying items from source texts, comprising:
“storing a category in a computer memory, the category comprising a collection of semantic concepts, each semantic concept comprising a collection of semantically related words, sub-words, or phrases” – a term ‘class’ refers to a particular category or classification (¶[0033]); a class can include a category of words displaying a common property or a common meaning (¶[0035]); a class may correspond to one or more labels; a term ‘label’ refers to a semantic reference to, or a name for, a class; because a class may be expressed or described in various ways, a single class may correspond to multiple labels; a named entity of ‘Dr. Seuss’ may include both the labels ‘Theodore Seuss Geisel’ and ‘Dr. Seuss’ (¶[0035]); here, a class is equivalent to “a category” and a label is equivalent to a “semantic concept”; class recognition system 106 includes a class-recognition machine-learning model 108; storage manager 1016 stores data and models for class recognition system 106 (“storing a category in a computer memory”) (¶[0148]: Figure 10); that is, classes and labels for classes are stored in memory for class-recognition machine-learning model 108;
“receiving, by a processor, a digital text to be evaluated against the category” – term sequences are classified within a source text (“a digital text to be evaluated against the category”) (Abstract); server 102 can receive a request to render a digital document comprising a source text from client device 116a (¶[0040]: Figure 1); content applications 118a-118n include instructions executed by a processor (“by a processor”) (¶[0046]: Figure 1); class recognition system 106 identifies source text 202a from within a digital document (¶[0049]: Figure 2);
“converting, by the processor, the digital text into a plurality of embedded text segments, using an embedding” – class recognition system 106 generates feature vectors based on terms from source text 202a (¶[0050]: Figure 2); class recognition system 106 applies bi-directional LSTM layers to term embeddings 402a to 402n from terms of a source text to generate feature vectors 408a to 408n; class recognition system 106 generates a term embedding for each term from source text (¶[0068] - ¶[0069]: Figure 5);
“converting, by the processor, the category into an embedded category comprising a plurality of embedded semantic concepts, said converting the category into the embedded category comprising embedding the collection of semantic concepts using the embedding” – class recognition system 106 can apply a Word2Vec model or a GloVe model to each term from label terms 502 to generate a term embedding corresponding to each term from label terms 502 (¶[0079]: Figure 5A); class recognition system 106 converts label terms from a class label into label term embeddings 512a to 512n (¶[0086]: Figure 5B);
“determining, by the processor, a relatedness score to each of the plurality of embedded text segments with respect to each of the plurality of embedded semantic concepts to determine a plurality of relatedness scores” – a class recognition system can generate similarity matrices based on terms from a source text and labels corresponding to different classes; a similarity matrix includes similarity scores between (i) feature vectors for the term of the source text and (ii) feature vectors for a label corresponding to a class, e.g., a named entity (¶[0021]); class recognition system 106 generates similarity matrices based on (i) terms from source text 202a and (ii) labels corresponding to the plurality of classes (¶[0051]: Figure 2); class recognition system 106 generates a similarity matrix 500 comprising similarity scores between (i) feature vectors for terms of a source text and (ii) feature vectors for a label corresponding to a class; cell 506a of similarity matrix 500 is a first similarity score indicating a distance or similarity between a first label-feature vector for a first label term and a first source-term-feature vector for a first source term; cell 506b of similarity matrix 500 is a second similarity score indicating a distance or similarity between a second label-feature vector for a second label and a first source-term-feature vector for a first source term (¶[0079] - ¶[0081]: Figure 5A); here, a similarity score is equivalent to a “relatedness score”; a similarity matrix, then, represents “a plurality of relatedness scores” between every combination of terms of a source text and label terms after source terms and label terms are represented as embedded feature vectors;
“determining, by the processor, that the digital text is related to the category based on the plurality of relatedness scores” – class recognition system 106 determines a class (“determining . . . that the digital text is related to the category”) from multiple classes corresponding to the source text and a term sequence from the source text reflecting the class (¶[0043]: Figure 1); class recognition system 106 can identify a class score from among consolidated-class scores for term sequences 208a that satisfies a threshold class score and corresponds to a particular class (¶[0052]: Figure 2); class recognition system 106 identifies a consolidated-class score from among consolidated-class scores 322, 324, 326 satisfying a threshold class score (¶[0066]: Figure 3); class recognition system 106 can generate and analyze similarity scores from similarity matrices to output class scores; class recognition system 106 applies a convolutional neural network 518a to similarity scores from similarity matrix 516 to generate preliminary class scores corresponding to a class, and subsequently consolidates the preliminary class scores (¶[0085]: Figure 5A);
“based on the determination that the digital text is related to the category, annotating by the processor, the digital text with the category; storing, by the processor, the annotated digital text” – a class recognition system can provide a source text and an indication of the particular term sequence corresponding to a class for display; a class recognition system provides the source text and a visual indicator identifying the term sequence in the source text as corresponding to the class for display within a graphical user interface (¶[0023]); server 102 can transmit data packets that cause client device 116a to present the source text and an indication of a term sequence corresponding to a class within a graphical user interface; server 102 can include a digital content system 104 that may render documents with visual indicators identifying term sequences in source texts and a corresponding class for the term sequences (¶[0040] - ¶[0042]: Figure 1); class recognition system 106 provides source text 202a and a visual indicator identifying the term sequence in the source text 202a as corresponding to the class for display within a graphical user interface (¶[0053] and ¶[0055]: Figure 2); here, applying a visual indicator to source text as identifying a class is equivalent to “annotating the digital text with the category”; implicitly, an indicator is stored at least temporarily when it is received from a server and displayed on a client device; Compare Specification, ¶[0105]: Figure 6.
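For illustration only, the similarity-matrix and threshold steps mapped above can be sketched as follows.  This is a minimal sketch and not the implementation of MacAvaney et al.: the function names, the toy two-dimensional embeddings, and the threshold value are hypothetical, and a simple maximum over the matrix stands in for the convolutional neural network and score consolidation that the reference applies to the similarity scores.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(label_vectors, term_vectors):
    """Rows index label terms, columns index source terms; each cell is one similarity score."""
    return [[cosine(label, term) for term in term_vectors] for label in label_vectors]


def is_related(matrix, threshold):
    """Stand-in decision rule (assumption): the best score must satisfy the threshold."""
    best = max(max(row) for row in matrix)
    return best >= threshold


# Toy embeddings, made up for this sketch.
label_vectors = [[1.0, 0.0], [0.0, 1.0]]
term_vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]

scores = similarity_matrix(label_vectors, term_vectors)
related = is_related(scores, threshold=0.9)
```

Each cell of `scores` pairs one label term with one source term, so the matrix as a whole plays the role of the “plurality of relatedness scores”.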

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over MacAvaney et al. (U.S. Patent Publication 2021/0027141) in view of Yu (U.S. Patent Publication 2018/0101773).
Concerning claims 2, 9, and 16, MacAvaney et al. discloses generating feature vectors for a label corresponding to a class.  A similarity matrix includes similarity scores between (i) feature vectors for the terms of the source text and (ii) feature vectors for a label corresponding to a class.  (¶[0021])  Term sequences include unigrams, e.g., ‘if’, and bigrams, e.g., ‘click here’.  (¶[0049]: Figure 2)  Here, n-grams that have a length greater than unigrams are equivalent to ‘phrases’.  Moreover, MacAvaney et al. discloses class recognition system 106 can apply a Word2Vec model or a GloVe model to each term from label terms 502 to generate a term embedding corresponding to each term from label terms 502.  (¶[0079]: Figure 5A).  Class recognition system 106 converts label terms from a class label into label term embeddings 512a to 512n.  (¶[0086]: Figure 5B)  MacAvaney et al., then, discloses “wherein converting the category into the embedded category comprising, for each semantic concept in the collection of semantic concepts of the category, generating a word vector for each word or sub-word and a vector [array] for each phrase in the collection of semantically related words, sub-words, or phrases of the semantic concept”.  That is, a word and a phrase of a label for a class are converted into feature vectors to provide an embedding of a class label.  However, MacAvaney et al. omits the limitations directed to “a vector array for each phrase” and “aggregating the word vectors and arrays generated for the semantic concept to generate a concept matrix for the semantic concept, wherein the embedded category comprises a plurality of concept matrices.”  That is, MacAvaney et al. omits “concept matrices” that are generated by “aggregating the word vectors and arrays”.   
Concerning claims 2, 9, and 16, Yu teaches deriving an inner product matrix based on a distance matrix for processing concepts.  (Abstract)  Specifically, Yu teaches assembling a plurality of concept vectors into a concept matrix representative of a sentence.  (¶[0013])  Concepts may include a phrase or a plurality of words of a sentence.  One embodiment provides that concept vectors may be assembled into a concept matrix representative of the sentence.  Specifically, each of the words of the sentence has an associated concept vector that may constitute one of a plurality of rows of a concept matrix.  The concept matrix may enable any desired analysis of the sentence as a whole, either by itself or relative to other concept matrices of other sentences.  (¶[0056])  Concept vectors are assembled into a concept matrix that is representative of the sentence.  (¶[0077])  Yu, then, teaches “aggregating word vectors . . . generated for the semantic concept to generate a concept matrix for the semantic concept”.  An objective is to extract vectors related to concepts in a way that is faster and more cost-effective than in the prior art.  (¶[0016])  It would have been obvious to one having ordinary skill in the art to aggregate word vectors into concept matrices as taught by Yu from embedded feature vectors representing labels of classes in MacAvaney et al. for a purpose of extracting vectors related to concepts in a faster and more cost-effective way.
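As a rough illustration of the aggregation discussed above, word vectors can be stacked as the rows of a matrix.  This is a sketch under stated assumptions rather than the method of Yu or of MacAvaney et al.: the function name `build_concept_matrix`, the dictionary-based embedding, and the three-dimensional toy vectors are all hypothetical.

```python
def build_concept_matrix(words, embedding):
    """Stack one vector per word as the rows of a concept matrix.

    `embedding` is a hypothetical word-to-vector lookup table (an assumption of this sketch).
    """
    missing = [word for word in words if word not in embedding]
    if missing:
        raise KeyError("no vector for: " + ", ".join(missing))
    return [list(embedding[word]) for word in words]


# Toy embedding, made up for this sketch.
embedding = {
    "click": [1.0, 0.0, 0.5],
    "here": [0.0, 1.0, 0.5],
}

concept_matrix = build_concept_matrix(["click", "here"], embedding)
```

Here each row of `concept_matrix` corresponds to one word, in the manner of the rows of the concept matrix described at ¶[0056] of Yu.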

Claims 3 to 4, 10 to 11, and 17 to 18 are rejected under 35 U.S.C. 103 as being unpatentable over MacAvaney et al. (U.S. Patent Publication 2021/0027141) in view of Maksak et al. (U.S. Patent Publication 2018/0089152).
MacAvaney et al. discloses the limitations directed to “embedding the digital text using the embedding to convert each word or sub-word in the digital text into a corresponding vector to generate a plurality of word vectors” and “for each of a plurality of segments selected from the digital text, converting each of a plurality of words or sub-words into a corresponding vector to generate a plurality of word vectors”.  Here, MacAvaney et al. discloses that a term sequence refers to an n-gram or a contiguous sequence of n items from a given sample of text, and that a class score may be particular to an n-gram, including a unigram class score, a bigram class score, or a trigram class score.  (¶[0032] and ¶[0034])  Class recognition system 106 generates feature vectors based on terms from source text 202a.  (¶[0050]: Figure 2)  Class recognition system 106 applies bi-directional LSTM layers to term embeddings 402a to 402n from terms of a source text to generate feature vectors 408a to 408n.  Class recognition system 106 generates a term embedding for each term from source text.  (¶[0068] - ¶[0069]: Figure 5)  That is, a term is a word (“word or sub-word”) that is converted to a word vector by an embedding.  However, MacAvaney et al. does not disclose “after embedding the digital text, aggregating the subsets of word vectors from the plurality of word vectors to generate a plurality of embedded text matrices, each of the plurality of embedded text matrices representing a different segment of the digital text” and “aggregating the plurality of word vectors into an embedded text matrix representing that segment.”  
Still, Maksak et al. teaches message text labelling that receives an input of a sequence of word vectors corresponding to a sequence of words, and outputs a probability score for each of a plurality of labels.  If at least one probability score meets a criterion, the at least one label corresponding to the at least one probability score is assigned to the message.  (Abstract)  Label determination layer 120 is configured to process the probability score, and if one of the probability scores is greater than a predetermined threshold score, label determination layer 120 is configured to assign the label to which the probability score corresponds to the message.  (¶[0030]: Figure 1)  Labelling engine 100 parses a message into a string of text and that text is tokenized into a sequence of words.  Network layer 114 then determines a word vector for each word.  Specifically, the result is a matrix listing sequentially the word vector for each word.  The use of a word table and the word representation matrix in combination is an efficient way of generating a matrix or a list of word vectors corresponding to the sequence of words in the message.  (¶[0031] - ¶[0032]: Figure 2)  Maksak et al., then, teaches that one way of representing a sequence of words in text is a matrix of word vectors.  This matrix of word vectors is obtained by “aggregating subsets of word vectors from the plurality of word vectors to generate a plurality of embedded text matrices, each of the embedded text matrices representing a different segment of the text” and “aggregating the plurality of word vectors into an embedded text matrix representing that segment.”  An objective is to provide labels for a text message that are accurate.  (¶[0004] - ¶[0005])  It would have been obvious to one having ordinary skill in the art to obtain embedded term vectors of MacAvaney et al. and aggregate them into embedded text matrices as taught by Maksak et al. for a purpose of providing more accurate labels for a text message by using an efficient way of representing a sequence of words.
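The combination of a word table and a word representation matrix described above can be sketched as an index lookup.  The sketch below is illustrative only and is not taken from Maksak et al.: the whitespace tokenizer, the `<unk>` row for out-of-vocabulary words, and the toy vectors are assumptions.

```python
def tokenize(message):
    """Lower-case whitespace tokenizer (an assumption; the reference's tokenizer is not specified here)."""
    return message.lower().split()


def embed_message(message, word_table, word_representation):
    """Return a matrix listing, in order, the word vector for each word of the message."""
    rows = []
    for word in tokenize(message):
        # Unknown words fall back to the "<unk>" row (assumption of this sketch).
        index = word_table.get(word, word_table["<unk>"])
        rows.append(list(word_representation[index]))
    return rows


# Hypothetical word table (word -> row index) and word representation matrix (row -> vector).
word_table = {"<unk>": 0, "please": 1, "click": 2, "here": 3}
word_representation = [
    [0.0, 0.0],
    [0.2, 0.1],
    [1.0, 0.0],
    [0.0, 1.0],
]

message_matrix = embed_message("Please click HERE now", word_table, word_representation)
```

The result has one row per word of the message, which is the sense in which a sequence of words is represented as a matrix of word vectors.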

Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over MacAvaney et al. (U.S. Patent Publication 2021/0027141) in view of Maksak et al. (U.S. Patent Publication 2018/0089152) as applied to claims 1, 4, 8, 11, 15, and 18 above, and further in view of Zou et al. (U.S. Patent Publication 2020/0226126).
MacAvaney et al. discloses that a term sequence refers to an n-gram or a contiguous sequence of n items from a given sample of text, and that a class score may be particular to an n-gram, including a unigram class score, a bigram class score, or a trigram class score.  (¶[0032] and ¶[0034])  Here, determining n-grams from a sample of text is similar to “wherein the plurality of segments is determined by applying a sliding window to the digital text”.  That is, an n-gram applies a window to text that has a length of n words.  MacAvaney et al. does not expressly disclose a “sliding” window, but sliding windows are well known in processing text by machine learning.  Generally, Zou et al. teaches vector-based contextual text searching that computes vectors for respective text units.  The text vectorizer computes a given vector for a given text unit by (i) computing word vectors for respective words in the text unit, (ii) computing phrase vectors for respective phrases in the text unit, and (iii) combining the word vectors and the phrase vectors to produce the given vector for the given text unit.  The text vectorizer computes corpus vectors for the respective corpus documents.  Search text is received, and based thereon, the text vectorizer computes a search vector for the search text.  (Abstract)  Word vectorizer 142 may implement a word or sentence embedding model.  Specifically, Zou et al. teaches that when a word is passed to a first neural network, sliding windows of various lengths are used to break down the word into its n-gram subwords.  Then a model tries to classify, for each word, whether it is actually in the context window of the word.  (¶[0027]: Figure 4)  An objective is to provide text searching that is efficient, able to account for spelling variations, and leverages contextualized word representations.  (¶[0004])  It would have been obvious to one having ordinary skill in the art to apply a sliding window as taught by Zou et al. to obtain n-grams from samples of text in MacAvaney et al. for a purpose of providing efficient text searching that accounts for spelling variations and uses contextualized word representations.
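By way of illustration, a sliding window of length n yields the n-grams of a sequence, whether the sequence is a list of words or the characters of a single word.  The helper name `sliding_windows` and the sample inputs are hypothetical; this is a generic sketch, not code from MacAvaney et al. or Zou et al.

```python
def sliding_windows(items, size):
    """Return every contiguous window of `size` items, advancing one position at a time."""
    if size <= 0:
        raise ValueError("window size must be positive")
    return [items[start:start + size] for start in range(len(items) - size + 1)]


# Word-level bigrams from a tokenized sample of text.
words = ["please", "click", "here", "now"]
bigrams = sliding_windows(words, 2)

# Character-level trigram subwords of a single word.
subwords = sliding_windows("where", 3)
```

A sequence shorter than the window produces no windows, so the caller decides how to handle short segments.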

Claims 6 to 7, 13 to 14, and 20 to 21 are rejected under 35 U.S.C. 103 as being unpatentable over MacAvaney et al. (U.S. Patent Publication 2021/0027141) in view of Zhelezniak et al. (“Don’t Settle for Average, Go for the Max Fuzzy Sets and Max-Pooled Word Vectors”).
Concerning claims 6, 13, and 20, MacAvaney et al. discloses ‘projecting’ embedded text vectors onto concept vectors representing a semantic concept.  That is, MacAvaney et al. generates a similarity matrix 500 comprising similarity scores using a cosine similarity between (i) feature vectors for terms of a source text obtained by an embedding (“embedded text vectors”) and (ii) feature vectors for a label corresponding to a class (“concept vectors representing a semantic concept”).  (¶[0078] - ¶[0080]: Figure 5)  Here, generating a similarity score is a ‘projection’ because two vectors are ‘projected’ onto one another to determine a cosine similarity.  However, MacAvaney et al. does not generate similarity scores using an embedded text “matrix” and a concept “matrix”, but only obtains these similarity scores by projecting vectors.  Still, Zhelezniak et al. teaches comparing a first sentence and a second sentence, where a first sentence has word embeddings of x[1], x[2], . . . , x[k], and a second sentence has word embeddings of y[1], y[2], . . . , y[k].  Then a matrix X is obtained by stacking rows x[1], x[2], . . . , x[k], and a matrix Y is obtained by stacking rows y[1], y[2], . . . , y[k].  (Algorithm 1: Dynamax Jaccard: Page 4)  Zhelezniak et al., then, teaches projecting a first embedded matrix into a second embedded matrix to determine a similarity between two sentences.  An objective is to provide a deep learning method on semantic textual similarity that is both efficient and easy to implement.  (Abstract)  It would have been obvious to one having ordinary skill in the art to generate similarity scores between embedded feature vectors of a source text and embedded feature vectors of labels for a class in MacAvaney et al. by embedding word vectors into word matrices as taught by Zhelezniak et al. for a purpose of providing a deep learning method of semantic textual similarity that is efficient and easy to implement.
Concerning claims 7, 14, and 21, MacAvaney et al. discloses that a similarity score represents a distance (“a distance metric . . . representing the degree of relatedness”), e.g., cosine similarity between a first label-feature vector for a first label term and a first source-term-feature vector for a first source term.  (¶[0080]: Figure 5)  Zhelezniak et al. teaches:
“combining the embedded text matrix with the concept matrix to determine a universe” – universe matrix is obtained by stacking matrix X and matrix Y (Algorithm 1: Dynamax Jaccard: Page 4);
“multiplying the universe matrix by the embedded text matrix to determine a first additional matrix” – x is obtained by max pooling elementwise x[1]U^T, x[2]U^T, . . . , x[k]U^T (Algorithm 1: Dynamax Jaccard: Page 4);
“multiplying the universe matrix by the concept matrix to determine a second additional matrix” – y is obtained by max pooling elementwise y[1]U^T, y[2]U^T, . . . , y[k]U^T (Algorithm 1: Dynamax Jaccard: Page 4);
“converting the first additional matrix to a first vector and the second additional matrix to a second vector” – r is obtained by min pooling elementwise x and y; q is obtained by max pooling elementwise x and y (Algorithm 1: Dynamax Jaccard: Page 4);
“determining a degree of relatedness of the embedded text matrix to the concept matrix by using a vector distance to reduce the first vector and the second vector to a vector representing the degree of relatedness” – DynaMax is a similarity measure (“determining a degree of relatedness”) of a sentence pair (1. Introduction: Page 2); DMJ (DynaMax Jaccard) is obtained from r and q (Algorithm 1: Dynamax Jaccard: Page 4).
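The steps cited from Algorithm 1 can be sketched as follows, for illustration only.  The function names and toy vectors are hypothetical, and two details are assumptions not stated above: pooled values are clipped at zero, and the final score is taken as the sum of r divided by the sum of q, in the manner of a fuzzy Jaccard index.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pooled_membership(sentence_rows, universe_rows):
    """For each universe row, max-pool its inner products with the sentence's word vectors."""
    pooled = []
    for universe_vector in universe_rows:
        best = max(dot(word, universe_vector) for word in sentence_rows)
        pooled.append(max(best, 0.0))  # clip at zero (assumption of this sketch)
    return pooled


def dynamax_jaccard(x_rows, y_rows):
    """Similarity of two sentences given as non-empty lists of word vectors."""
    if not x_rows or not y_rows:
        raise ValueError("both sentences need at least one word vector")
    universe = x_rows + y_rows                # stack X and Y into the universe matrix
    x = pooled_membership(x_rows, universe)   # max pooling over x[i] U^T
    y = pooled_membership(y_rows, universe)   # max pooling over y[i] U^T
    r = [min(a, b) for a, b in zip(x, y)]     # elementwise min pooling of x and y
    q = [max(a, b) for a, b in zip(x, y)]     # elementwise max pooling of x and y
    total = sum(q)
    return sum(r) / total if total > 0.0 else 0.0


# Toy word vectors, made up for this sketch.
same = dynamax_jaccard([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
disjoint = dynamax_jaccard([[1.0, 0.0]], [[0.0, 1.0]])
partial = dynamax_jaccard([[1.0, 0.0]], [[1.0, 1.0]])
```

Under these assumptions, identical sentences score 1.0 and sentences with orthogonal word vectors score 0.0, with partially overlapping sentences falling in between.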

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Mohan et al., Gehrking et al., Sayers et al., Srinivasan, Hoffman et al., Tu, Biswas et al., and Lu disclose related prior art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn can be reached on (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARTIN LERNER/
Primary Examiner
Art Unit 2657
November 21, 2022