DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

With regard to Claims 1 and 11,
	The limitations of “generate … word feature scores…”, “generate … scores by aggregation of … scores for the words”, “generate a … score for the document based on aggregation” as drafted, are processes that, under their broadest reasonable interpretations, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “one or more physical processors configured”, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “processor configured” language, to “generate” in the context of this claim could encompass a user manually generating a score, and aggregating scores, relating to document data received by the user. If a 
	The judicial exception is not integrated into a practical application. The processor and storage recited in the claims are recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations directed to obtaining data formatted in a particular way to denote structure/location in the document are mere insignificant extra-solution activity that does not integrate the judicial exception into a practical application. For example, these limitations do not provide any improvement in the functioning of a computer or an improvement to any other technology or technical field. Further, the judicial exception is not integrated into a practical application because the generically recited computer elements do not add a meaningful limitation to the abstract idea because they amount to simply implementing the abstract idea on a computer.
	The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a processor to perform the mental steps amounts to no more than mere instructions to apply the exception using a generic computer component, which cannot provide an inventive concept. Further, the “obtain[ing]” steps are mere insignificant 

With regard to the dependent claims,
	The dependent claims all further add additional mental steps/calculations or merely elaborate on mental steps from the independent claims. No additional elements are presented and thus the dependent claims are not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 8, 11-13 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over US Pre-Grant Publication 2005/0262050 to Fagin in view of US Pre-Grant Publication 2020/0257761 to Bull.


With regard to Independent Claim 1,
	Fagin teaches a system for generating context-sensitive feature scores for documents, the system comprising: 
	one or more physical processors configured by machine-readable instructions (Fagin: ¶0014 and claim 11 – instructions executed by computer, i.e. “processor”.) to: 
		obtain document information, the document information defining words within a document (Fagin: abstract and ¶0001 make clear that document information is considered through the scoring functionality of the system. See e.g. ¶0048, which reads in part: “The content index 235 comprises an inverted keyword index on document content. The title index 240 comprises an inverted keyword index on titles and metadata about the set of found documents. The anchortext index 245 comprises an inverted keyword index on anchortext for the set of found documents….”), key groups of words within the document (Fagin: ¶¶0046-0047 – for ranked sets of documents, “Scoring modules 205 comprise a set of indices 230 such as, for example, a content index 235, a title index 240, and an anchortext index 245. Additional indices may be used as desired. The content index 235, the title index 240, and the anchortext index 245 take as input query 220 and find a set of documents in dB 40 that match the text of input query 220. The indices (e.g., content index, title index, anchortext index) provide pointers into the set of documents in dB 40 containing the query terms, and pass them to the union module 250 and to the rank aggregation processor 215.” Examiner notes that the broadest reasonable interpretation of the term “groups of words” would read upon text with more than one word, including titles, content, indices derived therefrom, etc.), and sets of context words corresponding to individual ones of the key groups of words (Fagin: Examiner notes that the problem of context-sensitive searching is stated at ¶¶0009-0010 to be addressed by the reference’s teachings: “The present system provides the opportunity to customize and personalize search results within a single software architecture. When deploying a search engine in many different intranets, the ranking function can be tailored to represent the set of users and that structure of the organization. 
For example, a client may require a particular server to supply all answers to human resource questions. The present system incorporates this requirement incorporated into the ranking function without the need to change the underlying text index software. Moreover, a "best" answer to a query may depend on the location of a user asking the question. For example, the query "retirement" can have different answers if the query is submitted in Zurich than if it is submitted in Raleigh. The present system can incorporate the geographical location of a user without changing the underlying indexing software that returns ranked lists of results.”), wherein the sets of context words for the individual ones of the key groups of words are determined based on a hierarchical structure of the document and locations of the key groups of words within the hierarchical structure of the document (Fagin: See above citations relating to indexes, which are hierarchical structures according to ¶0041, i.e. “hierarchical structure of the document”. Examiner further notes the particular mention of title or content keyword location determinations relating to a document, would read upon the broadest reasonable interpretation of “locations of the key groups of words”.); 
		generate key group feature scores for the individual key groups of words based on aggregation of the word feature scores for the words within the individual key groups of words and the word feature scores for the words within the corresponding sets of context words (Fagin: See above citations concerning context and ¶0014 aggregation of scores.); 
		generate a document feature score for the document based on aggregation of the word feature scores for the words within the document (Fagin: ¶0023 reads in part, “...The present system also provides means for the user to specify a query in the form of text input. A user specifies the input data and the query and then invokes the modular scoring utility program to produce a scored set of documents.” See also discussion of generation of a document feature score aggregation at ¶¶0013-0015, as discussed above.); and
		store the key group feature scores and the document feature score, wherein storage of the key group feature scores and the document feature score enables context-sensitive searching of words. (Fagin: ¶0022 reads, “Dynamic orderings that depend on the particular user comprise, for example, geographical proximity scores for the user, role or job title of the user within an organization, educational level, history of previous queries by the user, etc.” Examiner notes that document ordering based upon previous queries would depend upon the association with the previous query’s document scores/ranks. Examiner further notes, as cited above, that the problem of context-sensitive searching is stated at ¶¶0009-0010 to be addressed by the reference’s teachings. An example of such context-sensitive words, e.g. “Zurich” found in an example result/candidate document: “The present system provides the opportunity to customize and personalize search results within a single software architecture. When deploying a search engine in many different intranets, the ranking function can be tailored to represent the set of users and that structure of the organization. For example, a client may require a particular server to supply all answers to human resource questions. The present system incorporates this requirement incorporated into the ranking function without the need to change the underlying text index software. Moreover, a "best" answer to a query may depend on the location of a user asking the question. For example, the query "retirement" can have different answers if the query is submitted in Zurich than if it is submitted in Raleigh. The present system can incorporate the geographical location of a user without changing the underlying indexing software that returns ranked lists of results.”)
	Fagin does not fully and explicitly teach generate word feature scores for words within individual key groups of words, for words within individual sets of context words, and for the words within the document.
	Bull teaches a system to generate word feature scores for words within individual key groups of words, for words within individual sets of context words, and for the words within a document. (Bull: abstract reads in part, “A plurality of importance scores are generated for a plurality of words included in the electronic document by processing the electronic document using a trained passage encoder. Important words are identified based on the plurality of importance scores. One or more clusters of words are generated, where each of the one or more clusters of words includes at least one of the plurality of important words. A representative word is selected for a first cluster, and the representative word is mapped to one or more concepts from a predefined list of concepts. The one or more concepts are disambiguated to identify a set of relevant concepts for the electronic document. An annotated version of the electronic document is generated based at least in part on the set of relevant concepts.”. See ¶0019 and ¶0026 discussion of word context determinations that are used to annotate the entire document.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the word, context, group score aggregations of Bull into the word feature score aggregation system of Fagin by programming the instructions of Fagin (Fagin: ¶0014) to aggregate document scores based upon word groupings and context, as taught by Bull. Both systems are directed to document and word scoring/criteria (Fagin: ¶0041; Bull: abstract) and aggregation of the scores for search/ranking (Fagin: abstract; Bull: ¶0035). Aggregating word feature scores based upon cosine distance would have been advantageous to implement in the word feature score aggregation system of Fagin. In particular, the motivation to combine the Fagin and Bull references would have been to improve searching accuracy through identification of important words in textual data. (Bull: ¶¶0002-0003)
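	For illustration only, the aggregation recited in claim 1 can be sketched as follows. The sketch is not taken from the application, Fagin, or Bull. It assumes that word feature scores are equal-length numeric vectors and that "aggregation" is an element-wise mean; the function names and the sample words and scores are hypothetical.

```python
from statistics import fmean


def aggregate(scores):
    """Element-wise mean of equal-length score vectors (an assumed aggregation)."""
    if not scores:
        raise ValueError("no scores to aggregate")
    return [fmean(component) for component in zip(*scores)]


def key_group_score(word_scores, key_group, context_words):
    """Aggregate the scores of a key group's words together with its context words."""
    words = list(key_group) + list(context_words)
    return aggregate([word_scores[w] for w in words])


def document_score(word_scores, document_words):
    """Aggregate the scores of every word occurrence in the document."""
    return aggregate([word_scores[w] for w in document_words])


# Hypothetical two-dimensional word feature scores.
word_scores = {
    "pump": [1.0, 0.0],
    "pressure": [0.0, 1.0],
    "safety": [1.0, 1.0],
}

# Key group "pump pressure" with the context word "safety" (e.g. from a title).
group = key_group_score(word_scores, ["pump", "pressure"], ["safety"])

# Document score over all word occurrences, including a repeated word.
doc = document_score(word_scores, ["pump", "pressure", "safety", "pump"])
```

	Under these assumptions the context words shift the key group score, which is the sense in which the stored scores would be context-sensitive.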

With regard to dependent claim 2, which depends upon independent claim 1,
	Fagin and Bull teach the system of claim 1, wherein the aggregation of the word feature scores for the words within the individual key groups of words and the word feature scores for the words within the corresponding sets of context words is performed based on cosine distances between individual word feature scores.  (Bull: ¶0016: “In the illustrated embodiment, once the Importance Component 110 has scored each word in the document (and optionally determined a set of Important Words 115), a Clustering Component 120 analyzes the words to generate one or more Concept Clusters 125. In some embodiments, the Clustering Component 120 only clusters words that have been classified as Important Words 115. In one embodiment, the Clustering Component 120 processes all of the words, regardless of whether they are considered "important." In one embodiment, the Clustering Component 120 generates a vector representation for each word, and clusters the words into one or more groups based on their relative locations in the embedding space. For example, in one embodiment, the Clustering Component 120 computes a similarity measure (e.g., the cosine similarity) for each pair of words, and clusters words based on this similarity. In some embodiments, the Clustering Component 120 clusters words with a similarity that exceeds a predefined threshold (e.g., 0.85)” See also ¶0035, which reads in part: “…in one embodiment, the Clustering Component 120 sums the importance scores of each word in the cluster, and uses this aggregate value as the importance score for the representative word (and therefore, for the cluster)….”)
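	For illustration only, the clustering and aggregation described in the quoted passages of Bull (¶0016, ¶0035) can be sketched as follows. Bull does not specify the clustering procedure at this level of detail; the sketch assumes transitive (single-link) grouping of any pair of words whose cosine similarity exceeds the threshold, and the word vectors and importance scores are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.85):
    """Group words whose pairwise cosine similarity exceeds the threshold.

    Linking is transitive (single-link), which is an assumption of this sketch.
    """
    parent = {w: w for w in vectors}

    def find(w):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in combinations(vectors, 2):
        if cosine_similarity(vectors[a], vectors[b]) > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for w in vectors:
        clusters.setdefault(find(w), []).append(w)
    return list(clusters.values())


def cluster_importance(cluster, importance):
    """Sum the importance scores of the words in a cluster."""
    return sum(importance[w] for w in cluster)


# Hypothetical word vectors and importance scores.
vectors = {"valve": [1.0, 0.1], "pump": [0.9, 0.2], "invoice": [0.0, 1.0]}
importance = {"valve": 0.6, "pump": 0.3, "invoice": 0.4}

clusters = cluster_words(vectors)
totals = {tuple(sorted(c)): cluster_importance(c, importance) for c in clusters}
```

	With these sample values, "valve" and "pump" fall into one cluster and "invoice" remains alone. Cosine distance, as recited in claim 2, is conventionally one minus the cosine similarity.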
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the cosine distance for word feature score aggregation of Bull into the word feature score aggregation system of Fagin by programming the instructions of Fagin (Fagin: ¶0014) to aggregate word feature scores based upon cosine distance, as taught by Bull. Both systems are 


With regard to dependent claim 3, which depends upon independent claim 1,
	Fagin and Bull teach the system of claim 1, wherein the document includes requirements, and individual key groups of words within the document correspond to individual requirements. (Fagin: ¶0017 – reads in part, “…For example, a client may require a particular server to supply all answers to human resource questions. The present system incorporates this requirement incorporated into the ranking function….” See also above citations relating to key groups of words and scoring/ranking.)

With regard to dependent claim 8, which depends upon independent claim 1,
	Fagin and Bull teach the system of claim 1, wherein the document is associated with operating system metadata, and the sets of context words corresponding to the individual ones of the key groups of words include words within at least some of the operating system metadata. (Fagin: abstract and ¶0001. See e.g. ¶0048, which reads in part: “The content index 235 comprises an inverted keyword index on document content. The title index 240 comprises an inverted keyword index on titles and metadata about the set of found documents. The anchortext index 245 comprises an inverted keyword index on anchortext for the set of found documents….” See also ¶0017 discussion of location OS metadata from the query, reading in part, “Moreover, a "best" answer to a query may depend on the location of a user asking the question. For example, the query "retirement" can have different answers if the query is submitted in Zurich than if it is submitted in Raleigh.”)

	Claims 11-13 and 18 are each similar in scope to claims 1-3 and 8 respectively and are each rejected under a similar respective rationale.

	Claims 4-7 and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Fagin in view of Bull, in further view of US Patent No. 8,538,989 to Datar.

With regard to dependent claim 4, which depends upon independent claim 1,
	Fagin and Bull teach the system of claim 1, wherein the document includes a document title and the sets of context words corresponding to the individual ones of the key groups of words include words within the document title. (Fagin: See above citations relating to indexes, which are hierarchical structures according to ¶0041, i.e. “hierarchical structure of the document”. Examiner further notes the 
	Fagin and Bull do not fully and explicitly teach sections, and section titles, words within a corresponding section, and words within a corresponding section title.
	Datar teaches a system wherein a document includes sections (Datar: col. 3, ll. 48-63 – “In some implementations, the documents in collection 104 include structured documents. In general, a structured document is a defined hierarchy of elements and structure rules. Elements can include a table of contents, an index, chapters, sections, subsections, nodes, and so on. Structure rules can be applied to one or more elements. For example, a structured document can be described as including a table of contents, a series of section headers, sections, subsections, appendices, and an index. A structure rule can dictate which order the above document portions are structured and which elements can be inserted between the existing elements, for example. Furthermore, within each of the sections and subsections, a number of additional structure rules can be defined. For example, one structure rule describes a section as being made up of a chapter title and a preamble followed by a series of subsections.”), and section titles (See above citation, particularly as directed to section headers, i.e. “section titles”.), and sets of context words corresponding to individual ones of key groups of words include words within a corresponding section, and words within a corresponding section title. (Datar: col. 7, l. 57 through col. 8, l. 9 reads, “At some point, the users 106 can submit a search query to the search engine 102 over the network 108, for example. The query can include one or more search terms, and the search engine 102 can identify and, optionally, rank documents based on document object model element weights. For example, the search engine 102 can count the number of times search terms were found in each section of the document object model 122 based on the search query. Further, the search engine 102 can analyze which sections of the DOM 122 led to more or fewer clicks on documents in the document collection 104. 
For example, some document object model sections are more likely to be clicked on than other sections. As is the case when comparing a footer section (less clicks) to a title section (more clicks). The likelihood of selecting one document object model section over another can be analyzed by the search engine 102. The analysis can be used to provide an enhanced relevance of search results to the user. For example, the search results analyzed for location in the document can provide search results with semantic relevance, rather than simply matched text or matched formatting.”. See also above citations directed to content and title structural determinations in Fagin.)
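	For illustration only, the per-section counting of search terms described in the quoted passage of Datar can be sketched as follows. The sketch assumes that each section already has a numeric weight; Datar derives such weights from user activity logs, which is not modeled here. The section names, weights, and sample text are hypothetical.

```python
from collections import Counter


def section_term_counts(sections, terms):
    """Count, per section, how many tokens match a search term (case-insensitive)."""
    wanted = {t.lower() for t in terms}
    counts = Counter()
    for name, text in sections.items():
        counts[name] = sum(1 for tok in text.lower().split() if tok in wanted)
    return counts


def weighted_score(counts, weights):
    """Combine per-section counts using per-section weights (assumed to be given)."""
    return sum(weights.get(name, 0.0) * n for name, n in counts.items())


# Hypothetical document sections and section weights.
sections = {
    "title": "Pump maintenance schedule",
    "body": "Inspect the pump weekly and log pump pressure",
    "footer": "Contact maintenance desk",
}
weights = {"title": 3.0, "body": 1.0, "footer": 0.5}

counts = section_term_counts(sections, ["pump", "maintenance"])
score = weighted_score(counts, weights)  # 3.0*2 + 1.0*2 + 0.5*1 = 8.5
```

	A match in the title section contributes more to the score than the same match in the footer section, consistent with the footer/title comparison in the quoted passage.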
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the structured document section and section title functionality of Datar into the word feature score aggregation system of Fagin by programming the instructions of Fagin (Fagin: ¶0014) to adjust search results based upon sections, section title, and their associated words, as taught by Datar. Both systems are directed to document and word scoring/criteria (Fagin: ¶0041; Datar: col. 1, ll. 58-60) and aggregation of the scores for search/ranking (Fagin: abstract; Datar: col. 2, l. 64 through col. 3, l. 9). An advantage obtained through adjusting search results 

With regard to dependent claim 5, which depends upon dependent claim 4,
	Fagin, Bull and Datar teach the system of claim 4, wherein the corresponding section and the corresponding section title for the individual ones of the key groups of words is determined based on upper level relationships within the hierarchical structure of the document. (Datar: col. 2, ll. 49-63 reads, “FIG. 1 is a block diagram of an example system 100 for assigning weights in a document object model. A document object model generally represents a hierarchy of nodes in an electronic document (e.g., HTML elements, XML elements, etc.). A DOM node represents a fundamental component of a hypertext electronic document, such as a title, a header, footer, an advertisement, an image, a menu, etc. As described below, the system 100 can generate one or more document object models for a collection of documents. In addition, the system 100 can monitor and/or analyze user activity logs to determine a weighted score for each DOM node in the generated document object models. The weighted score can be used in a process that selects for a user 

With regard to dependent claim 6, which depends upon dependent claim 5,
	Fagin, Bull and Datar teach the system of claim 5, wherein the document further includes footnotes, and the sets of context words corresponding to the individual ones of the key groups of words further include words within a corresponding footnote. (Datar: col. 8, ll. 21-38 – reads in part, “In some implementations, the search query additionally includes one or more weight criteria which require that the one or more search terms are located in a section selected from a header section, a caption section, an abstract section, a footnote section,  a summary section,  or a title section. For example, the user may indicate search terms and a condition of finding the term in the summary section of the document.” See also above citations for claim 4 regarding Datar col. 7, l. 57 through col. 8, l. 9 discussion of word functionality in footers, i.e. “footnotes” of a document.)


With regard to dependent claim 7, which depends upon dependent claim 6,
	Fagin, Bull and Datar teach the system of claim 6, wherein the corresponding footnote for the individual ones of the key groups of words is determined based on lower level relationships within the hierarchical structure of the document. (Datar: col. 2, ll. 49-63 reads, “FIG. 1 is a block diagram of an example system 100 for assigning weights in a document object model. A document object model generally represents a hierarchy of nodes in an electronic document (e.g., HTML elements, XML elements, etc.). A DOM node represents a fundamental component of a hypertext electronic document, such as a title, a header, footer, an advertisement, an image, a menu, etc. As described below, the system 100 can generate one or more document object models for a collection of documents. In addition, the system 100 can monitor and/or analyze user activity logs to determine a weighted score for each DOM node in the generated document object models. The weighted scores can be used in a process that selects for a user semantically relevant search results in response to a search query submitted by the user.” The examiner notes that the structure(s) of certain, e.g. footer, nodes are understood to be “lower level relationships”. See also above citations directed to footnotes.)


With regard to dependent claims 14-17,
	Claims 14-17 are each similar to claims 4-7 respectively and are each rejected under a similar respective rationale.

Claims 9-10 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Fagin in view of Bull, in further view of US Pre-Grant Publication 2020/0210526 to Leibovitz.
With regard to dependent claim 9, which depends upon independent claim 1,
	Fagin and Bull teach the system of claim 1, wherein the word feature scores are generated based on processing of the document information through a context-sensitive document-to-vector model (Fagin: ¶0061 – eigenvector computation for rank aggregation. See Fagin background, ¶¶0012-0013, discussion of probabilistic modeling for document ranking being at the center of the customizability, e.g. user requirements, of the teachings. See also discussion of generation of a document feature score aggregation at ¶¶0013-0015, as discussed above.).
	Fagin and Bull do not fully and explicitly teach the context-sensitive document-to-vector model including an attention distribution, a partial summary, and a vocabulary distribution.  
	Leibovitz teaches a system with a context-sensitive document-to-vector model including an attention distribution (Leibovitz: ¶0044 – reads: “In some embodiments, at a step 302, analyzer 206 is configured for first performing a word embedding into vectors with respect to each word, based on semantic similarities. In some embodiments, the word embedding may be performed using a pre-trained embedding model, such as Word2Vec (see, T. Mikolov et al.; "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013), or GloVe (see Jeffrey Pennington, Richard Socher, and Christopher D. Manning; 2014; "GloVe: Global Vectors for Word Representation").” Examiner notes the abstract’s discussion of the applicability of the reference’s teachings to attention distribution. This connection with scores/weights/vectors is further made clear in the attention distribution weight discussion in ¶¶0006-0009.), a partial summary (Leibovitz: ¶0046 reads, “In some embodiments, at a next step 304, a bidirectional GRU (see, D. Bandanau et al.; 2014; "Neural machine translation by jointly learning to align and translate"; arXiv preprint arXiv:1409.0473) is applied to get annotations of words by summarizing information from both directions in the sentence, thereby incorporating the contextual information in the annotation.”), and a vocabulary distribution. (Leibovitz: ¶0042 reads in part: “As can be seen, the network consists of a … sentence encoder …, wherein a document vector v is derived progressively from word vectors and sentence vectors.” Examiner notes the further discussion through the paragraphs above that involve combinations that would read upon the broadest reasonable interpretation of those claimed herein, i.e. “vocabulary distribution” with plain meaning interpreted in view of the below claim 10 limitations.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated attention distribution, partial summary and vocabulary distribution of Leibovitz into the word feature score aggregation system of Fagin by programming the instructions of Fagin (Fagin: ¶0014) to consider attention distribution, partial summary and vocabulary distribution in vector-modeling documents, as taught by Leibovitz. Both systems are directed to document and word scoring/criteria (Fagin: ¶0041; Leibovitz: ¶0048) and aggregation of the scores for search/ranking (Fagin: abstract; Leibovitz: ¶0048). Considering attention distribution, partial summary and vocabulary distribution in vector-modeling documents would have been advantageous to implement in the word feature score aggregation system of Fagin. In particular, the motivation to combine the Fagin and Leibovitz references would have been to improve the speed and turnaround time 

With regard to dependent claim 10, which depends upon dependent claim 9,
	Fagin, Bull and Leibovitz teach the system of claim 9, wherein: 
	the attention distribution facilitates generation of context-aware vector representation of words (Leibovitz: ¶0044 – “In some embodiments, at a step 302, analyzer 206 is configured for first performing a word embedding into vectors with respect to each word, based on semantic similarities. In some embodiments, the word embedding may be performed using a pre-trained embedding model, such as Word2Vec (see, T. Mikolov et al.; "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013), or GloVe (see Jeffrey Pennington, Richard Socher, and Christopher D. Manning; 2014; "GloVe: Global Vectors for Word Representation").” See also ¶0016, which reads, “In some embodiments, the program instructions are further executable to calculate, and in the case of the method, the method further comprises calculating, a vector representation of each of said plurality of electronic documents, based, at least in part, on a weighted sum of said sentence attention weights, wherein said classifying is based, at least in part, on said vector representation.” See also citations regarding context, words in Fagin, as well as the discussion of the combination of references’ teachings set forth above in support of the grounds of rejection of claim 9.); 
	the partial summary facilitates validation of the attention distribution (Leibovitz: ¶0046 reads, “In some embodiments, at a next step 304, a bidirectional GRU (see, D. Bandanau et al.; 2014; "Neural machine translation by jointly learning to align and translate"; arXiv preprint arXiv:1409.0473) is applied to get annotations of words by summarizing information from both directions in the sentence, thereby incorporating the contextual information in the annotation.” Examiner notes the further discussion, continuing through ¶0048, of association with “attention distribution validation”.); and
 	the vocabulary distribution facilitates combination of multiple words into a phrase. (Leibovitz: ¶0042 reads in part: “As can be seen, the network consists of a … sentence encoder …, wherein a document vector v is derived progressively from word vectors and sentence vectors.” Examiner notes the further discussion through the paragraphs above that involve combinations that would read upon the broadest reasonable interpretation of those claimed herein.)
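	For illustration only, the attention-weighted document vector described in the quoted passage of Leibovitz (¶0016) can be sketched as follows. The sketch assumes that the attention weights are obtained by a softmax over per-sentence scores that are supplied as inputs; how Leibovitz computes those scores is not modeled. The function names and sample values are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def document_vector(sentence_vectors, attention_scores):
    """Weighted sum of sentence vectors using softmax attention weights."""
    weights = softmax(attention_scores)
    dim = len(sentence_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vectors))
        for i in range(dim)
    ]


# Hypothetical sentence vectors with equal attention scores.
sentence_vectors = [[1.0, 0.0], [0.0, 1.0]]
doc_vec = document_vector(sentence_vectors, [0.0, 0.0])  # [0.5, 0.5]
```

	Raising the attention score of one sentence moves the document vector toward that sentence's vector, which is the sense in which the resulting representation depends on the attention distribution.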

With regard to dependent claims 19-20,
	Claims 19-20 are each similar in scope to claims 9-10 respectively and are each rejected under a similar respective rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

	-US Pre-Grant Publication 2016/0070731 to Chang for hierarchical categorization
	-US Pre-Grant Publication 2010/0199168 to Balinsky for structural document analysis, processing

	-US Pre-Grant Publication 2004/0139059 to Conroy for rule-based keyword extraction from documents
	-US Pre-Grant Publication 2012/0233127 to Solmer for feature-based vector generation for documents
	-US Pre-Grant Publication 2017/0322939 to Byron for query document vector based relevance ranking
	-US Pre-Grant Publication 2011/0251839 to Achtermann for word based document scoring
	-US Pre-Grant Publication 2010/0049498 to Cao for distributions of vocab, attention
	-US Pre-Grant Publication 2018/0300315 to Leal for feature-based vector generation for documents
	-US Patent No. 10,909,157 to Paulus for document text vector encoding

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAL L BOGACKI whose telephone number is (571)270-5125.  The examiner can normally be reached on Monday - Thursday 9:30am - 7:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMES K TRUJILLO can be reached on (571)272-3677.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


MICHAL BOGACKI
Examiner
Art Unit 2157



/M.L.B./ Examiner, Art Unit 2157
/MATTHEW ELL/ Primary Examiner, Art Unit 2145