DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action has been issued in response to Applicant’s Communication of application S/N 15/838,700 filed on January 15, 2021. Claims 1, 2, 4-7, 9, 10, 14, 16, 17, and 21-29 are currently pending with the application.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 2, 4-6, 17, 21-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
With respect to claims 1 and 17, the claims recite a method performed by one or more computers, the method comprising: accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document; accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of multiple second documents; generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document, wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated 
The limitations directed towards generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document, wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated by: determining, for each of multiple different categories of objects, a similarity measure based on an amount of objects in the category that are in common between (i) the first set of objects that are included in, are referenced by, or were used to create the first document and (ii) the second set of objects that are included in, are referenced by, or were used to create the second document; generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure based on an amount of objects in a category of objects in common between the first document and 
For example, but for the limitations stating “accessing”, “providing”, “by the one or more computers”, and “client device over a computer network” in claim 1 and 17 and “one or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations” in claim 17,  generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document, wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated by: determining, for each of multiple different categories of objects, a similarity measure based on an amount of objects in the category that are in common between (i) the first set of objects that are included in, are referenced by, or were used to create the first document and (ii) the second set of objects that are included in, are referenced by, or were used to create the second document; generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure based on an amount of objects in a category of objects in common between the first document and second document is weighted using the weight value corresponding to the category of objects, determining the similarity score for the second selecting, by the one or more computers, a subset of the second documents based on the similarity scores encompasses a user mentally evaluating the similarity between the objects and metadata of the objects that define the document’s characteristics between the first document and a set of second documents based on mentally assigned weights to categories of objects.  
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, these claims recite an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, limitations reciting accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document, accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of multiple second documents, and providing, by the one or more computers, data indicating the selected subset of the second documents to a client device over a computer network, by the one or more computers, client device over a computer network in claim 1 and 17, and one or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations in claim 17.  By the one or more computers, client device over a computer network in claim 1 and 17, and one or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations in claim 17 are recited at a high-level of generality (i.e., as a generic computer components performing a generic computer function of searching via accessing and providing) such that it amounts to no more than mere instructions to 
These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “by the one or more computers”, and “client device over a computer network” in claim 1 and 17 and “one or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations” in claim 17,  are recited at a high level of generality to apply the exception using generic computer components. Accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document, accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents Versata (see MPEP 2106.05(d))). Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. 
To further elaborate, the additional limitations accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document, accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of second documents, and providing, by the one or more computers, data indicating the selected subset of the second documents to a client device over a computer network do not impose a meaningful limit on the judicial exception and merely confine the claims to a particular technological environment or field of use. These claims are not patent eligible.
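For illustration only, the scoring steps recited in claims 1 and 17 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `DocObjects`, `similarity_score`, and `select_subset` are hypothetical, the per-category similarity measure is assumed to be the count of objects in common, the overall score is assumed to be the sum of the weighted measures, and the subset is assumed to be the top-scoring documents.

```python
from typing import Dict, List, Set

# Hypothetical representation: a document's metadata maps a category name
# (e.g. "metrics", "filters") to the set of object identifiers it uses.
DocObjects = Dict[str, Set[str]]


def similarity_score(first: DocObjects, second: DocObjects,
                     weights: Dict[str, float]) -> float:
    """Weighted similarity of `second` with respect to `first`."""
    score = 0.0
    for category, weight in weights.items():
        # Similarity measure (assumed): number of objects in common
        # between the two documents within this category.
        common = first.get(category, set()) & second.get(category, set())
        # Weighted similarity measure: apply the category's weight.
        score += weight * len(common)
    return score


def select_subset(first: DocObjects, candidates: Dict[str, DocObjects],
                  weights: Dict[str, float], top_n: int = 2) -> List[str]:
    """Return the names of the `top_n` highest-scoring second documents."""
    scores = {name: similarity_score(first, objs, weights)
              for name, objs in candidates.items()}
    # Sort by descending score; break ties by name for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]
```

In this sketch, a category with a larger weight (for example, shared metrics) contributes more to the score than a category with a smaller weight (for example, shared filters).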
With respect to claim 2, the limitations are directed towards determining, based on the first metadata, first identifiers for the objects in the first set of objects, and determining, based on the second metadata, second identifiers for the objects in the second set of objects that are included in or were used to create content of a particular document in the set of multiple second documents, wherein generating the similarity scores comprises determining a similarity score for the particular document based on identifying one or more matches between the first identifiers and the second identifiers. These elements are merely insignificant extra-solution activity and fail the practical application prong test. Determining identifiers of metadata objects appears to be a function of data collection, recognition, and 
With respect to claim 4, the limitations are directed towards determining first objects that are referenced by the first document, determining that the first objects depend on additional objects that are not referenced by the first document, and determining, as the first set of objects, a combined set of objects that includes the first objects and the additional objects. These elements further elaborate the abstract idea, and a person, mentally and/or with pen and paper, can determine whether an object is referenced in a document and determine object dependencies. This element is merely insignificant extra-solution activity that fails the practical application test; it appears to be a function of data recognition and organization and does not integrate the abstract idea into a practical application. Therefore claim 4 does not recite additional limitations that amount to significantly more than the identified judicial exception.
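For illustration only, the combined set of objects described for claim 4 can be sketched as follows. The function name `combined_object_set` and the `depends_on` mapping are hypothetical, and the sketch assumes that dependencies are followed transitively, which the claim language as summarized above does not state.

```python
from typing import Dict, Set


def combined_object_set(referenced: Set[str],
                        depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the referenced objects plus every object they depend on.

    `depends_on` maps an object identifier to the identifiers of the
    objects it directly depends on (a hypothetical structure).
    """
    combined = set(referenced)
    pending = list(referenced)
    while pending:
        obj = pending.pop()
        for dep in depends_on.get(obj, set()):
            # Add each dependency once, then examine its own dependencies
            # (assumption: dependencies are followed transitively).
            if dep not in combined:
                combined.add(dep)
                pending.append(dep)
    return combined
```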
With respect to claim 5, the limitations are directed towards the metadata repository includes object definitions for objects of multiple different object types wherein generating the similarity scores comprises determining a similarity score for a particular document of the second documents based on determining that the first document and the particular document each reference objects of a same object type. These elements further elaborate the abstract idea, and a person, mentally and/or with pen and paper, can derive similarity scores based on documents referencing similar object types. Therefore claim 5 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 6, the limitations are directed towards receiving data indicating user input to select the first document; receiving data indicating that a user accessed the first document; receiving data indicating an end of the first document is reached; or determining that the first document is in a document collection of the first user. These additional limitations appear to be insignificant extra-solution activity and are interpreted to be well-understood, routine, and conventional (Receiving or transmitting data over a network, e.g., using the internet to gather data, Symantec (see MPEP 2106.05(d))). Therefore claim 6 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 7, the limitations are directed towards generating the similarity scores comprises determining a similarity score for a particular document of the second documents based on data indicating a frequency of access of the second document. These elements further elaborate the abstract idea, and a person, mentally and/or with pen and paper, can generate similarity scores comprising determining a similarity score for a particular document of the second documents based on data indicating a frequency of access of the second document. Therefore, claim 7 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 21, the limitations are directed to the metadata repository is an object database that includes identifiers for objects and object definitions for the objects, wherein the first document and the second document were each generated using the object definitions in the metadata repository. These elements merely link the use of the identified judicial exception to a particular technological environment and do not integrate the abstract idea into a practical application. Therefore, claim 21 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 22, the limitations are directed to the objects in the first set of objects comprise components of the first document; and wherein each of the second sets of objects includes objects that comprise components of the corresponding second document. These elements are merely insignificant extra-solution activity and do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claim 24, the limitations are directed to the first set of objects includes one or more objects selected by a user, from among objects registered in the metadata repository, to define characteristics of or content of the first document. These elements are merely insignificant extra-solution activity and do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.
With respect to claim 25, the limitations are directed to generating the similarity scores comprises determining a similarity score for a particular document based on an amount of matches between (i) identifiers for objects used to generate displayable elements of the first document and (ii) identifiers for objects used to generate displayable elements of the second document. These elements further elaborate the abstract idea, and a person, mentally and/or with pen and paper, can generate similarity scores comprising determining a similarity score for a particular document based on an amount of matches between (i) identifiers for objects used to generate displayable elements of the first document and (ii) identifiers for objects used to generate displayable elements of the second document. Therefore, claim 25 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 26, the limitations are directed to tracking, while the first document is created or edited, which objects from the metadata repository used to create or edit the first document and storing the first metadata identifying the objects used to create or edit the first document in a data 
With respect to claim 27, the limitations are directed to the multiple different categories of objects include at least one of datasets, attributes, metrics, filters, prompts, charts, or graphs. These elements are merely insignificant extra-solution activity and do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.
With respect to claim 28, the limitations are directed towards receiving, over the computer network, data indicating user selection of the first document using a user interface of the client device; and wherein the data indicating the selected subset of the second documents is provided to the client device over a computer network in response to receiving the data indicating the user selection of the first document. These additional limitations appear to be insignificant extra-solution activity and are interpreted to be well-understood, routine, and conventional (Receiving or transmitting data over a network, e.g., using the internet to gather data, Symantec (see MPEP 2106.05(d))). Therefore claim 28 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.
With respect to claim 29, the limitations are directed to wherein the metadata repository is a shared, centralized metadata repository shared by a plurality of users in an enterprise computing system, the first document and the second documents each being generated from the objects defined in the shared, centralized metadata repository, the first set of objects and the second sets of objects being sets of objects defined in the shared, centralized metadata repository. These elements are merely linking the use of the identified judicial exception to a particular technological environment and does 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 9, 17, 21-24, and 26-29 are rejected under 35 U.S.C. 103 as being unpatentable over Gorbansky et al. (US 20160098405 A1) hereinafter Gorbansky, in view of Baras et al. (U.S. Patent No.: US 8285734 B1) hereinafter Baras, and further in view of Shukla et al. (US Patent No.: US 9378432 B2) hereinafter Shukla.
As to claim 1:
Gorbansky (US 20160098405 A1) discloses:
A method performed by one or more computers [Paragraph 0153 teaches all or a portion of each block, or a combination of blocks, may be implemented as computer program instructions] the method comprising: 
accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document [Paragraph 0017 teaches the document curation system also includes an indexer. The indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0062 and Figure 1 teach the indexing portion 100 of the document curation system automatically analyzes documents in the document storage system 102, including automatically identifying objects within the documents, according to the documents' native object models. The indexing portion 100 parses the document to identify objects in the document. For each object, the indexing portion 100 of the document curation system automatically identifies metadata, assigns the object a relevance score and indexes the objects to support future searches. The relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0103 teaches a metadata generator 118 is configured to generate metadata about each object identified by the document analyzer 110. Paragraph 0104 teaches metadata may further include information identifying an author of the object and information identifying each user who has used the object in a newly created document. Paragraph 0107 teaches the metadata may also include usage data for objects and documents, such as number of times included in a composite document. 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to the first set of objects in a first 
accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of multiple second documents [Paragraph 0017 teaches the document curation system also includes an indexer. The indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0062 and Figure 1 teach the indexing portion 100 of the document curation system automatically analyzes documents in the document storage system 102, including automatically identifying objects within the documents, according to the documents' native object models. The indexing portion 100 parses the document to identify objects in the document. For each object, the indexing portion 100 of the document curation system automatically identifies metadata, assigns the object a relevance score and indexes the objects to support future searches. The relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0103 teaches a metadata generator 118 is configured to generate metadata about each object identified by the document analyzer 110. Paragraph 0104 teaches metadata may further include information identifying an author of the object and information identifying each user who has used the object in a newly 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to be the first set of objects in a first document compared to a second set of objects in a second set of documents. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of objects and the objects that appear in other documents are interpreted to be the second set of objects. The document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents and the documents containing the objects that appear in the other documents are interpreted to be the set of multiple second documents.]; 
determining, for each of multiple different categories of objects, a similarity measure based on an amount of objects in the category are in common and each similarity measure based on an amount of objects in a category of objects in common between the first document and second document [Paragraph 0013 teaches hash calculator automatically calculates a hash value based on each identified object. Paragraph 0056 teaches hash value for each generic object, thereby enabling the document curation system to easily identify content-wise identical objects, even if the objects would be displayed differently. Paragraph 0085 teaches a hash calculator 114 is configured to calculate a hash value for each object identified by the document analyzer 110. The hash value is a numeric value. Hash values may be stored in any suitable format, such as unsigned longwords, hexadecimal or encoded as alphanumeric strings. The hash calculator 114 calculates the hash value based on contents of the object after it has been normalized. Therefore, objects that may be rendered differently according to their native object models, yet contain identical semantic content, have identical hash values. Paragraph 0093 
Note: The examiner interprets hash values based on content of the objects as part of documents that can stored in any suitable format to be the claimed multiple different categories of objects, wherein the hash values are mapped to objects that are semantically similar, therefore hash values are interpreted categories that are used to identify similar objects. Identify identical objects, due to their identical hash values allowing for the number of identical objects in the document storage system 102 to be counted, and the relevance score calculated based on the number (absolute number) of identical objects, on a ratio (relative number) of the number of identical objects to the total number of objects in the document storage system 102 or according to some other suitable formula is interpreted to be the claimed a similarity measure based on an amount of objects in the category are in common between, wherein the relevance score is interpreted to be the claimed similarity measure and counting objects that have identical hash values is interpreted to be the claimed an amount of objects in the category are in common. In the context of the cited prior art, the relevance score is also determined is calculated based at least in part on frequency with which identical objects exist in other documents (second documents) and the document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents in the document  between
the first set of objects that are included in, are referenced by, or were used to create the first document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the object used to find the number of identical objects in other documents to be the claimed first set of objects and the document containing (included in) that object is the claimed first document.] and 
(ii) the second set of objects that are included in, are referenced by, or were used to create the second document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the other identical objects to be the claimed second set of objects and the other documents to be the claimed second document.]; 

selecting, by the one or more computers, a subset of the second documents based on the similarity scores [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. Paragraph 0139 teaches information in the index database 108 about documents and objects may be used to calculate relevance scores and, therefore, affect whether the documents and objects are returned in response to searches.
The examiner interprets the display of documents relevant to a first document returned as a result of the search criterion “patent” to be a display of the claimed selected subset of the second documents, wherein the display of documents relevant to a first document returned as a result of the search criterion “patent” is dependent on the relevance scores of the documents and objects and is interpreted to be the claimed based on the similarity scores.]; and 
providing, by the one or more computers, data indicating the selected subset of the second documents to a client device over a computer network [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. Paragraph 0139 teaches information in the index database 108 about documents and objects may be used to calculate relevance scores and, therefore, affect whether the documents and objects are returned in response to searches.].
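For illustration only, the hash-based counting of identical objects that the examiner relies on in Gorbansky (Paragraphs 0085 and 0093 as cited above) can be sketched as follows. This is a minimal sketch under stated assumptions, not Gorbansky's implementation: the function names are hypothetical, normalization is assumed to be whitespace collapsing and lowercasing, SHA-256 is an assumed hash function, and the relevance score is reduced to either the absolute count of identical objects or its ratio to the total number of objects.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def normalize(content: str) -> str:
    """Assumed normalization: collapse whitespace and lowercase."""
    return " ".join(content.split()).lower()


def object_hash(content: str) -> str:
    """Hash the normalized content so equivalent objects hash identically."""
    return hashlib.sha256(normalize(content).encode("utf-8")).hexdigest()


def identical_object_counts(documents: Dict[str, List[str]]) -> Counter:
    """Count how many stored objects share each hash value."""
    return Counter(object_hash(obj)
                   for objects in documents.values()
                   for obj in objects)


def relevance(content: str, documents: Dict[str, List[str]],
              relative: bool = False) -> float:
    """Absolute count of identical objects, or its ratio to all objects."""
    counts = identical_object_counts(documents)
    identical = counts[object_hash(content)]
    if not relative:
        return float(identical)
    total = sum(counts.values())
    return identical / total if total else 0.0
```

In this sketch, two objects that differ only in spacing or letter case produce the same hash value and are therefore counted as identical.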

Gorbansky discloses most of the limitation as set forth in claim 1 but does not appear to expressly disclose generating, by the one or more computers, a similarity score for each of the multiple 
Baras discloses:
generating, by the one or more computers, a similarity score for each of the multiple second documents[Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.], wherein each similarity score indicates similarity of the corresponding second document with respect to the first document [Column 5 Lines 54-56 teach the similarity score may indicate how similar a second document is to the first document.], wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated [Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.] by: 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, by incorporating a similarity score for each of compared second documents with respect to a compared first document, as taught by Baras (see Column 5 Lines 49-52, Column 5 Lines 54-56, and, Column 5 Lines 65-61), because both applications are directed to identifying and processing 

Gorbansky and Baras disclose most of the limitations as set forth in claim 1 but do not appear to expressly disclose generating weighted similarity measures by applying weights for the categories of objects to the similarity measures, wherein each similarity measure is weighted using a weight value corresponding to the category of objects; and determining the similarity score based on the weighted similarity measures for the multiple different categories of objects.
Shukla discloses:
generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure is weighted using the weight value corresponding to the category of objects [Column 2 Lines 62-66 teach the categories (e.g., nodes) of the hierarchies may be assigned to each of the objects, such as by simply assigning one or more categories of a hierarchy that are related to an object and/or by weighting the categories in the hierarchy based on how relevant those categories are to the object. Column 12 Lines 41-51 teaches based on the distance and siblings factors, an effect that each category in the first and second hierarchy has on the other categories in the respective first or second hierarchy may be calculated (block 508). For example, the category effect module 118 may calculate a distance factor and a siblings factor for each of categories 206, 208, 210, 212, 214, 216, 218, 220 relative to the others. Based on the distance and siblings factors calculated for each of the categories, the category effect module 118 may calculate an overall effect that each of the categories 206, 208, 210, 212, 214, ; and determining the similarity score for the second document with respect to the first document based on the weighted similarity measures each of the multiple different categories of objects [Column 2 Line 52 teaches objects compared are two written articles. Figure 5 is a flow diagram depicting a procedure in an example implementation in which a first and second hierarchy of categories are formed to represent a first and second object. Figure 5:510 teaches computing a similarity score. Column 12 Lines 52-54 teaches using the calculated effect of each category in the first and second hierarchies, a similarity score may be computed to determine a similarity between the first and second object. 
For example, the effect combining module 120 may combine the effect that each category of hierarchy 306 has on the other categories in that hierarchy. To do so, the effect combining module 120 may combine relevance vectors computed for each of the categories of hierarchy 306. Specifically, the combining may be performed by taking a weighted sum of the relevance vectors computed for hierarchy 306. Note: The computed similarity score between first two written articles incorporating each objects respective hierarchy of 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky and Baras, by incorporating category hierarchies representing objects, applying weights to the categories, calculating a relevance vector, and a similarity score using the weights and relevance vector, as taught by Shukla (see Column 2 Line 52, Column 2 Lines 62-66, Column 12 Lines 41-51, Figure 2, and Figure 5), because the three applications are directed to identifying and processing objects included in documents; configuring the document curation system to incorporate category hierarchies representing objects, applying weights to the categories, calculating a relevance vector, and a similarity score using the weights and relevance vector improves document recommendation, a feature widely used in search engines, product recommendation features of e-commerce websites, news websites, and content suggestions (see Shukla Column 3 Lines 37-40).
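Note: For illustration only, the computation recited in the claim (a per-category similarity measure based on objects in common, a weight applied per category, and a score determined from the weighted measures) can be sketched as follows. The category names, the weight values, the normalization by the first document's object count, and the function names are assumptions made for this sketch; they are not drawn from the claims or from Gorbansky, Baras, or Shukla.

```python
from typing import Dict, List, Set

# Hypothetical representation: each document maps a category name to the set
# of object identifiers in that category.
Document = Dict[str, Set[str]]


def category_similarity(first: Set[str], second: Set[str]) -> float:
    """Share of the first document's objects in a category that the second also has."""
    if not first:
        return 0.0
    return len(first & second) / len(first)


def similarity_score(first_doc: Document, second_doc: Document,
                     weights: Dict[str, float]) -> float:
    """Sum of per-category similarity measures, each scaled by its category weight."""
    score = 0.0
    for category, weight in weights.items():
        measure = category_similarity(first_doc.get(category, set()),
                                      second_doc.get(category, set()))
        score += weight * measure
    return score


def select_similar(first_doc: Document, candidates: Dict[str, Document],
                   weights: Dict[str, float], top_n: int) -> List[str]:
    """Return the names of the top_n candidate documents by similarity score."""
    scored = {name: similarity_score(first_doc, doc, weights)
              for name, doc in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]
```

Under these assumptions, a second document sharing half of the first document's objects in one category and all of them in another receives the weighted sum of 0.5 and 1.0 under the respective category weights.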

As to claim 9:
Gorbansky discloses:
A system comprising: 
one or more computers; and 
one or more computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations comprising:
identifying, by the one or more computers, a first document that is indicated on a user interface of a client device [Paragraph 0060 teaches the document storage system 102 includes an ; 
accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document [Paragraph 0017 teaches the document curation system also includes an indexer. The indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0062 and Figure 1 teach the indexing portion 100 of the document curation system automatically analyzes documents in the document storage system 102, including automatically identifying objects within the documents, according to the documents' native object models. The indexing portion 100 parses the document to identify objects in the document. For each object, the indexing portion 100 of the document curation system automatically identifies metadata, assigns the object a relevance score and indexes the objects to support future searches. The relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to be the first set of objects in a first document compared to a second set of objects in a second set of documents. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of objects. The document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents.];
accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively  are included in, are referenced by, or were used to create documents in a set of multiple second documents [Paragraph 0017 teaches the document curation system also includes an indexer. The indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0062 and Figure 1 teach the indexing portion 100 of the document curation system automatically analyzes documents in the document storage system 102, including automatically identifying objects within the documents, according to the documents' native object models. The 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to be the first set of objects in a first document compared to a second set of objects in a second set of documents. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of objects and the objects that appear in other documents are interpreted to be the second set of objects. The document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents and the documents containing the objects that appear in the other documents are interpreted to be the set of multiple second documents.];
obtaining, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document, wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated by [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0069 teaches relevance scores may also be calculated and assigned to documents and pages. Paragraph 0073 teaches if an existing document has a relevance score with positive contributions as a result of having been accessed many times or recently, the relevance score of the content-wise identical new version of the document may be given a positive relevance score, or its relevance score may be increased, by a value calculated from the relevance score of the existing document. Paragraph 0093 and Figure 1 teach the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. Paragraph 0094 teaches the relevance score is calculated based at least in part on frequency with which similar, but not identical, objects exist in other documents in the document storage system 102. Paragraph 0101 and Figure 1 teach In addition to calculating a relevance score for each object, the document analyzer 110 also calculates a relevance score for each document found in the document storage system 102. The relevance score for a document may be calculated as an aggregation of the relevance scores of objects found within the document. FIG. 
8 and Paragraph 0138 teaches to the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. The user can click a check box control, exemplified at 810 and 812, to the left of the document icon to command the search results user interface 508 to select the object for some further operation, such as adding the object to a topic 814 or to a clipboard 816 or viewing version information, display any identical object(s) instead of, or in addition to, the already displayed object 802. Note: The cited each found object 802 is interpreted to read on the claimed first document and the cited exemplified source :
determining, for each of multiple different categories of objects, a similarity measure based on an amount of objects in the category are in common and each similarity measure based on an amount of objects in a category of objects in common between the first document and second document [Paragraph 0013 teaches hash calculator automatically calculates a hash value based on each identified object. Paragraph 0056 teaches hash value for each generic object, thereby enabling the document curation system to easily identify content-wise identical objects, even if the objects would be displayed differently. Paragraph 0085 teaches a hash calculator 114 is configured to calculate a hash value for each object identified by the document analyzer 110. The hash value is a numeric value. Hash values may be stored in any suitable format, such as unsigned longwords, hexadecimal or encoded as alphanumeric strings. The hash calculator 114 calculates the hash value based on contents of the object after it has been normalized. Therefore, objects that may be rendered differently according to their native object models, yet contain identical semantic content, have identical hash values. Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. As noted, the indexing portion 100 of the document curation system can identify identical objects, due to their identical hash values. Thus, the number of identical objects in the document storage system 102 may be counted, and the relevance 
Note: The examiner interprets hash values based on content of the objects as part of documents that can be stored in any suitable format to be the claimed multiple different categories of objects, wherein the hash values are mapped to objects that are semantically similar; therefore, hash values are interpreted to be categories that are used to identify similar objects. Identifying identical objects by their identical hash values, which allows the number of identical objects in the document storage system 102 to be counted and the relevance score to be calculated based on the number (absolute number) of identical objects, on a ratio (relative number) of the number of identical objects to the total number of objects in the document storage system 102, or according to some other suitable formula, is interpreted to be the claimed a similarity measure based on an amount of objects in the category are in common between, wherein the relevance score is interpreted to be the claimed similarity measure and counting objects that have identical hash values is interpreted to be the claimed an amount of objects in the category are in common. In the context of the cited prior art, the relevance score is also calculated based at least in part on the frequency with which identical objects exist in other documents (second documents), and the document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents in the document storage system, where the objects are reasonably represented by the cited hash values (categories), therefore reading on the claimed similarity measure based on an amount of objects in a category of objects in common between the first document and second document.] between
the first set of objects that are included in, are referenced by, or were used to create the first document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the object used to find the number of identical objects in other documents to be the claimed first set of objects and the document containing (included in) that object is the claimed first document.] and 
(ii) the second set of objects that are included in, are referenced by, or were used to create the second document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the other identical objects to be the claimed second set of objects and the other documents to be the claimed second document.]; 
selecting, by the one or more computers, a subset of the second documents based on the similarity scores [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well 
The examiner interprets the display of documents relevant to a first document returned as a result of the search criterion “patent” to be a display of the claimed selected subset of the second documents, wherein the display of documents relevant to a first document returned as a result of the search criterion “patent” is dependent on the relevance scores of the documents and objects and is interpreted to be the claimed based on the similarity scores.]; and 
providing, by the one or more computers, data indicating the selected subset of the second documents to the client device for display on the user interface of the client device [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. Paragraph 0139 teaches information in the index database 108 about documents and objects may be used to calculate relevance scores and, therefore, affect whether the documents and objects are returned in response to searches.]
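
Note: For illustration only, the hash-based counting of identical objects that the examiner relies on above (normalize each object, hash the normalized content, count how many documents contain an object with the same hash, and derive an absolute or relative number) can be sketched as follows. The normalization rule (collapsing whitespace and lowercasing), the choice of SHA-256, and all function names are assumptions made for this sketch; Gorbansky does not specify them in the passages cited here.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def normalize(content: str) -> str:
    """Assumed normalization: collapse runs of whitespace and lowercase the text."""
    return " ".join(content.split()).lower()


def object_hash(content: str) -> str:
    """Hash of the normalized content, so equivalent objects share one value."""
    return hashlib.sha256(normalize(content).encode("utf-8")).hexdigest()


def identical_object_counts(documents: Dict[str, List[str]]) -> Counter:
    """For each hash value, count the number of documents containing that object."""
    counts: Counter = Counter()
    for objects in documents.values():
        # A set ensures each document contributes at most once per hash value.
        counts.update({object_hash(obj) for obj in objects})
    return counts


def relative_frequency(hash_value: str, counts: Counter) -> float:
    """Ratio of occurrences of one hash value to all counted occurrences."""
    total = sum(counts.values())
    return counts[hash_value] / total if total else 0.0
```

Objects that differ only in spacing or letter case hash to the same value under this assumed normalization, so they are counted as identical across documents.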

Gorbansky discloses most of the limitations as set forth in claim 9 but does not appear to expressly disclose generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document is generated.
Baras discloses:
generating, by the one or more computers, a similarity score for each of the multiple second documents[Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.], wherein each similarity score indicates similarity of the corresponding second document with respect to the first document [Column 5 Lines 54-56 teach the similarity score may indicate how similar a second document is to the first document.], wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated [Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.] by: 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, by incorporating a similarity score for each of compared second documents with respect to a compared first document, as taught by Baras (see Column 5 Lines 49-52, Column 5 Lines 54-56, and, Column 5 Lines 65-61), because both applications are directed to identifying and processing objects included in documents; configuring the document curation system to incorporate a similarity score for each of compared second documents with respect to a compared first document provides a technique for comparing documents that addresses exceedingly difficult document analysis and retrieval 

Gorbansky and Baras disclose most of the limitations as set forth in claim 9 but do not appear to expressly disclose generating weighted similarity measures by applying weights for the categories of objects to the similarity measures, wherein each similarity measure is weighted using a weight value corresponding to the category of objects; and determining the similarity score based on the weighted similarity measures for the multiple different categories of objects.
Shukla discloses:
generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure is weighted using the weight value corresponding to the category of objects [Column 2 Lines 62-66 teach the categories (e.g., nodes) of the hierarchies may be assigned to each of the objects, such as by simply assigning one or more categories of a hierarchy that are related to an object and/or by weighting the categories in the hierarchy based on how relevant those categories are to the object. Column 12 Lines 41-51 teaches based on the distance and siblings factors, an effect that each category in the first and second hierarchy has on the other categories in the respective first or second hierarchy may be calculated (block 508). For example, the category effect module 118 may calculate a distance factor and a siblings factor for each of categories 206, 208, 210, 212, 214, 216, 218, 220 relative to the others. Based on the distance and siblings factors calculated for each of the categories, the category effect module 118 may calculate an overall effect that each of the categories 206, 208, 210, 212, 214, 216, 218, 220 has on the others. The effect of each of the categories 206, 208, 210, 212, 214, 216, 218, 220 may be represented as a relevance vector, indicative of how relevant a category is to each of the other categories. Note: The examiner interprets the cited weighting the categories of objects to be the ; and determining the similarity score for the second document with respect to the first document based on the weighted similarity measures each of the multiple different categories of objects [Column 2 Line 52 teaches objects compared are two written articles. Figure 5 is a flow diagram depicting a procedure in an example implementation in which a first and second hierarchy of categories are formed to represent a first and second object. 
Figure 5:510 teaches computing a similarity score. Column 12 Lines 52-54 teaches using the calculated effect of each category in the first and second hierarchies, a similarity score may be computed to determine a similarity between the first and second object. For example, the effect combining module 120 may combine the effect that each category of hierarchy 306 has on the other categories in that hierarchy. To do so, the effect combining module 120 may combine relevance vectors computed for each of the categories of hierarchy 306. Specifically, the combining may be performed by taking a weighted sum of the relevance vectors computed for hierarchy 306. Note: The computed similarity score between the two written articles, incorporating each object's respective hierarchy of weighted categories and weighted relevance vectors, is interpreted to teach the claimed determining the similarity score based on the weighted similarity measures for the multiple different categories of objects.]


As to claim 17:
Gorbansky discloses:
One or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations comprising: 
accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document [Paragraph 0017 teaches the document curation system also includes an indexer. The indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0062 and Figure 1 teach the indexing portion 100 of the 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to be the first set of objects in a first document compared to a second set of objects in a second set of documents. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of objects. The document containing the object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of documents.
accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of multiple second documents 
The examiner interprets calculating the relevance score using metadata describing and identifying objects contained in a document stored in an index database to be the claimed accessing metadata identifying a set of objects, wherein the relevance score calculation using the number of times an object from a document appears in other documents is interpreted to be the first set of objects in a first document compared to a second set of objects in a second set of documents. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of ; 
generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document, wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated by [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0069 teaches relevance scores may also be calculated and assigned to documents and pages. Paragraph 0073 teaches if an existing document has a relevance score with positive contributions as a result of having been accessed many times or recently, the relevance score of the content-wise identical new version of the document may be given a positive relevance score, or its relevance score may be increased, by a value calculated from the relevance score of the existing document. Paragraph 0093 and Figure 1 teach the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. Paragraph 0094 teaches the relevance score is calculated based at least in part on frequency with which similar, but not identical, objects exist in other documents in the document storage system 102. Paragraph 0101 and Figure 1 teach In addition to calculating a relevance score for each object, the document analyzer 110 also calculates a relevance score for each document found in the document storage system 102. The relevance score for a document may be calculated as an aggregation of the relevance scores of objects found within the document. FIG. 8 and Paragraph 0138 teaches to the right : 
determining, for each of multiple different categories of objects, a similarity measure based on an amount of objects in the category are in common and each similarity measure based on an amount of objects in a category of objects in common between the first document and second document [Paragraph 0013 teaches hash calculator automatically calculates a hash value based on each identified object. Paragraph 0056 teaches hash value for each generic object, thereby enabling the document curation system to easily identify content-wise identical objects, even if the objects would be displayed differently. Paragraph 0085 teaches a hash calculator 114 is configured to calculate a hash value for each object identified by the document analyzer 110. The hash value is a numeric value. Hash values may be stored in any suitable format, such as unsigned longwords, hexadecimal or encoded as 
Note: The examiner interprets hash values based on content of the objects as part of documents that can be stored in any suitable format to be the claimed multiple different categories of objects, wherein the hash values are mapped to objects that are semantically similar; therefore, hash values are interpreted to be categories that are used to identify similar objects. Identifying identical objects by their identical hash values, which allows the number of identical objects in the document storage system 102 to be counted and the relevance score to be calculated based on the number (absolute number) of identical objects, on a ratio (relative number) of the number of identical objects to the total number of objects in the document storage system 102, or according to some other suitable formula, is interpreted to be the claimed a similarity measure based on an amount of objects in the category are in common between, wherein the relevance score is interpreted to be the claimed similarity measure and counting objects that have identical hash values is interpreted to be the claimed an amount of objects in the category are in common. In the context of the cited prior art, the relevance score is also determined is calculated   between
the first set of objects that are included in, are referenced by, or were used to create the first document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the object used to find the number of identical objects in other documents to be the claimed first set of objects and the document containing (included in) that object is the claimed first document.] and 
(ii) the second set of objects that are included in, are referenced by, or were used to create the second document [Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0093 teaches the relevance score is calculated based at least in part on frequency with which identical objects exist in other documents in the document storage system 102. The examiner interprets the other identical objects to be ; 

selecting, by the one or more computers, a subset of the second documents based on the similarity scores [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. Paragraph 0139 teaches information in the index database 108 about documents and objects may be used to calculate relevance scores and, therefore, affect whether the documents and objects are returned in response to searches.
The examiner interprets the display of documents relevant to a first document returned as a result of the search criterion “patent” to be a display of the claimed selected subset of the second documents, wherein the display of documents relevant to a first document returned as a result of the search criterion “patent” is dependent on the relevance scores of the documents and objects and is interpreted to be the claimed based on the similarity scores.]; and 
providing, by the one or more computers, data indicating the selected subset of the second documents to a client device over a computer network [Paragraph 0138 teaches the example, search criterion is "patent" 800. In the display shown in FIG. 8, two found objects 802 and 804 are shown. To the right of each found object 802 and 804, the search results user interface 508 displays information 806 about the source document, as well as information 808 about other source documents that contain identical objects. Paragraph 0139 teaches information in the index database 108 about documents and objects may be used to calculate relevance scores and, therefore, affect whether the documents and objects are returned in response to searches.]

Gorbansky discloses most of the limitations as set forth in claim 17 but does not appear to expressly disclose generating, by the one or more computers, a similarity score for each of the multiple second documents, wherein each similarity score indicates similarity of the corresponding second document with respect to the first document.
Baras discloses:
generating, by the one or more computers, a similarity score for each of the multiple second documents[Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.], wherein each similarity score indicates similarity of the corresponding second document with respect to the first document [Column 5 Lines 54-56 teach the similarity score may indicate how similar a second document is to the first document.], wherein for each of the multiple second documents the similarity score the second document with respect to the first document is generated [Column 5 Lines 49-52 teach the one or more comparison scores are then processed by the server 202. As described with reference to step 108 of FIG. 1, the server 202 may generate a similarity score for each compared second document. Column 5 Lines 65-61 teaches a list of compared documents with corresponding similarity scores is presented to the user via the user interface 204. As described above with reference to step 110 of FIG. 1, the list may be outputted in accordance with a similarity score threshold and/or a maximum output number.] by: 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as 


Gorbansky and Baras disclose most of the limitations as set forth in claim 17 but do not appear to expressly disclose generating weighted similarity measures by applying weights for the categories of objects to the similarity measures, wherein each similarity measure is weighted using a weight value corresponding to the category of objects; and determining the similarity score based on the weighted similarity measures for the multiple different categories of objects.
Shukla discloses:
generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure is weighted using the weight value corresponding to the category of objects [Column 2 Lines 62-66 teach the categories (e.g., nodes) of the hierarchies may be assigned to each of the objects, such as by simply assigning one or more categories of a hierarchy that are related to an object and/or by weighting the categories in the hierarchy based on how relevant those categories are to the object. Column 12 Lines 41-51 teaches based on the distance and siblings factors, an effect that each category in the first and second hierarchy has on the other categories in the respective first or second hierarchy ; and determining the similarity score for the second document with respect to the first document based on the weighted similarity measures each of the multiple different categories of objects [Column 2 Line 52 teaches objects compared are two written articles. Figure 5 is a flow diagram depicting a procedure in an example implementation in which a first and second hierarchy of categories are formed to represent a first and second object. Figure 5:510 teaches computing a similarity score. Column 12 Lines 52-54 teaches using the calculated effect of each category in the first and second hierarchies, a similarity score may be computed to determine a similarity between the first and second object. For example, the effect combining module 120 may combine the effect that each category of hierarchy 306 has on the other 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky and Baras, by incorporating category hierarchies representing objects, applying weights to the categories, and calculating a relevance vector and a similarity score using the weights and relevance vector, as taught by Shukla (see Column 2 Line 52, Column 2 Lines 62-66, Column 12 Lines 41-51, Figure 2, and Figure 5), because the three applications are directed to identifying and processing objects included in documents; configuring the document curation system to incorporate category hierarchies representing objects, apply weights to the categories, and calculate a relevance vector and a similarity score using the weights and relevance vector improves document recommendation, a feature widely used in search engines, product recommendation features of e-commerce websites, news websites, and content suggestions (see Shukla Column 3 Lines 37-40).
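
For illustration only, the combination of limitations addressed above (a per-category similarity measure, a weight value for each category, and a combined similarity score used to select a subset of documents) can be sketched as follows. This is a minimal sketch in Python and is not code from Gorbansky, Baras, Shukla, or the application; the function names, the category labels, and the weight values are hypothetical, and counting objects in common is only one formula that the claim language would cover.

```python
from typing import Dict, List, Set

# Hypothetical representation: category name -> set of object identifiers.
ObjectsByCategory = Dict[str, Set[str]]


def weighted_similarity(first: ObjectsByCategory,
                        second: ObjectsByCategory,
                        weights: Dict[str, float]) -> float:
    """Sum over categories of (number of objects in common) * (category weight)."""
    score = 0.0
    for category, weight in weights.items():
        # Similarity measure for this category: count of shared objects.
        common = first.get(category, set()) & second.get(category, set())
        # Weighted similarity measure, accumulated into the overall score.
        score += weight * len(common)
    return score


def select_subset(first: ObjectsByCategory,
                  candidates: Dict[str, ObjectsByCategory],
                  weights: Dict[str, float],
                  top_n: int) -> List[str]:
    """Return the identifiers of the top_n highest-scoring second documents."""
    scores = {doc_id: weighted_similarity(first, objs, weights)
              for doc_id, objs in candidates.items()}
    # Sort by descending score; break ties by document identifier.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]
```

Under these assumptions, a second document sharing two objects in a category weighted 2.0 scores higher than one sharing a single object in that category and a single object in a category weighted 1.0.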

As to claim 21:
Gorbansky discloses:
The method of claim 1, wherein the metadata repository is an object database that includes identifiers for objects and object definitions for the objects, wherein the first document and the second document were each generated using the object definitions in the metadata repository .

As to claim 22:
Gorbansky discloses:
 The method of claim 1, wherein the objects in the first set of objects comprise components of the first document; and wherein each of the second sets of objects includes objects that comprise components of the corresponding second document [Paragraph 0062 teaches the relevance score is .

As to claim 23:
Gorbansky discloses:
The method of claim 1, wherein the objects in the first set of objects comprise a first object that represents a data set, an attribute, a metric, a filter, a prompt, or a visualization, the first object defining a portion of the content of the first document [Paragraph 0054 teaches previously-created elements, such as text, paragraphs, charts, graphs, slides, spreadsheets, images, audio files, video files and the like, in electronic business documents… refer to such elements as "objects". Paragraph 0062 teaches the relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). The examiner interprets the object used to find the number of identical objects in other documents to be the claimed first set of objects and the document containing (included in) that object is the claimed first document. The examiner interprets spreadsheets, graphs, charts, paragraphs, and text referred to as objects to be the claimed data set or visualization objects. ].

As to claim 24:
Gorbansky discloses:
The method of claim 1, wherein the first set of objects includes one or more objects selected by a user, from among objects registered in the metadata repository, to define characteristics of or content of the first document [Paragraph 0009 teaches the system presents found objects, as well as objects that are similar to the found objects, and allows the user to select one or more of the presented objects. The system harmonizes display aspects of the user-selected objects and generates a new document from them. Paragraph 0056 teaches the index contains a normalized copy of each object. The normalized copy is generic, in that it includes the semantic contents of the object, such as text of a paragraph, but the normalized copy does not include display aspects, such as font, color or type size. Paragraph 0149, Figure 1, and Figure 9 teach the document generator 914 fetches the normalized versions of the user-selected objects from the index database 108 and applies a single format, including font, type size, bolding, orientation, color, etc., to the normalized objects as the objects are being placed into the new document 916, thereby generating the new document 916 with a uniform format. 
The examiner interprets the user-selected objects and the objects that are similar to the found objects to be the claimed first set of objects including one or more objects selected by a user. The newly generated document is interpreted to be the claimed first document, and the normalized version of the object, which is generic and includes semantic content and is applied to the newly created document, is interpreted to define the content of the first document as claimed].

As to claim 26:
Gorbansky discloses:
The method of claim 1, comprising: 
tracking, while the first document is created or edited, which objects from the metadata repository used to create or edit the first document [Paragraph 0008 teaches a user can query the document curation system, and the system accesses the index to fulfill the query. Paragraph 0016 teaches the index database is configured to store information about individual objects. Paragraph 0017 teaches an indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0092 teaches that, as users perform queries searching for objects and select objects for newly created documents, the indexing portion 100 of the document curation system may keep track of authors of documents whose objects are selected]; and storing the first metadata identifying the objects used to create or edit the first document in a data repository in association with the first document [Paragraph 0063, Figure 1, and Figure 10 teach at 1000, the indexing portion 100 calls the API 104 (FIG. 1) of the document storage system 102 to request notification of newly-created documents. At 1002, the indexing portion 100 receives notification of a newly-created document. Paragraph 0066 teaches at 1012 the indexing portion 100 identifies object metadata related to the object and stored in the document or elsewhere in the document storage system 102. Paragraph 0126 teaches FIG. 16 is a flowchart illustrating operations performed by the indexer 120. At 1600, the indexer 120 stores the normalized version of the identified object in the index database 108 for each of the objects identified by the document analyzer 110. At 1602, the indexer 120 stores the hash value. At 1604, the indexer 120 stores the relevance score. At 1606, the indexer 120 stores the metadata. 
Paragraph 0103 teaches a metadata generator 118 is configured to generate metadata about each object identified by the document analyzer 110. Metadata includes information sufficient to fetch the object from the document storage system 102. This information may include, for example, a path, for example device, directory, file name and file type. Paragraph 0107 teaches the metadata may also include usage data for objects and documents, such as number of times an object has been returned in response to a query, 
The examiner interprets the object metadata stored in the index database used to create the newly created document stored in the index database to be the claimed storing the first metadata identifying the objects used to create or edit the first document in a data repository in association with the first document. The index database is interpreted to be the claimed data repository. The metadata stored in the database that includes file names and a path is interpreted to be metadata identifying objects used to create the document, and the newly created document is interpreted to be the claimed first document.]; wherein accessing the first metadata comprises accessing the first metadata from the data repository [Paragraph 0017 teaches an indexer stores the normalized version of the identified object, the hash value, the relevance score and the metadata in the index database for each of a plurality of objects identified by the document analyzer. Paragraph 0062 teaches for each object, the indexing portion 100 of the document curation system automatically identifies metadata, assigns the object a relevance score and indexes the objects to support future searches. The examiner interprets indexing the objects to support future searches to be the claimed accessing the first metadata comprises accessing the first metadata from the data repository. Searches that require the support of metadata are interpreted to be the claimed accessing the first metadata, and the index used to store the metadata is interpreted to be the claimed accessing the first metadata from the data repository, wherein the index is interpreted to be the claimed data repository].

As to claim 27:
Gorbansky discloses:
The method of claim 1, wherein the multiple different categories of objects include at least one of datasets, attributes, metrics, filters, prompts, charts, or graphs [Paragraph 0013 teaches hash calculator automatically calculates a hash value based on each identified object. Paragraph 0056 teaches hash value for each generic object, thereby enabling the document curation system to easily identify content-wise identical objects, even if the objects would be displayed differently. Paragraph 0058 teaches each such document contains at least one object, such as a paragraph, slide, chart, graph, image, etc. Paragraph 0085 teaches a hash calculator 114 is configured to calculate a hash value for each object identified by the document analyzer 110. The hash value is a numeric value. Hash values may be stored in any suitable format, such as unsigned longwords, hexadecimal or encoded as alphanumeric strings. The hash calculator 114 calculates the hash value based on contents of the object after it has been normalized. Therefore, objects that may be rendered differently according to their native object models, yet contain identical semantic content, have identical hash values. Paragraph 0089 teaches various object types, such as text, graphs. Note: The examiner interprets objects of the paragraph, slide, chart, graph, or image object types to include the claimed charts or graphs. The hash values based on the content of objects of the chart or graph object types are interpreted to be the claimed categories of objects, wherein the hash values are the categories and the content used to calculate the hash values is from graphs or charts.]
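
For illustration only, the hash-based identification of content-wise identical objects described in Gorbansky Paragraphs 0056 and 0085 can be sketched as follows. This is a minimal sketch and is not code from Gorbansky: the function names, the dictionary layout of an object, the whitespace-only normalization, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def content_hash(obj: Dict[str, str]) -> str:
    """Hash only the semantic content of an object, ignoring display aspects."""
    # Assumption: the "text" field carries the semantic content; fields such
    # as "font" are display aspects and are deliberately left out of the hash.
    normalized = " ".join(obj["text"].split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def count_identical(objects: List[Dict[str, str]]) -> Counter:
    """Count how many objects share each content hash."""
    return Counter(content_hash(o) for o in objects)
```

Two objects with the same text but different fonts receive the same hash value in this sketch, so they are counted together.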

As to claim 28:
Gorbansky discloses:
The method of claim 1, comprising receiving, over the computer network, data indicating user selection of the first document using a user interface of the client device; and wherein the data indicating the selected subset of the second documents is provided to the client device over a computer network in response to receiving the data indicating the user selection of the first document 

As to claim 29:
Gorbansky discloses:
The method of claim 1, wherein the metadata repository is a shared, centralized metadata repository shared by a plurality of users in an enterprise computing system [Paragraph 0055 teaches the document curation system acts as an intermediary between users and document storage systems and/or document management systems (collectively “document storage systems”). A user can query the document curation system, and the system accesses an index, which stores normalized versions of , the first document and the second documents each being generated from the objects defined in the shared, centralized metadata repository, the first set of objects and the second sets of objects being sets of objects defined in the shared, centralized metadata repository [Paragraph 0062 and Figure 1 teach the indexing portion 100 of the document curation system automatically analyzes documents in the document storage system 102, including automatically identifying objects within the documents, according to the documents' native object models. The indexing portion 100 parses the document to identify objects in the document. For each object, the indexing portion 100 of the document curation system automatically identifies metadata, assigns the object a relevance score and indexes the objects to support future searches. The relevance score is based on information such as who generated the document or object, how many times the object (or a similar object) appears in other documents, and how often or how many times the object has been referenced (found in a search or included in a new composite document). Paragraph 0062 also teaches the document curation system stores a normalized version of each found object in an index database 108, as well as a pointer to the source document in the document storage system 102 where the object was found, so the source document can later be fetched. 
Note: The examiner interprets fetching source documents to read on the claimed generating the first document and the second document, and the index database is interpreted to read on the claimed centralized metadata repository. The object used to count the number of times that object appears in other documents is interpreted to be the claimed first set of objects, and the objects that appear in other documents are interpreted to be the second set of objects. The document containing the object used to count the number of times that object appears in  .

Claims 2, 4, 10, 12, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Gorbansky et al. (US 20160098405 A1) hereinafter Gorbansky, in view of Baras et al. (U.S. Patent No.: US 8285734 B1) hereinafter Baras, in view of Shukla et al. (US Patent No.: US 9,378,432 B2) hereinafter Shukla, and further in view of Paled et al. (US Publication No.: US 20090063470 A1) hereinafter Paled.
As to claim 2:
Gorbansky, Baras, and Shukla disclose all of the limitations as set forth in claim 1 but do not appear to expressly disclose the method of claim 1, comprising: determining, based on the first metadata, first identifiers for the objects in the first set of objects; and determining, based on the second metadata, second identifiers for the objects in the second set of objects that are included in or were used to create content of a particular document in the second set of documents; wherein generating the similarity scores comprises determining a similarity score for the particular document based on identifying one or more matches between the first identifiers and the second identifiers.
Paled discloses:
The method of claim 1, comprising: 
determining, based on the first metadata, first identifiers for the objects in the first set of objects [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0045 teaches tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store document metadata, containing the 
The examiner interprets metadata containing tagged information based on identifiers of business objects in a document, wherein the business object is chosen from a list of business objects held in a repository, to be the claimed first identifiers (tags) based on metadata for objects in the first set of objects. The business object chosen from the list of business objects is interpreted to be the first set of objects, the tags for that business object are interpreted to be the first set of identifiers, and the metadata containing tag information for the document containing the business object is interpreted to be the claimed first metadata.]  and  Application No. : 15/838,700 Filed: December 12, 2017 Page: 4of12 
determining, based on the second metadata, second identifiers for the objects in the second set of objects that are included in or were used to create content of a particular document in the set of multiple second documents [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0045 teaches tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store 
The examiner interprets metadata containing tagged information based on identifiers of business objects in a document, wherein the business object is chosen from a list of business objects held in a repository for a search query resulting in the return of documents with the highest document scores, to be the claimed second identifiers (tags) based on metadata for objects in the second set of objects. The business object chosen from the list of business objects is interpreted to be the first set of objects, and the occurring instances of the first set of business objects are interpreted to be the claimed second set of objects that are included in or were used to create content of a particular document in the second set of documents, wherein the documents containing the instances of business objects are interpreted to be the claimed set of multiple second documents. The tags for the instances of business objects are interpreted to be the second set of identifiers, and the metadata containing tag information for the document containing the instances of the business object is interpreted to be the claimed second metadata.]; 
wherein generating the similarity scores comprises determining a similarity score for the particular document based on identifying one or more matches between the first identifiers and the second identifiers [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 
The examiner interprets using the indices of tagged data objects for searching, wherein a searcher scores documents based on data object instances occurring in a given document and returns all documents with the highest scores or scores above a threshold, to be the claimed generating the similarity scores comprising determining a similarity score for a particular document based on identifying one or more matches between the first identifiers and the second identifiers. The tags are interpreted to be the claimed first and second identifiers, the document score is interpreted to be the claimed similarity score, a given document is interpreted to be the particular document, and contributing to the score by identifying each instance of the selected business object in a given document is interpreted to be the claimed identifying one or more matches.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating identifying and tagging business objects to determine a score for documents based on data object instances in other documents, as taught by Paled (see Paragraph 0034, 0042, 0045, 0049, 0051), because all four applications are directed to identifying 

As to claim 4:
Gorbansky, Baras, and Shukla disclose all of the limitations as set forth in claim 1 but do not appear to expressly disclose the method of claim 1, wherein accessing the metadata indicating the elements of the first document comprises, determining first objects are referenced by the first document, determining that the first objects depend on additional objects that are not referenced by the first document, and determining, as the first set of objects, a combined set of objects that includes the first objects and the additional objects. 
Paled discloses:
The method of claim 1, wherein accessing the metadata indicating the elements of the first document comprises: 
determining first objects are referenced by the first document [Paragraph 0016 teaches business objects may be referred to in documents by partial names, such as the first name or nickname of a person, an abbreviation of a product, or an organization name without the usual prefix or suffix. Paragraph 0034 teaches business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20.]; 
determining that the first objects depend on additional objects that are not referenced by the first document [Paragraph 0061 teaches business objects are complex objects, each business object may contain other business objects, or be a member in another business object, or have another relation with a second business object. Paragraph 0062 teaches for example, if John Adams, director of Technical Services of Alcatel, is mentioned in an e-mail, then this e-mail may be tagged as having a reference to "Alcatel" (albeit with a relatively low score), even if Alcatel is not mentioned at all within the mail. The examiner interprets the email to be the claimed first document]; and 
determining, as the first set of objects, a combined set of objects that includes the first objects and the additional objects [Paragraph 0045 teaches Classifier 34 classifies each document 56 according to the business objects that it has tagged in and with respect to the document, and stores the results in a classification repository 66 (which may be part of repository 36). tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store document metadata, containing the tag information for the document and pointing to the location of the document in system 20. Each instance is thus saved and later retrieved by the document ID of the document in which it was found. Paragraph 0061 teaches relations are also processed by the business object analyzer, for subsequent consideration in classifying and tagging documents 56. In other words, when the business object analyzer deals with a given business object, it also identifies and marks the related business objects, as indicated by the organizational repository. Paragraph 0062 teaches for example, if John Adams, director of Technical Services of Alcatel, is mentioned in an e-mail, then this e-mail may be tagged as having a reference to "Alcatel" (albeit with a relatively low score), even if Alcatel is not mentioned at all within the mail. The examiner interprets the email to be the claimed first document.
The examiner interprets tagged references with an associated document ID in which those references were found to be the claimed combined set of objects that includes the first objects and the 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating identifying and tagging business objects that are referenced in a document and identifying and tagging business objects that are not referenced in a document, as taught by Paled (see Paragraph 0016, 0034, 0045, 0061, 0062), because all four applications are directed to identifying and processing objects included in documents; configuring the document curation system to identify and tag business objects that are referenced in a document and to identify and tag business objects that are not referenced in a document improves the efficiency of analyzing a set of data objects in a data repository of an organization, and of using these data objects in tagging, classifying and then searching a corpus of data (see Paled Paragraph 0004).
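
For illustration only, the limitation of claim 4 addressed above (forming, as the first set of objects, a combined set that includes the objects referenced by the first document and the additional objects on which they depend) can be sketched as follows. This is a minimal sketch and is not code from Paled or the application; the function name, the dependency mapping, and the object identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set


def combined_object_set(referenced: Iterable[str],
                        depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the referenced objects plus every object they depend on."""
    combined: Set[str] = set()
    pending = list(referenced)
    while pending:
        obj = pending.pop()
        if obj in combined:
            # Already visited; this also guards against circular dependencies.
            continue
        combined.add(obj)
        # Queue the objects that this object depends on, if any are recorded.
        pending.extend(depends_on.get(obj, set()))
    return combined
```

In this sketch, a document that references only a metric would still be associated with the attribute and data set underlying that metric, provided those dependencies are recorded in the mapping.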

As to claim 10:
Gorbansky, Baras, and Shukla disclose all of the limitations as set forth in claim 9 but do not appear to expressly disclose the system of claim 9, wherein the operations comprise: determining, based on the first metadata, first identifiers for the objects in the first set of objects; and determining, based on the second metadata, second identifiers for the objects in the second set of objects that are included in or were used to create content of a particular document in the second set of documents; wherein generating the similarity scores comprises determining a similarity score for the particular document based on identifying one or more matches between the first identifiers and the second identifiers.
Paled discloses:
The system of claim 9, wherein the operations comprise: 
determining, based on the first metadata, first identifiers for the objects in the first set of objects [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0045 teaches tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store document metadata, containing the tag information for the document and pointing to the location of the document in system 20. Paragraph 0049 teaches as part of the query, the user specifies one or more business objects, at an object specification step 72. These objects may be chosen by the user from a list of the objects held in repository 36. Paragraph 0051 teaches a searcher 40 scores the documents in the repository or repositories of system 20 according to the search query, at a document scoring step 78. Typically, this stage in the process uses the indices of business objects and, if appropriate, keywords that have been stored in repository 36. For each business object in the query, each instance occurring in a given document contributes to the score of that document. The searcher returns a certain number of the documents that had the highest scores, or all documents with scores above some threshold.
The examiner interprets metadata containing tagged information based on identifiers of business objects in a document, wherein that business object is chosen from a list of business objects held in a repository, to be the claimed first identifiers (tags) based on metadata for objects in the first set of objects. The business object chosen from a list of business objects is interpreted to be the first set of objects, the tags for that business object are interpreted to be the first set of identifiers, and metadata containing tag information for the document containing the business object is interpreted to be the claimed first metadata.]; and
determining, based on the second metadata, second identifiers for the objects in the second set of objects that are included in or were used to create content of a particular document in the set multiple second documents [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0045 teaches tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store document metadata, containing the tag information for the document and pointing to the location of the document in system 20. Paragraph 0049 teaches as part of the query, the user specifies one or more business objects, at an object specification step 72. These objects may be chosen by the user from a list of the objects held in repository 36. Paragraph 0051 teaches a searcher 40 scores the documents in the repository or repositories of system 20 according to the search query, at a document scoring step 78. Typically, this stage in the process uses the indices of business objects and, if appropriate, keywords that have been stored in repository 36. For each business object in the query, each instance occurring in a given document contributes to the score of that document. The searcher returns a certain number of the documents that had the highest scores, or all documents with scores above some threshold.
The examiner interprets metadata containing tagged information based on identifiers of business objects in a document, wherein that business object is chosen from a list of business objects held in a repository for a search query resulting in the return of documents with the highest document score, to be the claimed second identifiers (tags) based on metadata for objects in the second set of objects. The business object chosen from a list of business objects is interpreted to be the first set of objects and the occurring instances of the first set of business objects are interpreted to be the claimed second set of objects that are included in or were used to create content of a particular document in the ; 
wherein generating the similarity scores comprises determining a similarity score for the particular document based on identifying one or more matches between the first identifiers and the second identifiers [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0047 teaches search is performed by searcher 40 after the documents have been tagged and indexed. Paragraph 0051 teaches a searcher 40 scores the documents in the repository or repositories of system 20 according to the search query, at a document scoring step 78. Typically, this stage in the process uses the indices of business objects and, if appropriate, keywords that have been stored in repository 36. For each business object in the query, each instance occurring in a given document contributes to the score of that document. The searcher returns a certain number of the documents that had the highest scores, or all documents with scores above some threshold.
The examiner interprets the use of indices with tagged data objects for searching, wherein a searcher provides a score for documents based on data object instances occurring in a given document and returns all documents with the highest scores or scores above a threshold, to be the claimed generating of similarity scores comprising determining a similarity score for a particular document based on identifying one or more matches between the first identifiers and the second identifiers. The tags are interpreted to be the claimed first and second identifiers, document score is interpreted to be the .]
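For illustration only, the scoring behavior cited from Paled Paragraph 0051 (each instance of a queried business object occurring in a document contributes to that document's score, and the searcher returns the highest-scoring documents or those above a threshold) can be sketched as follows. This is a minimal sketch and is not taken from Paled: the function name `score_documents`, the layout of the tag index, and the contribution of exactly one point per instance are all assumptions made for the example.

```python
from collections import Counter


def score_documents(tag_index, query_objects, top_n=None, threshold=None):
    """Rank documents by the number of tagged instances of the queried objects.

    tag_index: dict mapping a document ID to the list of business-object tags
        found in that document (hypothetical layout, assumed for this sketch).
    query_objects: business objects named in the search query.
    top_n / threshold: optional cut-offs mirroring "a certain number of the
        documents that had the highest scores, or all documents with scores
        above some threshold".
    """
    query = set(query_objects)
    scores = {}
    for doc_id, tags in tag_index.items():
        counts = Counter(tags)
        # Assumption: every matching instance contributes 1 to the score.
        scores[doc_id] = sum(counts[obj] for obj in query)

    # Highest score first; ties broken by document ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    if threshold is not None:
        ranked = [(doc, score) for doc, score in ranked if score > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked
```

Under these assumptions, a document tagged twice with a queried object ranks above a document tagged once, and a document with no matching tags is dropped whenever a threshold of zero is applied.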
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating identifying and tagging business objects to determine a score for documents based on data object instances in other documents, as taught by Paled (see Paragraph 0034, 0042, 0045, 0049, 0051), because all four applications are directed to identifying and processing objects included in documents; configuring the document curation system to identify and tag business objects to determine a score for documents based on data object instances in other documents improves the efficiency for analyzing a set of data objects in a data repository of an organization, and using these data objects in tagging, classifying and then searching a corpus of data. (see Paled Paragraph 0004).

As to claim 12:
Gorbansky, Baras, and Shukla disclose all of the limitations as set forth in claim 9 but do not appear to expressly disclose the system of claim 9, wherein accessing the metadata indicating the elements of the first document comprises: determining first objects are referenced by the first document; determining that the first objects depend on additional objects that are not referenced by the first document; and determining, as the first set of objects, a combined set of objects that includes the first objects and the additional objects.
Paled discloses:
The system of claim 9, wherein accessing the metadata indicating the elements of the first document comprises: 
determining first objects are referenced by the first document [Paragraph 0016 teaches business objects may be referred to in documents by partial names, such as the first name or nickname of a person, an abbreviation of a product, or an organization name without the usual prefix or suffix. Paragraph 0034 teaches business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20.]; 
determining that the first objects depend on additional objects that are not referenced by the first document [Paragraph 0061 teaches business objects are complex objects, each business object may contain other business objects, or be a member in another business object, or have another relation with a second business object. Paragraph 0062 teaches for example, if John Adams, director of Technical Services of Alcatel, is mentioned in an e-mail, then this e-mail may be tagged as having a reference to "Alcatel" (albeit with a relatively low score), even if Alcatel is not mentioned at all within the mail. The examiner interprets the email to be the claimed first document]; and 
determining, as the first set of objects, a combined set of objects that includes the first objects and the additional objects [Paragraph 0045 teaches Classifier 34 classifies each document 56 according to the business objects that it has tagged in and with respect to the document, and stores the results in a classification repository 66 (which may be part of repository 36). tagged document itself may be stored in a document repository 68 (which may also be part of repository 36). Rather than storing the entire document, however, it may be sufficient for the classifier to store document metadata, containing the tag information for the document and pointing to the location of the document in system 20. Each instance is thus saved and later retrieved by the document ID of the document in which it was found. 
The examiner interprets tagged references with an associated document ID in which those references were found to be the claimed combined set of objects that includes the first objects and the additional objects. The tagged objects, including related business objects not mentioned within a document, wherein the document ID references a document, are interpreted to be the claimed first set of objects and the additional objects].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating identifying and tagging business objects that are referenced in a document and identifying and tagging business objects that are not referenced in a document, as taught by Paled (see Paragraph 0016, 0034, 0045, 0061, 0062), because all four applications are directed to identifying and processing objects included in documents; configuring the document curation system to identify and tag business objects that are referenced in a document and to identify and tag business objects that are not referenced in a document improves the efficiency for analyzing a set of data objects in a data repository of an organization, and using these data objects in tagging, classifying and then searching a corpus of data (see Paled Paragraph 0004).

As to claim 25:

Paled discloses:
The method of claim 1, wherein generating the similarity scores comprises determining a similarity score for a particular document based on an amount of matches between (i) identifiers for objects used to generate displayable elements of the first document and (ii) identifiers for objects used to generate displayable elements of the second document [Paragraph 0034 and Figure 2 teach business object analyzer function 52 of classifier 34 uses the information provided by crawler 38 and comparer function 50 in building, for each business object, a set of identifiers, including variants, that will serve as the basis for tagging instances of the business object in the documents in system 20. Paragraph 0047 teaches search is performed by searcher 40 after the documents have been tagged and indexed. Paragraph 0051 teaches a searcher 40 scores the documents in the repository or repositories of system 20 according to the search query, at a document scoring step 78. Typically, this stage in the process uses the indices of business objects and, if appropriate, keywords that have been stored in repository 36. For each business object in the query, each instance occurring in a given document contributes to the score of that document. The searcher returns a certain number of the documents that had the highest scores, or all documents with scores above some threshold.
The examiner interprets the use of indices with tagged data objects for searching, wherein a searcher provides a score for documents based on data object instances occurring in a given document and returns all documents with the highest scores or scores above a threshold, to be the claimed generating of similarity scores comprising determining a similarity score for a particular document based on .]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating identifying and tagging business objects to determine a score for documents based on data object instances in other documents, as taught by Paled (see Paragraph 0034, 0042, 0045, 0049, 0051), because all four applications are directed to identifying and processing objects included in documents; configuring the document curation system to identify and tag business objects to determine a score for documents based on data object instances in other documents improves the efficiency for analyzing a set of data objects in a data repository of an organization, and using these data objects in tagging, classifying and then searching a corpus of data (see Paled Paragraph 0004).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Gorbansky et al. (US 20160098405 A1) hereinafter Gorbansky, in view of Baras et al. (U.S. Patent No.: US  8285734 B1) hereinafter Baras, in view of Shukla et al. (US Patent No.: US 9,378,432 B2) hereinafter Shukla, and further in view of Garg et al. (US Publication No.: US 20070133067 A1) hereinafter Garg.
As to claim 5:
Gorbansky discloses:
The method of claim 1, wherein the metadata repository includes object definitions for objects of multiple different object types [Paragraph 0079 teaches table entry may contain descriptions of object types supported by the object model, i.e., object types that may be found in the document. The examiner interprets the disclosed descriptions of object types to be the claimed definitions for objects of multiple different object types.], 

Gorbansky, Baras, and Shukla disclose all the limitations as set forth in the rejection of claim 1 and some of claim 5, but do not appear to expressly disclose wherein generating the similarity scores comprises determining a similarity score for a particular document of the second documents based on determining that the first document and the particular document each reference objects of a same object type.
Garg discloses:
wherein generating the similarity scores comprises determining a similarity score for a particular document of the second documents based on determining that the first document and the particular document each reference objects of a same object type [Paragraph 0001 teaches electronic documents may comprise a compilation of data that may be represented in one or more data files. Paragraph 0037 teaches pages having common objects may be ranked with respect to each other. This may comprise determining the `affinity` between and/or among pages. Affinity, in this context, may comprise a measure of the number of and/or weighted total of common objects included in two or more pages. Affinity may be determined by weighting the common objects based at least in part on the type of object, such as by assigning greater weight for particular objects, for example, and/or may comprise a number of common objects. In one example, graphics and/or text may be assigned a differing weight than other objects. Therefore, similar occurrences of these types of objects between and/or among two or more pages may be assigned a greater weight, and thus the pages having common 
The examiner interprets the ranking of pages having common objects with respect to each other using an affinity measure to be the claimed similarity score for a particular document of the second documents based on determining that the first document and the particular document each reference objects of a same object type. Determining the affinity by weighting common objects based on the type of object, such that similar occurrences of these types of objects between and/or among two or more pages may be assigned a greater weight and the pages having such common objects will have a greater affinity with respect to each other than two other pages that may not include these objects, is interpreted to be the claimed determining a similarity score for a particular document of the second documents based on determining that the first document and the particular document each reference objects of a same object type. The two or more pages are interpreted to reasonably include two or more documents; therefore, the two or more pages are interpreted to be the claimed first document and the particular document. A page of an electronic document is interpreted to be a file, and a file is interpreted to be a document; the examiner thus reasonably interprets pages to be separate files and ultimately separate documents.]
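For illustration only, the affinity measure cited from Garg Paragraph 0037 (a weighted total of the objects common to two pages, with the weight depending on the type of object) can be sketched as follows. This is a minimal sketch and not Garg's implementation: the function name `affinity`, the representation of a page as a mapping from object ID to object type, and the default weight of 1.0 are assumptions made for the example.

```python
def affinity(page_a, page_b, type_weights=None, default_weight=1.0):
    """Weighted total of the objects that two pages have in common.

    page_a, page_b: dicts mapping an object ID to its object type
        (hypothetical representation, assumed for this sketch).
    type_weights: optional dict mapping an object type to its weight; types
        not listed fall back to default_weight, so with no weights supplied
        the result is simply the number of common objects.
    """
    type_weights = type_weights or {}
    common = set(page_a) & set(page_b)
    # Each common object contributes the weight assigned to its type.
    return sum(type_weights.get(page_a[obj], default_weight) for obj in common)
```

With no weights supplied, the function reduces to a count of common objects; assigning a greater weight to a type such as graphics makes pages that share graphics score a greater affinity than pages that share the same number of other objects.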
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating an affinity measure used to represent two files (or documents) referencing common object types, as taught by Garg (see Paragraph 0037), because the four applications are directed to identifying and processing objects included in documents; configuring the document curation system to incorporate an affinity measure used to represent two files of an electronic document referencing common object types is desirable in forming a master page for an electronic document. (see Garg Paragraph 0010).

Claims 6, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Gorbansky et al. (US 20160098405 A1) hereinafter Gorbansky, in view of Baras et al. (U.S. Patent No.: US  8285734 B1) hereinafter Baras, in view of Shukla et al. (US Patent No.: US 9,378,432 B2) hereinafter Shukla, and further in view of Okamoto et al. (U.S. Publication No. 2009/0248678) hereinafter Okamoto.
As to claim 6:
Gorbansky, Baras, and Shukla disclose all the limitations as set forth in the rejection of claim 1 but do not appear to expressly disclose the method of claim 1, further comprising identifying the first document, comprising at least one of: receiving data indicating user input to select the first document; receiving data indicating that a user accessed the first document; receiving data indicating an end of the first document is reached; or determining that the first document is in a document collection of the first user.
Okamoto discloses:
The method of claim 1, further comprising identifying the first document, comprising at least one of: 
receiving data indicating user input to select the first document; receiving data indicating that a user accessed the first document; receiving data indicating an end of the first document is reached; or determining that the first document is in a document collection of the first user [Paragraph 0063 teaches input for processing within the cluster-of-interest extraction unit and recommended document extraction unit to be of a document set that contains a user’s browsing result documents. The examiner interprets browsing documents or historical browsing data to be the user accessing a document.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating a document set of a user’s browsing result 


As to claim 14:
Gorbansky, Baras, and Shukla disclose all the limitations as set forth in the rejection of claim 1 but do not appear to expressly disclose the system of claim 9, wherein the operations further comprise identifying the first document, comprising at least one of: receiving data indicating user input to select the first document; receiving data indicating that a user accessed the first document; receiving data indicating an end of the first document is reached; or determining that the first document is in a document collection of the first user.
Okamoto discloses:
identifying the first document, comprises at least one of:
receiving data indicating user input to select the first document; receiving data indicating that a user accessed the first document; receiving data indicating an end of the first document is reached; or determining that the first document is in a document collection of the first user [Paragraph 0063 teaches input for processing within the cluster-of-interest extraction unit and recommended document extraction unit to be of a document set that contains a user’s browsing result documents. 


As to claim 16:
Gorbansky, Baras, and Shukla disclose all the limitations as set forth in the rejection of claim 1 but do not appear to expressly disclose the system of claim 9, wherein providing the data indicating the selected subset of the second documents to a client device is performed in response to receiving data indicating that a user of the client device selected the first document.
Okamoto discloses:
The system of claim 9, wherein providing the data indicating the selected subset of the second documents to a client device is performed in response to receiving data indicating that a user of the client device selected the first document [FIG. 8 – Item S21-S23, Paragraph 0042, 0064, 0065, and 0066 teaches clustering documents into a first cluster of topic and a second cluster of sub-topics. The input for processing within the cluster-of-interest extraction unit and recommended document extraction unit to be a document set that contains a user’s browsing bookmark result list. The cluster-of-interest 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Gorbansky, Baras, and Shukla, by incorporating an extraction of the recommended document as an indication that the user browsed and bookmarked the document, as taught by Okamoto [FIG. 8 – Item S21-S23, Paragraph 0042, 0064, 0065, and 0066], because all four applications are directed to document comparison and analysis using metadata; configuring the document curation system to display the selected document from the comparison process with Level-1 or Level-2 clustering via the presentation unit, as taught by Okamoto, provides a means to present the identified document via a web page displayed on the client in response to the user. This ensures that the recommendation is not unwarranted and effectively detects and presents a document that is continued from a document browsed by a user in the past (see Okamoto Paragraph 0003).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Gorbansky et al. (US 20160098405 A1) hereinafter Gorbansky, in view of Baras et al. (U.S. Patent No.: US 8285734 B1) hereinafter Baras, in view of Shukla et al. (US Patent No.: US 9,378,432 B2) hereinafter Shukla, and further in view of Conrad et al. (U.S. Publication No. 2006/0041597) hereinafter Conrad.
As to claim 7:
Gorbansky, Baras, and Shukla disclose all the limitations as set forth in the rejection of claim 1 but do not appear to expressly disclose the method of claim 1, wherein generating the similarity 
Conrad discloses:
The method of claim 1, wherein generating the similarity scores comprises determining a similarity score for a particular document of the second documents based on data indicating a frequency of access of the second document [Paragraph 0087 teaches the use of access frequency to build an ordered list made up of duplicate selectable document citations. The examiner interprets the ordered list to indicate the use of the access frequency metadata associated with the document to provide an order or ranking of relevancy using an access frequency value, with the most relevant document based on access frequency being the first document on the list].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Gorbansky, by incorporating use of access frequency to build an ordered list made up of duplicate selectable document citations, as taught by Conrad (Paragraph 0087), because both applications are directed to document comparison and analysis using metadata; configuring the document curation system to factor in access frequency when building a list or document collection for similarity comparison increases the source of metadata objects used to make document comparisons, thereby providing an effectively generated set of similar or dissimilar documents and facilitating the identification and/or grouping of duplicate documents in search results, in accordance with user preferences (see Conrad Paragraph 0107).

Response to Arguments
The following is in response to Applicant’s arguments filed on January 15, 2021 remarks page 10:
 “Applicant submits that the claims as amended recite significantly more than the alleged abstract idea and are thus patent-eligible.”

Examiner respectfully presents the following response to Applicant’s amendments and remarks:
Applicant’s arguments have been fully considered but they are not persuasive. The examiner respectfully submits that the claims as amended do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “by the one or more computers” and “client device over a computer network” in claims 1 and 17, and “one or more non-transitory computer-readable media comprising instructions that, when executed by the one or more computer-readable media, cause the one or more computers to perform operations” in claim 17, are recited at a high level of generality to apply the exception using generic computer components. Accessing, by the one or more computers, first metadata identifying a first set of objects identified in a metadata repository that are included in, are referenced by, or were used to create a first document; accessing, by the one or more computers, second metadata identifying second sets of objects identified in the metadata repository, wherein the second sets of objects respectively are included in, are referenced by, or were used to create documents in a set of multiple second documents; and providing, by the one or more computers, data indicating the selected subset of the second documents to a client device over a computer network are interpreted to be well-understood, routine, and conventional activity (storing and retrieving information in memory, Versata (see MPEP 2106.05(d))). Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. To further elaborate, the additional limitations accessing, by the one or more 

The following is in response to Applicant’s arguments filed on January 15, 2021 remarks page 12:
“The Prior Art Does Not Render Obvious the Claimed Techniques for Generating Similarity Scores”

Examiner respectfully presents the following response to Applicant’s amendments and remarks:
Applicant’s arguments have been fully and respectfully considered, but are moot in view of new grounds of rejections as necessitated by the amendments.

The following is in response to Applicant’s arguments filed on January 15, 2021 remarks page 13:
“The Modification of Gorbansky in view of Shukla is Flawed and Unworkable”

Examiner respectfully presents the following response to Applicant’s amendments and remarks:
Applicant’s arguments have been fully considered but they are not persuasive. The examiner respectfully disagrees with the applicant’s arguments regarding claim 1’s newly amended recitation of "generating, for each similarity measure, a weighted similarity measure by applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure based on an amount of objects in a category of objects in common between the first document and second document is weighted using the weight value corresponding to the category of objects". Gorbansky’s disclosure of a document curation system that facilitates finding previously-created objects, such as text and charts, in electronic business documents, such as word processing documents and slide presentation files stored in documents of a separate document storage system (see Gorbansky Paragraph 0013, 0056, 0062, 0069, 0073, 0085, 0093, 0094, and 0101), and Shukla’s disclosure of hierarchy similarity measure techniques, in which categories in a hierarchy of categories are assigned to each of at least two objects and a similarity score may be calculated for the at least two objects that takes into account the categories assigned to the objects (see Shukla Column 2 Lines 62-66), sufficiently disclose the current claim language. The categories (e.g., 

The following is in response to Applicant’s arguments filed on January 15, 2021 remarks page 14:
 “Shukla Does Not Teach Applying Weights as Claimed”

Examiner respectfully presents the following response to Applicant’s amendments and remarks:
Applicant’s arguments have been fully considered but they are not persuasive. The examiner respectfully disagrees with the applicant’s arguments regarding claim 1’s newly amended recitation of "applying, to the similarity measure, a weight value corresponding to the category of objects, wherein each similarity measure based on an amount of objects in a category of objects in common between the first document and second document is weighted using the weight value corresponding to the category of objects". Gorbansky’s disclosure of a document curation system that facilitates finding previously-created objects, such as text and charts, in electronic business documents, such as word processing documents and slide presentation files stored in documents of a separate document storage system (see Gorbansky Paragraph 0062, 0069, 0073, 0093, 0094, and 0101), and Shukla’s disclosure of hierarchy similarity measure techniques, in which categories in a hierarchy of categories are assigned to each of at least two objects and a similarity score may be calculated for the at least two objects that takes into account the categories assigned to the objects (see Shukla Column 2 Lines 62-66), sufficiently disclose the current claim language. The categories (e.g., nodes) of the hierarchies may be assigned to each of the objects, such as by simply assigning one or more categories of a hierarchy that are related to an object and/or by weighting the categories in the hierarchy based on how relevant those categories are to the object (see Shukla Column 2 Lines 62-66). The cited weighting of the categories of objects is interpreted to be the claimed applying weights for the categories of objects. In the context of the cited reference, weighting the categories is interpreted to also include 

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EARL ELIAS whose telephone number is (571)272-9762.  The examiner can normally be reached on Monday - Friday (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed can be reached on 571-272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/E.E./               Examiner, Art Unit 2169                                                                                                                                                                                         
/USMAAN SAEED/               Supervisory Patent Examiner, Art Unit 2169