Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
	This action is issued in response to the amendment/RCE filed 8/2/2021.
	Claims 1-9, 11-24, and 61-62 were directly and/or indirectly amended. Claims 10 and 25-60 were canceled. No claims were added.
	Claims 1-9, 11-24, and 61-62 are pending.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 8/2/2021 has been entered.
 
Response to Arguments
Applicant's arguments filed 8/2/2021 have been fully considered but they are not persuasive. 
Applicant argues, regarding Claim 1, that Donnelly and Powers fail to teach or render obvious “receiving a plurality of data records, each data record of the plurality of data records comprising a plurality of data fields and a plurality of terms.”

Applicant argues, regarding Claim 1, that the applied art fails to disclose the claimed step of providing a summary of activity recited in amended Claim 1, specifically “a plurality of top terms associated with the entity retrieved from the plurality of data records.”
Examiner disagrees. As stated in the rejection below, Huang in view of Donnelly does not explicitly disclose the step of providing a summary of activity, and Powers discloses the step of providing a summary of activity as shown in Fig. 20, which is further described in Para. 0184, wherein the table presents sale details, which correspond to the summary of activity, since the “top term” is defined in Para. 0179 of the instant application's specification as a keyword associated with the entity.
	Applicant argues that Huang in view of Donnelly fails to cure the deficiencies of Powers, and that the Office action acknowledges that Huang fails to teach or render obvious the claimed step of providing a summary of activity.
	Examiner disagrees. As stated in the rejection below, Huang, in the combination of Huang in view of Donnelly, discloses a method of processing data in a document, wherein the document corresponds to a data record, the words in the document correspond to the plurality of terms, and the data vector representation corresponds to a field; the first and second data are compared to determine the similarity between two vectors, as shown in Col. 7, lines 32-48, which addresses the data processing. However, as stated in the rejection below, the combination does not generate or provide a summary of activity report. Powers cures the deficiency of the combination of Huang in view of Donnelly by 
	Applicant argues that Huang also fails to teach or render obvious at least “receiving a plurality of data records, each data record of the plurality of data records comprising a plurality of data fields and a plurality of terms.”
	Examiner disagrees. Huang discloses this limitation in Col. 7, lines 13-15, wherein the document comprises a plurality of words, as shown in Col. 7, lines 54-57, which corresponds to receiving a plurality of data records, each data record comprising a plurality of data fields.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11-24, and 61-62 are rejected under 35 U.S.C. 103 as being obvious over Huang et al. (Huang hereinafter), US Patent No. 9519859, filed Sep. 6, 2013, in view of Donnelly et al. (Donnelly hereinafter), US Patent Application Publication No. 20100094840, filed Oct. 14, 2008 and published April 15, 2010, and further in view of Powers et al. (Powers hereinafter), US Patent Application Publication No. 20140053070, filed May 31, 2013 and published Feb. 20, 2014.

Regarding Claims 1 and 19, Huang discloses a method, comprising:

interpreting at least two data records of the plurality of records (Col. 7, lines 50-61, Figs. 7-8, wherein “GOOD” and “CAT” correspond to two data records, and Col. 8, lines 26-31, wherein the 500k corresponds to a plurality of terms, Huang);
determining a plurality of n-grams from the plurality of terms of each of the at least two data records (Figs. 7-8, wherein the parsing of the data into a plurality of terms corresponds to a plurality of n-grams, and Col. 8, lines 26-33, Col. 7, lines 50-61, wherein the vectors correspond to data records, Huang);
mapping the plurality of n-grams to a corresponding plurality of mathematical vectors (Figs. 7-8: the mapping module maps N-grams to individual N-gram vectors for each word, Huang), which discloses the vectors. Huang is silent with respect to the vectors being mathematical vectors. On the other hand, Donnelly discloses a mathematical vector, as shown in Para. 82.
The claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Huang with the teachings of Donnelly to accurately identify the similarity among topics and measure the weight to determine the degree of similarity, as shown in Para. 81. Modification would have 
Furthermore, the combination of Huang in view of Donnelly discloses determining whether a similarity value between a first mathematical vector including a first term of the plurality of terms of a first one of the at least two data records and a second mathematical vector including a second term of the plurality of terms of a second one of the data records is greater than a threshold similarity value (Para. 136 and Para. 157, wherein the threshold cutoff corresponds to the “greater than” comparison, and wherein the first vector corresponds to the first data record and the second vector corresponds to the second data record, Donnelly); and
associating the first one of the at least two data records with the second one of the at least two data records in response to the similarity value exceeding the threshold similarity value (Col. 11, lines 1-3, “the similarity determination module 316 can then compare these two lower dimension vectors to determine the similarity between them”; a determination that the vectors are relatively similar is thus an association between the vectors, Huang). The combination of Huang in view of Donnelly discloses all the limitations as stated above.
However, Huang in view of Donnelly does not explicitly disclose providing a summary of activity for an entity in response to the associating, wherein the summary of activity comprises at least one of a number of shipments by the entity, quantities of the items shipped by the entity, value of items shipped by the entity, or a histogram of shipment data for the entity. On the other hand, Powers discloses a summary of activity comprising at least one of a number of shipments by the entity, quantities of the items shipped by the entity, and value of items shipped by the entity, as shown in Fig. 20, Para. 0184, wherein the “#prods” corresponds to items shipped, 
A skilled artisan would have been motivated to make such a modification to generate an accessible summary for a user to review the sale, as shown in Para. 0221 of Powers. Powers further discloses a characterization of shipment anomalies associated with the entity (Para. 0269, wherein the method of selecting the same product to two different locations corresponds to shipment anomalies associated with the entity, Powers), and a plurality of top terms associated with the entity retrieved from the plurality of data records (Fig. 21, Para. 0184, wherein the column identifier for a region such as “South” corresponds to a top term, since the “top term” is defined in Para. 0179 of the instant application's specification as a keyword associated with the entity, Powers).
Regarding Claim 2, Huang in view of Donnelly and further in view of Powers discloses a method further comprising determining that the first term is related to the second term in response to the similarity value exceeding the threshold similarity value (Para. 147, Donnelly).
Regarding Claim 3, Huang in view of Donnelly and further in view of Powers discloses a method further comprising determining that the first term is synonymous with the second term in response to the similarity value exceeding the threshold similarity value (Fig. 6, step 5, and Para. 5, Donnelly).
Regarding Claims 4 and 21, Huang in view of Donnelly and further in view of Powers discloses a method wherein the first term and the second term each correspond to an entity 
Regarding Claims 5 and 22, Huang in view of Donnelly and further in view of Powers discloses a method further comprising providing a catalog identifier and associating each of the first term and the second term to the catalog identifier (Fig. 6, step 2, wherein the text relevant ID corresponds to the catalog identifier, Donnelly).
Regarding Claim 6, Huang in view of Donnelly and further in view of Powers discloses a method wherein the catalog identifier matches at least one of the first term and the second term (Para. 162, Donnelly).
Regarding Claim 7, Huang in view of Donnelly and further in view of Powers discloses a method wherein the n-grams comprise an n value of at least two (Fig. 4, Col. 8, lines 20-31, Huang).
Regarding Claim 8, Huang in view of Donnelly and further in view of Powers discloses a method further comprising determining that a plurality of the data records correspond to a first entity, and wherein the determining the similarity value is further in response to the determining the records correspond to the first entity (Fig. 7, wherein “GOOD” corresponds to first entity, Huang).
Regarding Claims 9 and 20, Huang in view of Donnelly and further in view of Powers discloses a method further comprising determining that a first set of a plurality of the data records correspond to a first entity, and determining that a second set of the plurality of the data records correspond to a second entity, and wherein the determining the similarity value 
Regarding Claim 11, Huang in view of Donnelly and further in view of Powers discloses a method further comprising determining that the first term is related to the second term in response to the similarity value exceeding the threshold similarity value (Para. 163, Donnelly), wherein the first term and the second term each correspond to an entity identifier for the data records (Fig. 7 and Fig. 8, Huang), providing a catalog entity identifier, and associating each of the first term and the second term to the catalog entity identifier (Fig. 6, step 2, wherein the text relevant ID corresponds to the catalog identifier, Donnelly).
Regarding Claim 12, Huang in view of Donnelly and further in view of Powers discloses an apparatus, comprising:
a data access circuit structured to receive a plurality of data records and interpret at least two data records of the plurality of data records, each data record of the plurality of data records comprising a plurality of data fields (Para. 149, wherein each document has more than one data record, Donnelly);
a record parsing circuit structured to determine a plurality of n-grams from a plurality of terms of each of the at least two data records and to map the plurality of n-grams to a corresponding plurality of mathematical vectors (Col. 10, lines 13-26, Huang, and Para. 82, Donnelly);
a record association circuit structured to determine whether a similarity value between a first mathematical vector including a first term of the plurality of a first one of the at least two data records and a second mathematical vector including a second term of the second plurality 
Regarding Claim 13, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein the at least two data records include transactional records (Para. 149, Donnelly).
Regarding Claim 14, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein the transactional records include customs transaction records (Para. 208, 
Regarding Claim 15, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein at least one of the plurality of n-grams includes words from at least two distinct languages (Para. 99, wherein the English and foreign languages correspond to two distinct languages, Donnelly).
Regarding Claims 16, 23, 61, and 62, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein the first term includes a member selected from the group comprising a numeric value, an abbreviation, a term including jargon, an acronym, and an initialization (Para. 152, wherein the value corresponds to numeric value, Donnelly).
Regarding Claims 17 and 24, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein the data records include a plurality of fields, wherein at least one of the plurality of fields includes a shortened phrase, wherein the shortened phrase comprises at least one member selected from the group comprising a non-grammatical phrase, a phrase incorporating at least two distinct languages, an abbreviation, a term including jargon, an acronym, and an initialization (Para. 99, wherein the English and foreign languages correspond to two distinct languages, Donnelly).
Regarding Claim 18, Huang in view of Donnelly and further in view of Powers discloses an apparatus wherein the reporting circuit is structured to determine that the first term is related to the second term (Fig. 4c, Donnelly) in response to the similarity value exceeding the threshold similarity value (Col. 4, lines 52-53, Huang), wherein the first term and the second term each correspond to an entity identifier for the data records, to provide a catalog entity 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Psota et al. 20170091320 related to natural language processing for entity resolution.
Collins et al. 20070239707 related to method of searching text to find relevant content.
Bradford 20060265209 related to machine translation using vector space representations.


Point of Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SANA A AL-HASHEMI whose telephone number is (571)272-4013.  The examiner can normally be reached on 8:00 am-4:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at 571-272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/SANA A AL-HASHEMI/
Primary Examiner, Art Unit 2162
August 9, 2021