DETAILED ACTION
Applicant’s response, filed 05/13/2022, to the previous office action has been considered and made of record. Claims 1-20 are pending further consideration.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 05/31/2022 has been entered.
 
Response to Arguments
Applicant’s arguments, see page 9 of Applicant’s remarks received 05/13/2022, with respect to the teachings of Mudgal and the presently claimed limitation of independent claims 1, 8, and 15 drawn to:
 “Mudgal, however, fails to teach “the first type being different from the second type.” Instead, and as discussed above, Mudgal discusses that “entries in D and D’ follow the same representation (e.g., the same schema with attributes A1, …, AN in the case of structured data),” noting that “[e]ntity matching (EM) finds data instances referring to the same real-world entity, such as (Eric Smith, John Hopkins) and (E. Smith, JHU).”
are not persuasive.
Applicant’s arguments are not persuasive for at least the reason that, if D and D’ were the same, there would be no need to match the entities contained therein, because the same set of entities would inherently match between the two datasets. The “same schema” disclosed by Mudgal refers to the category or scheme of the entities’ material being the same. Hence it is the categories or schema, such as bills or office documents, that are the “same” or related between the entities, and not the actual datasets. For at least this reason, Applicant’s arguments are not convincing, and a prior art rejection of claims 1, 8, and 15 and their corresponding dependent claims, based at least on the teachings of Mudgal, is presented below.
Applicant’s arguments, see page 9 of Applicant’s remarks received 05/13/2022, with respect to the teachings of Mudgal and the presently claimed limitation of independent claims 1, 8, and 15 drawn to:
“Mudgal also fails to teach “providing a set of features based on entities in the set of entities of the first type, the set of features comprising one or more features expected to be included in entities in the set of entities of the second type,” much less “prior to matching entities …, filtering the set of entities of the second type based on the set of features.” Instead, and as discussed above, Mudgal, referring to blocking, discusses “filter[ing] the cross product DxD’ to a candidate set C that only includes pairs of entity mentions judged likely to be matches.” Mudgal, however, is absent further detail on blocking and is specifically absent features in the entities………”
are not persuasive.
Mudgal has disclosed matching between entities, wherein the entities comprise at least characters or words as the features by which the entities are determined to match. Furthermore, the details of blocking that are not set out in the disclosure of Mudgal have been addressed in the present Office action by the introduction of the reference Papadakis et al ("Comparative Analysis of Approximate Blocking Techniques for Entity Resolution"), which has disclosed the details of a blocking technique for entity matching. As set forth in the prior art combination of Mudgal and Papadakis presented below, the newly amended claim limitations argued above have been disclosed by the known prior art. For at least these reasons, claims 1, 8, and 15 have been rejected under 35 USC 103 based on the prior art teachings of Mudgal and Papadakis.

Applicant’s remaining arguments are drawn to the newly presented claim limitations and to the teachings of Mudgal relied upon in the previously presented 35 USC 102 and 35 USC 103 prior art rejections. As per the above discussion and the 35 USC 103 prior art rejections presented below, which utilize the newly introduced prior art reference of Papadakis et al ("Comparative Analysis of Approximate Blocking Techniques for Entity Resolution"), the currently presented amended claim set has been rejected in view of the teachings of the prior art.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-6, 8-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mudgal et al (“Deep Learning for Entity Matching: A Design Space Exploration”) in view of the teachings of Papadakis et al ("Comparative Analysis of Approximate Blocking Techniques for Entity Resolution").

With respect to Claims 1, 8, and 15: A computer-implemented method for matching entities in a machine learning (ML)-based inference system, the method being executed by one or more processors and comprising: [Mudgal (page 9, RHC 3rd para) has disclosed a computer device having at least a CPU and memory used to perform the disclosed method.]
receiving input data comprising a set of entities of a first type and a set of entities of a second type, [Mudgal (page 2 RHC section 2.1) has disclosed receiving two collections of entities D and D’.]
the first type being different from the second type; [Mudgal (page 2 RHC section 2.1) has disclosed receiving two collections of entities D and D’.]
providing a set of features based on entities in the set of entities of the first type [Mudgal (page 2 RHC final paragraph, page 5 LHC final 2 paragraphs, and Figure 3, wherein features can be “attributes” such as words or characters) has disclosed providing a first data set D and a second data set D’, each comprising a set of entities, wherein the entities of each are to be entity matched after blocking type filtering. In order to be matched, the first entity and second entity must correspond to datasets having matching features therebetween. Hence entities of D having features must exist in the set of entities of D’ having features in order to determine a candidate set C of matches.] the set of features comprising one or more features of entities in the set of entities of the first type that are expected to be included in entities in the set of entities of the second type; [Mudgal (page 5 LHC last 2 paragraphs) has disclosed the entities e1 and e2 having the same schema of attributes, wherein the attributes of one set of entity mentions correspond to those of the other set of entity mentions. Furthermore, as per the above discussion, the sets D and D’ are provided as input to the DL solution of section 3 of Mudgal, and D and D’ correspond to the sets of first and second entities that are blocking type filtered and then supplied to the DL “deep learning” process of Mudgal.]
prior to matching entities in the set of entities of the first type to entities in the set of entities of the second type, [Mudgal (page 2 RHC final para through 1st para page 3 LHC) has disclosed the filtering of the entities D and D’ to a candidate set C by “blocking” prior to the set of matching.]
filtering the set of entities of the second type based on the set of features [Mudgal (page 2 RHC final two paragraphs) has disclosed “blocking” type filtering of the D and D’ entities having the feature set A1, …, AN to provide a set of candidate entities of the two sets D and D’ likely to be matches] to provide a sub-set of entities of the second type, [Mudgal (page 2 RHC final para through 1st para page 3 LHC) has disclosed the filtering of the entities D and D’ to a candidate set C by “blocking” prior to the step of matching. The candidate set C is a subset including only pairs between entities of D and D’.]
the sub-set of entities of the second type comprising fewer entities than the set of entities of the second type, each entity in the sub-set of entities of the second type having at least one feature in the set of features and [Mudgal (page 2 RHC final two paragraphs) has disclosed, as part of the process of EM “entity matching”, the “blocking” type filtering of the second set of entities D’ to find all paired matches between D and D’, such that the candidate set C includes only pairs of entities that have matches. Hence the second set D’ of Mudgal is “blocking” filtered to remove entities not matching the features of the entities of the first set “D”. Mudgal has further disclosed (page 3, 2nd paragraph of the RHC) “The blocking step of EM has also received significant attention [9,59]. Our solutions in this paper assumed as input the output of a block procedure, and thus can work with any blocking procedure that has been proposed.”. Mudgal has disclosed generating a “sub-set of entities of the second type” by blocking filtering of the “entities of the second type” D’ as discussed above. By way of blocking filtering, the process of Mudgal creates a set D’ having fewer entities than the set D’ had prior to the removal of entities by the blocking filtering. Mudgal has disclosed that the entities of the sub-set have at least one feature of the set of features because the blocking-filtered entities must match the entities having said features in the first set of entities “D”. Mudgal has not further disclosed the details of the blocking filtering required by the following claim limitation, which requires that “each entity filtered from set of entities of the second type is absent any features in the set of features”.]
each entity filtered from set of entities of the second type is absent any features in the set of features; and [Papadakis (Abstract, page 686 “Block Cleaning” LHC) has disclosed cleaning/filtering the set of entities “blocks” to remove from the created filtered set those “blocks” entities that are “superfluous” and correspond to entities not having matching features (i.e., that do not match blocks from the sets being compared). Hence, Papadakis has disclosed blocking type filtering to produce a set of blocks “entities” that match the blocks of entities to which they are being compared, and that therefore contain at least one feature of the set of features in order to match between said blocks. Furthermore, the removal of blocks “entities” not having the features, and hence not having matching features, has been disclosed by Papadakis.]
[Papadakis and Mudgal are analogous art of “EM” entity matching having an initial step of blocking type filtering prior to the process of performing the matching of sets of entities. It would have been obvious to one of ordinary skill in the art to substitute the blocking type filtering of Mudgal with a known blocking type filtering technique of Papadakis that removes superfluous entities not corresponding to a match, and hence not having features/attributes that match between the entity mention datasets. The motivation for combining would have been to reduce the total number of blocks/entities to be entity matched between the datasets as disclosed by Papadakis (page 686 “Block Cleaning” of Papadakis), and furthermore to utilize a known blocking method as input to the entity matching process as detailed by the teachings of Mudgal (page 3 RHC 2nd paragraph). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to at least try to combine the teachings of Papadakis and Mudgal to achieve the set of presented claim limitations discussed above.]
generating an output by processing the set of entities of the first type and the sub-set of entities of the second type through a ML model, [Mudgal (page 2 RHC final 2 paragraphs through 2nd paragraph on page 3) has disclosed generating a set of matching data by performing a matching process using the entities D, D’, and candidate set C. Mudgal (page 3, 2nd through 3rd paragraph LHC) further discloses that the matching process is performed by a machine learning model.]
the output comprising a set of matching pairs, each matching pair in the set of matching pairs comprising an entity of the set of entities of the first type and at least one entity of the sub-set of entities of the second type. [Mudgal has disclosed a set of entities labeled as “match” and “no-match” corresponding to output sets e1 and e2, which correspond to D and D’; thus the matching determination outputs sets of entities from the sets D and D’ as e1 and e2 (page 3 LHC 2nd through 3rd paragraph).]
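
For illustration only, the blocking step discussed in the mapping above (reducing the cross product D x D’ to a candidate set C before any matching is performed) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the procedure of Mudgal or of Papadakis: the whitespace-token key, the function names `tokens` and `block_candidates`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def tokens(record):
    """Lowercased whitespace tokens of all attribute values (hypothetical 'features')."""
    return {tok for value in record.values() for tok in str(value).lower().split()}


def block_candidates(d, d_prime):
    """Return candidate set C: index pairs (i, j) whose records share at least one token."""
    # Inverted index from token to the positions in D' that contain it.
    index = defaultdict(set)
    for j, rec in enumerate(d_prime):
        for tok in tokens(rec):
            index[tok].add(j)

    candidates = set()
    for i, rec in enumerate(d):
        for tok in tokens(rec):
            for j in index.get(tok, ()):
                candidates.add((i, j))
    return candidates


# Hypothetical sample data; the first pair echoes the example quoted from Mudgal.
D = [{"name": "Eric Smith", "org": "John Hopkins"}]
D_PRIME = [{"name": "E. Smith", "org": "JHU"}, {"name": "Ann Lee", "org": "MIT"}]

C = block_candidates(D, D_PRIME)
print(C)
```

In this sketch only the pair sharing the token "smith" survives blocking, so a downstream matcher would score one pair rather than the full cross product of two.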

With respect to Claims 2, 9, and 16: The method of claim 1, wherein providing the set of features based on entities in the set of entities of the first type comprises
identifying features expected to be included in entities of the set of entities of the second type by processing entities of the set of entities of the first type through a classifier. [The set of entity data is processed to determine the similarity features s1, …, sN between the entities e1 and e2 (page 5 RHC final 5 paragraphs), and hence the set of features included in each set of entities.]

With respect to Claims 3, 10, and 17: The method of claim 2, wherein the classifier comprises
a ML algorithm that is trained based on correlations between one or more features of entities of the first type and one or more features of entities of the second type. [Correlation, as presently claimed, is a measure of interdependence between variables, wherein similarity as described by Mudgal is at least a subset of correlation that analyzes the variables to determine how similar each is, as opposed to just an “interdependence” in general. Mudgal has disclosed a determination of similarity between attributes/features of the entities, performed by the neural network machine learning device, such that entity similarity is determined based on attribute similarity (page 5 Figure 3, page 5 section 3 RHC “Attribute Similarity Representation Module: (2) Attribute Comparison”).]

With respect to Claims 4, 11, and 18: The method of claim 3, wherein correlations are provided by one or more of domain knowledge and statistical analysis. [Mudgal has disclosed a determination of similarity between attributes/features of the entities, performed by the neural network machine learning device, such that entity similarity is determined based on attribute similarity (page 5 Figure 3, page 5 section 3 RHC “Attribute Similarity Representation Module: (2) Attribute Comparison”). Furthermore, Mudgal (page 6 RHC section 3.3 “Aggregate Function”) discloses a statistical weighted average as the statistical analysis of similarity “correlation”.]
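
For illustration only, the arithmetic of a weighted average of the kind referred to above can be sketched as follows. The function name `weighted_average`, the per-attribute similarity scores, and the weights are hypothetical and are not values or code taken from Mudgal; the sketch shows only what a weighted average computes.

```python
def weighted_average(similarities, weights):
    """Weighted mean of per-attribute similarity scores (all inputs hypothetical)."""
    if len(similarities) != len(weights):
        raise ValueError("similarities and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(similarities, weights)) / total


# Two hypothetical attribute similarities, the first weighted three times the second.
score = weighted_average([1.0, 0.0], [3, 1])
print(score)
```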

With respect to Claims 5, 12, and 19: The method of claim 1, wherein the one or more features comprise categorical features. [Attributes that are embedded in the analyzed entities are of at least a “word” or “character” type category corresponding to words, letters, or models thereof (page 6 LHC “Word Level vs Character Level Embeddings”).]

With respect to Claims 6, 13, and 20: The method of claim 1, wherein filtering the set of entities of the second type based on the set of features to provide a sub-set of entities of the second type comprises,
for each entity in the set of entities of the second type, determining whether the entity includes at least one feature of the set of features, and removing the entity from the set of entities of the second type, if the entity is absent a feature of the set of features. [The set of entity data is processed to determine the similarity features s1, …, sN between the entities e1 and e2 (page 5 RHC final 5 paragraphs), and hence the set of features included in each set of entities. Hence the set of attributes “features” of each of the first and second sets is reduced to the set of features s1, …, sN corresponding to those entity features present in each of the first and second sets, and absent those not present in each of said sets.]
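
For illustration only, the per-entity filtering recited in the limitation above can be sketched as follows, reading the removal condition as "the entity has none of the features in the set". This is a minimal sketch under stated assumptions: each entity is assumed to carry a set of categorical feature labels, and the function name `filter_second_set`, the `features` key, and the sample invoice records are hypothetical and not drawn from the application or the cited references.

```python
def filter_second_set(second_set, feature_set):
    """Split entities into (kept, removed) by whether they share any feature with feature_set."""
    kept, removed = [], []
    for entity in second_set:
        # Keep the entity only if it includes at least one feature of the set.
        if entity["features"] & feature_set:
            kept.append(entity)
        else:
            removed.append(entity)
    return kept, removed


# Hypothetical feature set derived from entities of the first type.
FEATURES = {"currency:USD", "vendor:acme"}

# Hypothetical entities of the second type.
INVOICES = [
    {"id": "inv-1", "features": {"currency:USD", "vendor:globex"}},
    {"id": "inv-2", "features": {"currency:EUR", "vendor:initech"}},
    {"id": "inv-3", "features": {"vendor:acme"}},
]

kept, removed = filter_second_set(INVOICES, FEATURES)
print([e["id"] for e in kept], [e["id"] for e in removed])
```

Under these assumptions the kept sub-set is smaller than the input set, every kept entity has at least one feature of the set, and every removed entity has none.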

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Mudgal et al (“Deep Learning for Entity Matching: A Design Space Exploration”) in view of the teachings of Papadakis et al ("Comparative Analysis of Approximate Blocking Techniques for Entity Resolution") as applied to at least claim 1 above, and further in view of Copsey (US 2015/0278594).

With respect to Claims 7 and 14: The method of claim 1, [Mudgal, of Mudgal in view of Papadakis, has disclosed types of processed entities to be matched, such as articles, documents, or other entities containing text (page 4 LHC “Entity Linking”, page 5 RHC “Coreference Resolution”). Mudgal has not further disclosed that the “first type” or “second type” of the corresponding input image data or image data to be searched (corresponding to the “first type” and “second type” of the present claim) are of a bank statement and invoice type.] wherein the first type comprises bank statements and the second type comprises invoices. [Copsey (para 0003 and 0029) has disclosed receiving bank statements and invoices as types of input image document data to be captured and stored.]
[Copsey and Mudgal in view of Papadakis are analogous art of image data processing of document image data to store document type image data for further use. It would have been obvious to one of ordinary skill in the art to modify the set of document type image data capable of being processed by the system and method of image document processing of Mudgal and Papadakis to further include known types of document image data such as invoices and bank statements as disclosed by Copsey to perform the disclosed process of inputting and searching for invoice and bank statement type image data searching of Mudgal using known document image types. The motivation for combining would have been to utilize the method of image feature learning and image searching of document image data as disclosed by Mudgal using additional known types of document image type data such as invoices and bank statements as disclosed by Copsey to achieve the reasonably expected result of image document searching using image features of invoice and bank statement type image documents. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to at least try to combine the teachings of Copsey with Mudgal and Papadakis to achieve the limitations of the presently claimed invention.]

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NATHAN J BLOOM whose telephone number is (571)272-9321.  The examiner can normally be reached on 9:30AM - 6:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell can be reached on (571) 270-3717.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NATHAN J BLOOM/ Examiner, Art Unit 2666



/EMILY C TERRELL/ Supervisory Patent Examiner, Art Unit 2666