DETAILED ACTION
This is in response to the application filed on 06/27/2019. 
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 have been examined and are pending.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: Fig. 6, 600, 602, 604, 606, 608, 610, 612, 614, 616, 618, 620 and 622.  
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign(s) mentioned in the description: 510, 512, 514, 516, 518, 520 and 522. 
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
In para [0070], it appears that “a plurality of blocking keys 131, 133, ... , 137...” should read “a plurality of blocking keys 231, 233, ... , 237...”.
In paras [0071-73], references to “predetermined first attribute” and “predetermined second attribute” are inconsistent. For instance, it is unclear if 212 is the predetermined first attribute or second attribute.
In para [0072], it appears that “plurality of blocking keys 231, ... , 214...”  should read “plurality of blocking keys 231, ... , 237...”.
In paras [0074-76], references to newly received record are inconsistent. For instance, it is unclear if the newly received record is 302 or 220.
In para [0076]: it appears that “scoring function 228” should read “scoring function 304”.
In paras [0083-89], it appears “Fig. 5 shows, as an example, a computing system 500...” should read “Fig. 6 shows, as an example, a computing system 600...”. Further, references to individual components of the computing system in the specification appear inconsistent with references in Fig. 6. 
Appropriate correction is required.
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

Such claim limitation(s) is/are: 
“first assignment unit adapted for assigning” in claim 12.
“second assignment unit determining” in claim 12.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Independent claims 1, 12 and 20, and dependent claims 6, 9-11, 17 and 19 recite the limitation “said block(s)”. There is insufficient antecedent basis for this limitation in the claims. Consequently, the limitation “each block” in claims 1-2, 10, 12-13 and 20 has insufficient antecedent basis as well.

Claims 2 and 13 recite the limitations “said largest block” and “said given block”. There is insufficient antecedent basis for these limitations in the claims.

Claims 2 and 13 recite the limitation “i = said number of a position of an initial surrogate identifier value in a given block, such that 0 = i < B.” The scope and meaning of this limitation are ambiguous, as the mathematical expression “0 = i < B” appears invalid.

Claims 3 and 14 recite the limitations “said respective record” and “said respective block”. There is insufficient antecedent basis for these limitations in the claims.

Claims 4 and 15 recite the limitation “...testing whether a next closest available value for the final surrogate identifier provides a uniform distribution for a given block B according to the following formula:”. It is unclear how the application of the formula results in the testing determination, and thus, claims 4 and 15 are rendered ambiguous.

Claims 4 and 15 recite the limitations “...provides a uniform distribution for a given block B...” and “B = number of records in said given block”, while parent claims 2 and 13 also recite “B = number of records in said given block”. 
Given that claim 4 depends on claim 3, which depends on claim 2, and claim 15 depends on claim 14, which depends on claim 13, the limitation “a given block B...” in claims 4 and 15 renders the claims ambiguous.

Claim 4 recites the limitation “pos (id) = round [ (B - 1) * ( ( id - B(O) / B(B-1) - B(O) ) ]” twice and the limitation “V i € [O, B-1]: B(i) == B(pos(B(i)))” whose function is unclear. These limitations render the scope and meaning of the claim ambiguous.

Claim 11 recites the limitation “joining said block having assigned said access block number and said second access block number, by looking up each surrogate identifier value, at the following pre-determined position: pos (id)= round [ (B - 1) * ( ( id- B(O) / B(B-1) - B(O))]”, without defining the variables in the mathematical expression. This renders the scope and meaning of the claim ambiguous.
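
For illustration only, the position expression quoted in the rejections of claims 4 and 11 above can be given one possible reading, sketched below. The sketch assumes that “O” denotes zero, that the unbalanced parentheses are meant to group the numerator (id - B(0)) and the denominator (B(B-1) - B(0)), and that B(i) denotes the i-th surrogate identifier value of a sorted block. None of these assumptions is confirmed by the claims, and the names `pos` and `is_uniform` are hypothetical.

```python
def pos(block, ident):
    """Estimate the index of ident in a sorted block by linear interpolation.

    Assumed reading: pos(id) = round((B - 1) * (id - B(0)) / (B(B-1) - B(0))),
    where B is the number of values in the block.
    """
    size = len(block)
    span = block[-1] - block[0]
    if size < 2 or span == 0:
        return 0  # degenerate block: every value maps to the first position
    return round((size - 1) * (ident - block[0]) / span)


def is_uniform(block):
    """Check the assumed condition: for all i in [0, B-1], B(i) == B(pos(B(i)))."""
    return all(block[pos(block, value)] == value for value in block)


evenly_spaced = [10, 20, 30, 40]  # hypothetical surrogate identifier values
clustered = [10, 11, 12, 40]      # hypothetical, not evenly spaced

print(pos(evenly_spaced, 30))
print(is_uniform(evenly_spaced))
print(is_uniform(clustered))
```

Under this assumed reading, an evenly spaced block passes the check, so a value could be located by one computed index rather than by a search, whereas a clustered block fails it.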

Claim limitations “first assignment unit” and “second assignment unit” in claim 12 invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. 
Paras [0011, 82] describe “first assignment unit” as a unit adapted to assign an initial surrogate identifier value to records and para [0068] describes that the initial surrogate identifier value may be ideally between O and E-1. However, there is no clear description of how the first assignment unit determines the value. 
Paras [0011, 82] describe “second assignment unit” as a unit adapted to assign a final surrogate identifier value to records and para [0078] describes creating final surrogate identifier values, based on determining an even or uniform distribution of the originally initial surrogate identifier values. However, there is no clear description of how the even or uniform distribution of initial values is determined.
 Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph.
Applicant may:
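(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).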
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 101
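35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
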
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Regarding independent claims 1, 12 and 20,
Step 2A, Prong 1: The claim is directed to an abstract idea. 
The limitations of “assigning each of said plurality of records an initial surrogate identifier value, determining a final surrogate identifier value to each of said records assigned to one of said blocks such that said final surrogate identifier values in each block are uniformly distributed”, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “computer-implemented”, nothing in the claim elements precludes the steps from practically being performed in the human mind. For example, but for the “computer-implemented” language, the claim encompasses the user thinking that each record should be assigned an initial identifier. Furthermore, but for the “computer-implemented” language, the claim encompasses the user thinking that records should be assigned final identifier values such that the interval between consecutive values remains even/uniform. Thus, the claim recites a mental process.
The limitation “assigning a plurality of block identifiers to each of said records by applying a locality sensitive hashing function to a predefined attribute of said [...] Flook's method of calculating using a mathematical formula. 758 F.3d at 1350, 111 USPQ2d at 1721. Thus, the claim recites a mathematical concept.

Step 2A, Prong 2: This judicial exception is not integrated into a practical application. The claim recites the additional element of acquiring/providing a reference data set comprising records that further comprise attributes. The acquiring step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. This additional element, taken together with the “computer-implemented” recitation, is no more than mere instructions to apply the exception using a generic computer component (“computer”). Accordingly, even in combination, the additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.

Step 2B: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element in the claim, acquiring/providing a reference data set comprising records, is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity, and being computer-implemented amounts to no more than mere instructions to apply the exception using a generic computer component. The Symantec, TLI, and OIP Techs. court decisions cited in MPEP 2106.05(d)(II) indicate that mere collection or receipt of data over a network is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is here). Accordingly, a conclusion that the acquiring step is well-understood, routine, conventional activity is supported under the Berkheimer memo. Thus, the claim is not patent eligible.

Regarding claims 2 and 13,
Claim 2 is dependent on independent claim 1 and includes all the limitations of claim 1. Similarly, claim 13 is dependent on independent claim 12 and includes all the limitations of claim 12. Therefore, claims 2 and 13 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding sorting block identifiers, which elaborates on the abstract idea of a mental process, and applying a mathematical calculation to determine values, which elaborates on the abstract idea of a mathematical concept, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claims 3 and 14,
Claim 3 is dependent on dependent claim 2 and includes all the limitations of claim 2. Similarly, claim 14 is dependent on dependent claim 13 and includes all the limitations of claim 13. Therefore, claims 3 and 14 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding checking if a value has already been determined and, if so, choosing a different value such that the interval between consecutive values remains even/uniform, which elaborates on the abstract idea of a mental process, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claims 4 and 15, 
Claim 4 is dependent on dependent claim 3 and includes all the limitations of claim 3. Similarly, claim 15 is dependent on dependent claim 14 and includes all the limitations of claim 14. Therefore, claims 4 and 15 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding testing if identifier values provide a uniform distribution based on a mathematical calculation, which elaborates on the abstract idea of a mental process and a mathematical concept, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claims 5 and 16,
Claim 5 is dependent on independent claim 1 and includes all the limitations of claim 1. Similarly, claim 16 is dependent on independent claim 12 and includes all the limitations of claim 12. Therefore, claims 5 and 16 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding assigning the same final value to records which had the same initial values, which elaborates on the abstract idea of a mental process, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claims 6 and 17,
Claim 6 is dependent on independent claim 1 and includes all the limitations of claim 1. Similarly, claim 17 is dependent on independent claim 12 and includes all the limitations of claim 12. Therefore, claims 6 and 17 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding organizing blocks containing records as array(s) using record identifiers as index, which elaborates on the abstract idea of a mental process, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claims 7 and 18,
Claim 7 is dependent on independent claim 1 and includes all the limitations of claim 1. Similarly, claim 18 is dependent on independent claim 12 and includes all the limitations of claim 12. Therefore, claims 7 and 18 recite the same abstract idea of a mental process and mathematical concept. 
The Symantec, TLI, and OIP Techs. court decisions cited in MPEP 2106.05(d)(II) indicate that mere collection or receipt of data over a network is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is here). Accordingly, a conclusion that the selecting a data type step is well-understood, routine, conventional activity is supported under the Berkheimer memo. Thus, the claims are not patent eligible.

Regarding claims 8 and 19,
Claim 8 is dependent on independent claim 1 and includes all the limitations of claim 1. Similarly, claim 19 is dependent on independent claim 12 and includes all the limitations of claim 12. Therefore, claims 8 and 19 recite the same abstract idea of a mental process and mathematical concept. 
The claims recite additional limitations regarding assigning block identifiers and determining a final value for a second attribute, which elaborates on the abstract idea of a mental process and a mathematical concept, and therefore do not amount to significantly more than the abstract idea. Thus, the claims are not patent eligible.

Regarding claim 9,
Claim 9 is dependent on independent claim 1 and includes all the limitations of claim 1. Therefore, claim 9 recites the same abstract idea of a mental process and mathematical concept. 
The claim recites the additional limitation of receiving a new record, which is directed to mere data gathering and is considered to be an extra-solution activity that does not meaningfully limit the claim. Further, this additional limitation amounts to no more than mere instructions to apply the exception using a generic computer component. The specification does not provide any indication that such data collection or manipulation is performed by anything other than a generic, off-the-shelf computer component, and the Symantec, TLI, and OIP Techs. court decisions cited in MPEP 2106.05(d)(II) indicate that mere collection or receipt of data over a network is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is here). Accordingly, a conclusion that the receiving step is well-understood, routine, conventional activity is supported under the Berkheimer memo. 
The claim further recites the additional limitation of determining a block number by applying a mathematical calculation to an attribute of the record and accessing the block corresponding to the block number, which elaborates on the abstract idea of a mathematical concept and a mental process, and therefore does not amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.

Regarding claim 10,
Claim 10 is dependent on independent claim 1 and includes all the limitations of claim 1. Therefore, claim 10 recites the same abstract idea of a mental process and mathematical concept. 
The claim further recites the additional limitation of determining block numbers by applying a mathematical calculation to a second attribute of records and determining an identifier value for the records such that the interval between consecutive values remains even/uniform, which elaborates on the abstract idea of a mathematical concept and a mental process, and therefore does not amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.

Regarding claim 11,
Claim 11 is dependent on dependent claim 10 and includes all the limitations of claim 10. Therefore, claim 11 recites the same abstract idea of a mental process and mathematical concept. 
The claim recites the additional limitation of receiving a new record, which is directed to mere data gathering and is considered to be an extra-solution activity that does not meaningfully limit the claim. Further, this additional limitation amounts to no more than mere instructions to apply the exception using a generic computer component. The specification does not provide any indication that such data collection or manipulation is performed by anything other than a generic, off-the-shelf computer component, and the Symantec, TLI, and OIP Techs. court decisions cited in MPEP 2106.05(d)(II) indicate that mere collection or receipt of data over a network is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is here). Accordingly, a conclusion that the receiving step is well-understood, routine, conventional activity is supported under the Berkheimer memo. 
The claim recites the additional limitations regarding applying a mathematical calculation to a first and second attribute of records to determine block numbers, and joining the blocks corresponding to the block numbers, which elaborates on the abstract idea of a mathematical concept and a mental process, and therefore does not amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.

Examiner’s Note
With regard to claims 4 and 15, in view of the numerous 112(b) and 101 rejections of the above claim limitations and of parent claims 1-3 and 12-14, respectively, which render the scope and meaning of the claims ambiguous, a prior art rejection is not being given at this time. 


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-8, 10, 12, 16-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal (US 2015/0254329 A1) in view of Rhodes (US 2015/0269178 A1).


Agarwal teaches A computer-implemented method for record linkage of an incoming record to a reference data set, said method comprising providing a reference data set comprising a plurality of records, each record comprising a plurality of attributes, *see paras08-09(“System(s) and method(s) for entity resolution... in the ER analysis, a plurality of documents [read: record] obtained from the various data sources [read: reference data set] may be matched... a set of textual documents related to an entity [read: incoming record] may be identified, and the identified set of textual documents may then be combined to create a merged document [read: record linkage] for the entity...”), para17(“external database and/or an in-house database” teaches ‘reference data set’), para42)
assigning each of said plurality of records an initial surrogate identifier value, *see para18(unique identification(ID) read on ‘initial surrogate identifier value’), para46(“...blocking module 120 may allot a unique identification (ID) to each of the plurality of documents, and may maintain an ID file mapping record IDs to the corresponding documents”)
assigning a plurality of block identifiers to each of said records by applying a locality sensitive hashing function to a predefined attribute of said records, resulting in said plurality of said block identifiers, and *see paras18-19 , paras43-45(Locality Sensitive Hashing (LSH) function applied to group documents into buckets with specific bucket IDs [read: block identifier] based on textual similarity [read: predefined attribute], “...documents with high textual similarity are likely to get at least one same hash-value, i.e., same bucket ID... For example, if 
determining a final surrogate identifier value to each of said records assigned to one of said blocks such that ...each block are uniformly distributed. *see para29 (“partial-entity ID message may be provided to each record-vertex in order to inform the record-vertices about their corresponding partial-entity ID”, para58 (“...record vertices belonging to the two partial entities may be connected and may be considered to be belonging to the same entity. Further, the computation module 124 may provide a connected component ID (CCID) to each of the connected record vertices...”), “partial-entity ID, connected component ID” teach ‘final surrogate identifier value’; para49 and para22 (“...in case the blocking of the plurality of documents may result into substantially uniform distribution of the plurality of documents among the buckets, the BCP technique for entity resolution may be provided...” teaches checking for uniform distribution of records/documents (IDs) among buckets [read: blocks] before entity resolution is performed)

Agarwal does not explicitly teach ... such that said final surrogate identifier values in each block are uniformly distributed. 
However, Rhodes teaches ...such that said final surrogate identifier values in each block are uniformly distributed. *see para34(“Value Transformer 220 transforms the data stream into a substantially uniform, random distribution of transformed values [read: final identifier values]”)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Agarwal to incorporate the teachings of Rhodes so that the final identifier values are uniformly distributed, as doing so would enable analyzing the transformed set to yield a desired estimation of cardinality (Rhodes, para34).
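
As an illustration of the kind of blocking discussed in the mapping above, the following minimal sketch assigns several bucket (block) identifiers to each record by applying a MinHash-based locality sensitive hashing scheme to one text attribute. It is not taken from Agarwal, Rhodes, or the application; the function names, the shingle length, the band and row counts, and the sample records are all assumptions chosen for illustration.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-character shingles of a normalized attribute value."""
    text = text.lower().strip()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def stable_hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(text, num_hashes):
    """MinHash signature: the minimum hash of the shingle set under each seed."""
    tokens = shingles(text)
    return [min(stable_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def block_identifiers(text, bands=4, rows=2):
    """Split the signature into bands; each band yields one block identifier."""
    signature = minhash_signature(text, bands * rows)
    ids = []
    for b in range(bands):
        band = tuple(signature[b * rows:(b + 1) * rows])
        # Prefix with the band number so identifiers from different bands never collide.
        ids.append(f"{b}-{stable_hash('band', band) % 10_000}")
    return ids


# Hypothetical reference data set: record ID -> value of the predefined attribute.
records = {1: "John Smith", 2: "Jon Smith", 3: "Alice Wong"}

# Group record IDs by block identifier, so each record lands in several blocks.
blocks = {}
for record_id, name in records.items():
    for block_id in block_identifiers(name):
        blocks.setdefault(block_id, []).append(record_id)
```

In a scheme of this kind, records with similar attribute values tend to share at least one block identifier, although this is probabilistic rather than guaranteed; identical values always receive identical identifiers.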

Regarding claim 5,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal further teaches The method according to claim 1, also comprising:
assigning said determined final surrogate identifier value also to records in subsequent blocks which have said same initial surrogate identifier value as said record having been assigned said final surrogate identifier value *see para58 (“...In case a document or a corresponding record vertex is shared by multiple partial entities, the corresponding record vertex may appear in the vertex-edge structure of each of the multiple partial entities. In such an implementation, record vertices belonging to the two partial entities may be connected and may be considered to be belonging to the same entity. Further, the computation module 124 may provide a connected component ID (CCID) to each of the connected record vertices” teaches record vertices with same initial IDs and belonging to multiple partial entities/edges, being assigned same connected component id [read: final ID])


Regarding claim 6,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal further teaches The method according to claim 1, wherein said blocks with assigned records are organized as one or more arrays of surrogate identifiers using said surrogate identifiers as index in said reference data set.*see paras45-46(“...In one implementation, each bucket may be understood as a key-value pair. The key may be understood as a corresponding bucket-ID, and value is a group of documents, which may get hashed to this ‘key’... the blocking module 120 may allot a unique identification (ID) to each of the plurality of documents, and may maintain an ID file mapping record IDs to the corresponding documents... in order to reduce data traffic, instead of blocking the plurality of documents themselves, the blocking module 120 may block unique IDs of the documents into the at least one bucket. Further, in the course of blocking the document IDs, one or more singleton buckets may also be formed. Singleton buckets can be understood as buckets including one document ID”. Here, non-singleton buckets, which are buckets with more than one document ID, is read on ‘blocks... organized as arrays of surrogate identifiers’, “ID file” read on ‘index’)

Regarding claim 7,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal further teaches The method according to claim 1, wherein said predefined attribute of said records is a combination of at least two attributes *see paras43-45(“...For example, if attributes, such as a name, an address, and a phone number are same in two documents, there might be a possibility that the two documents are related to the same person. Similarly, if the name is same in two documents whereas the address and the phone number differ, the possibility of the two documents being related to the same person is relatively lesser...” and para54(“...a match function may be based on at least one rule defined over attribute values of the two documents being compared. For example, a match function may be defined that the two documents may return “True', if (name matches) AND (address matches) AND (date-of-birth matches)...” indirectly teach that process of blocking/hashing/matching documents can be based on a combination of at least two attributes (for example, “name” and “address”)

Regarding claim 8,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal further teaches The method according to claim 1, wherein said steps of assigning a plurality of block identifiers comprises determining a final surrogate identifier value for a second predefined attribute. *see paras43-45(teaches LSH function for blocking documents into buckets with specific bucket IDs [read: block identifier] based on textual similarity, “...documents with high textual similarity are likely to get at least one same hash-value, i.e., same bucket ID... For 

Regarding claim 10,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal and Rhodes further teach The method according to claim 1, also comprising:
assigning a plurality of second block identifiers to each of said records by applying a locality sensitive hashing function to a second predefined attribute of said records, resulting in said plurality of said second block identifiers, *see paras43-45(“...For example, if attributes, such as a name, an address, and a phone number are same in two documents, there might be a possibility that the two documents are related to the same person. Similarly, if the name is same in two documents whereas the address and the phone number differ, the possibility of the two documents being related to the same person is relatively lesser...” and para54(“...a match function may be based on at least one rule defined over attribute values of the two documents being compared. For example, a match function may be defined that the two documents may return “True', if (name matches) AND (address matches) AND (date-of-birth matches)...”) indirectly teach 
determining a second final surrogate identifier value to each of said records assigned to one of said blocks such that said second final surrogate identifier values in each block are uniformly distributed *see para29 (“partial-entity ID message may be provided to each record-vertex in order to inform the record-vertices about their corresponding partial-entity ID”, para58 (“...record vertices belonging to the two partial entities may be connected and may be considered to be belonging to the same entity. Further, the computation module 124 may provide a connected component ID (CCID) to each of the connected record vertices...”), “partial-entity ID, connected component ID” teach ‘final surrogate identifier value’; para49 and para22 (“...in case the blocking of the plurality of documents may result into substantially uniform distribution of the plurality of documents among the buckets, the BCP technique for entity resolution may be provided...” teaches checking for uniform distribution of records/documents (IDs) among buckets [read: blocks] before entity resolution is performed; It is obvious that process of determining final identifiers can be repeated for attribute “address” [read: second predefined attribute]); see Rhodes, para34(“Value Transformer 220 transforms the data stream into a substantially uniform, random distribution of transformed values [read: second final identifier values]”)

Claim 12 recites substantially the same claim limitations as claim 1, and is rejected for the same reasons.

Claim 16 recites substantially the same claim limitations as claim 5, and is rejected for the same reasons.

Claim 17 recites substantially the same claim limitations as claim 6, and is rejected for the same reasons.

Claim 18 recites substantially the same claim limitations as claims 7 and 8, and is rejected for the same reasons.

Claim 20 recites substantially the same claim limitations as claim 1, and is rejected for the same reasons.

Claims 2, 3, 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal in view of Rhodes and Malewicz (US 2015/0213375 A1).

Regarding claim 2,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Rhodes further teaches The method according to claim 1, also comprising:
..., and wherein said determining said final surrogate identifier value FS-ID comprises performing for each block, starting with said largest block,
FS-ID = offset + i * gap, wherein 
FS-ID = final surrogate identifier value,
gap = E / B and offset = gap / 2, wherein
E = said total number of records in said reference data set,
B = number of records in said given block, and
i = said number of a position of an initial surrogate identifier value in a given block, such that 0 ≤ i < B *see para056 (“...if only initial values are stored in a lookup table, then an enhanced value [read: final surrogate identifier value] corresponding to an initial value may be determined by multiplying a table index number for the initial value by a pre-defined static interval value and adding a pre-defined static offset value to the product. For example, if a value is stored at index i [read: i] of a lookup table, then the mapped value may be derived by k*i+b equation, where k is a static interval value and b is a static offset.” Here, “static interval value k” indirectly teaches ‘gap’ when the number of records in the blocks is static. While Rhodes does not precisely teach using the particular claimed equation, the particular elements are obvious and it would be obvious to try a combination of elements in order to achieve predictable results. See MPEP 2143).
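
As a worked illustration of the claimed expression set out above, the following sketch evaluates FS-ID = offset + i * gap next to the k*i+b mapping quoted from Rhodes. The function names and the sample values E = 12 and B = 3 are assumptions chosen for the example and appear in neither the claims nor the references.

```python
def final_surrogate_ids(total_records, block_size):
    # gap = E / B and offset = gap / 2, as recited in the claim.
    gap = total_records / block_size
    offset = gap / 2
    # One FS-ID per position i, with 0 <= i < B.
    return [offset + i * gap for i in range(block_size)]


def rhodes_mapped_value(index, interval, static_offset):
    # Mapping quoted from Rhodes: k*i + b, with static k and b.
    return interval * index + static_offset
```

For E = 12 and B = 3, gap = 4 and offset = 2, so the final surrogate identifier values are 2, 6 and 10. The Rhodes mapping with k = 4 and b = 2 yields the same values for i = 0, 1 and 2, which is the correspondence relied upon when the block size is static.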

Agarwal and Rhodes do not explicitly teach sorting said block identifiers by its cardinality.
However, Malewicz teaches sorting said block identifiers by its cardinality *see para67(“At block 1405 the system may sort the clusters by cardinality into a 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Agarwal and Rhodes to incorporate the teachings of Malewicz and enable Agarwal to sort blocks by their cardinality, as doing so would enable the system to consider the next highest cardinality cluster for further analysis (Malewicz, paras68-69).

Regarding claim 3,
Agarwal as modified by Rhodes and Malewicz teaches all the claimed limitations as set forth in the rejection of claim 2 above.
Rhodes further teaches The method according to claim 2, wherein, in case a final surrogate identifier value is determined which value has already been determined during said determining said final surrogate identifier values of a previous block, a next closest final surrogate identifier value is chosen for said respective record, provided that final surrogate identifier values in said respective block continue to be uniformly distributed *see para34(“Value Transformer 220 transforms the data stream into a substantially uniform, random distribution of transformed values [read: final identifier values]” indirectly teaches that transformed/final values are determined such that the values do not repeat and are uniformly distributed)

Claim 13 recites substantially the same claim limitations as claim 2, and is rejected for the same reasons.

Claim 14 recites substantially the same claim limitations as claim 3, and is rejected for the same reasons.


Claims 9, 11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal in view of Rhodes and Malhotra (“Incremental Entity Resolution from Linked Documents”, 2018). 

Regarding claim 9,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 1 above.
Agarwal further teaches The method according to claim 1, also comprising:
...applying said locality sensitive hashing function to said predefined attribute of said newly received record resulting in an access block number, and accessing said block having said resulting access block number *see paras43-45(teaches applying locality sensitive hashing function to plurality of documents, resulting in document(s) being assigned into buckets with specific bucket IDs [read: access block number] based on textual similarity [read: predefined attribute], Blocking/assigning documents to a block with a specific block ID/access block number also indirectly teaches that the block is being accessed)

Agarwal does not explicitly teach receiving a new record comprising said predefined attribute.
However, Malhotra teaches receiving a new record comprising said predefined attribute... and accessing said block having said resulting access block number *see page5:sec3.3(teaches documents comprising attribute), 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Agarwal and Rhodes to incorporate the teachings of Malhotra and enable Agarwal to receive new records as doing so would enable the system to incorporate new documents that may correspond to entities that do not exist in the database of resolved entities at the time these documents arrive (Malhotra, pages1-2).
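
For illustration only, the combined step discussed for claim 9 (hashing the predefined attribute of a newly received record and accessing the block with the resulting number) can be sketched as follows. This is a minimal MinHash-style sketch under stated assumptions, not the method of Agarwal or Malhotra: the function names, the use of character bigrams, the MD5-based hash family and the choice of four hash functions are all hypothetical.

```python
import hashlib
from collections import defaultdict


def shingles(text, n=2):
    # Character n-grams of the attribute value (assumed representation).
    text = text.lower()
    grams = {text[i:i + n] for i in range(len(text) - n + 1)}
    return grams or {text}


def block_numbers(value, num_hashes=4):
    # One block number per seeded hash: the minimum hash over all
    # shingles, so similar values tend to share block numbers.
    numbers = []
    for seed in range(num_hashes):
        smallest = min(
            int(hashlib.md5(f"{seed}:{gram}".encode()).hexdigest(), 16)
            for gram in shingles(value)
        )
        numbers.append((seed, smallest))
    return numbers


def build_blocks(records):
    # records: record ID -> attribute value; blocks hold record IDs.
    blocks = defaultdict(list)
    for record_id, value in records.items():
        for number in block_numbers(value):
            blocks[number].append(record_id)
    return blocks


def candidates_for_new_record(blocks, value):
    # Hash the new record's attribute and access the matching blocks.
    found = set()
    for number in block_numbers(value):
        found.update(blocks.get(number, []))
    return found
```

A new record whose attribute value equals that of an existing record always hashes to the same block numbers, so the lookup returns the existing record's ID; values that are merely similar share block numbers only with some probability.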

Regarding claim 11,
Agarwal as modified by Rhodes teaches all the claimed limitations as set forth in the rejection of claim 10 above.
Agarwal further teaches The method according to claim 10, also comprising:
...applying a second locality sensitive hashing function to said predefined attribute resulting in an access block number, applying the second locality sensitive hashing function to said second predefined attribute resulting in a second access block number, and  *see paras43-45(teaches applying locality sensitive hashing function to documents, resulting in document(s) being assigned into buckets with specific bucket IDs [read: access block number] based on textual similarity or other specific attributes [read: predefined attribute],  “For example, if attributes, such as a name, an address, and a phone number are same in two documents, there might be a possibility that the two documents are related to the same person”); It is obvious that process of blocking/hashing documents and assigning block identifiers based on attribute “name” can be repeated for another attribute “address” [read: second predefined attribute] resulting in different block numbers)
With respect to the limitation “joining said block having assigned said access block number and said second access block number, by looking up each surrogate identifier value, at the following pre-determined position:
pos (id)= round [ (B - 1) * ( ( id- B(0) / B(B-1) - B(0))]. ”, in view of the 112(b) and 101 rejections of the above claim limitations and the parent claims 1 and 10, which render the scope and meaning of the claim ambiguous, a prior art rejection is not being given at this time.
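
Purely to illustrate one possible reading of the quoted expression, and without resolving the ambiguity noted above, the sketch below assumes the parentheses were intended to balance as pos(id) = round[(B - 1) * ((id - B(0)) / (B(B-1) - B(0)))], with B(0) and B(B-1) taken as the first and last surrogate identifier values of a non-empty, sorted block. The function name lookup_position and that reading of the expression are assumptions.

```python
def lookup_position(block, surrogate_id):
    # block: non-empty, sorted list of surrogate identifier values.
    size = len(block)
    first, last = block[0], block[-1]
    if size == 1 or last == first:
        # Degenerate block: only one distinct value to point at.
        return 0
    # Assumed reading: linear interpolation between B(0) and B(B-1).
    fraction = (surrogate_id - first) / (last - first)
    return round((size - 1) * fraction)
```

Under this reading, a block holding the uniformly spaced values 2, 6 and 10 maps them to positions 0, 1 and 2 respectively, so each value can be located without scanning the block.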

Agarwal does not explicitly teach ...receiving a new record comprising said predefined attribute and said second predefined attribute,
Malhotra teaches receiving a new record comprising said predefined attribute and said second predefined attribute, *see page5:sec3.3(teaches documents comprising a plurality of attributes), page10:sec5(“...we describe how a previously resolved entity-document collection can be resolved with a new set of documents [read: record]... The bucket-ids [read: access block number] created by the LSH process on the new set of documents may contain bucket-ids that were also created earlier LSH was earlier applied on the old documents...” teaches steps executed when a new record is received)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Agarwal and Rhodes to incorporate the teachings of Malhotra and enable Agarwal to receive new records comprising a plurality of attributes as doing so would enable the system to incorporate new documents that may correspond to entities that do not exist in the database of resolved entities at the time these documents arrive (Malhotra, pages1-2).

Claim 19 recites substantially the same claim limitations as claim 9, and is rejected for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Christen ("A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication", 2011) presents a survey of twelve variations of six indexing techniques, developed for record linkage and deduplication.
Baxter et al. ("A Comparison of Fast Blocking Methods for Record Linkage", 2003) compares new blocking methods, bigram indexing and canopy clustering with TFIDF (Term Frequency/Inverse Document Frequency), with two older methods, developed for the purposes of record linkage.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANUGEETHA KUNJITHAPATHAM whose telephone number is (408)918-7510.  The examiner can normally be reached on M-F 9-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached on (571) 270-1760.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public 


/A.K./Examiner, Art Unit 2165
/MATTHEW ELL/Primary Examiner, Art Unit 2145