DETAILED ACTION


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Remarks
Claims 1-20 are pending.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
As per claim 19, the claim does not fall within at least one of the four categories of patent eligible subject matter because the claim recites “[a] computer program product comprising a computer code”. The specification mentions this embodiment (see specification at page 5, lines 12-14) but does not exclude the claimed “computer program product” from being a signal per se. Thus, the broadest reasonable interpretation of the claim encompasses a signal per se, which is non-statutory subject matter.
As per claim 20, the claim does not fall within at least one of the four categories of patent eligible subject matter because the claim recites a system or apparatus but does not describe hardware which executes each of the claimed steps, which is required for a system claim to be statutory. See Digitech Image Techs. v. Electronics for Imaging, 758 F.3d 1344, 1348, 111 USPQ2d 1717, 1719 (Fed. Cir. 2014) ("For all categories except process claims, the eligible subject matter must exist in some physical or tangible form"); MPEP 2106.03(I). It is noted that the claimed “processor” could be interpreted as being a software element since the specification does not particularly define the scope of the claimed “processor” or limit it to a hardware embodiment. Accordingly, this claim is rejected as non-statutory for failing to recite such hardware.

Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites receiving an identifier of the source dataset; determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset and generating an output based on the intersection weights for use in selecting, one of the plurality of possible further datasets to be joined with the source dataset.
The limitation of determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses a person manually determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset. Similarly, “generating an output based on the intersection weights for use in selecting” in the context of this claim encompasses the user selecting possible further datasets to be joined with the source dataset. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application because there are no additional elements that integrate the abstract idea into a practical application or impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The "receiving an identifier of the source dataset" is considered the insignificant extra-solution activity of mere data gathering and such activity does not amount to an inventive concept (see MPEP 2106.05(g)). The claim is not patent eligible.
Claims 2-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. These depend from independent claim 1 and do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites an input for receiving an identifier of a source dataset; a processor configured to execute a computer program which determines an intersection weight between the source dataset and each of a plurality of possible further datasets based on a number of common keys between the source dataset and each respective possible further dataset; and an output for providing an output signal based on the intersection weights for use in selecting one of the plurality of possible further datasets to be joined with the source dataset.
The limitation of determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by a processor,” nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the “by a processor” language, “determining” in the context of this claim encompasses a person manually determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset. Similarly, “an output for providing an output signal based on the intersection weights for use in selecting” in the context of this claim encompasses the user selecting possible further datasets to be joined with the source dataset. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element: using a processor to perform both the determining and output steps. The processor in both steps is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of ranking information based on a determined amount of use) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a processor to perform both the determining and output steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The "receiving an identifier of a source dataset" is considered the insignificant extra-solution activity of mere data gathering and such activity does not amount to an inventive concept (see MPEP 2106.05(g)). The claim is not patent eligible.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-7 and 10-20 are rejected under 35 U.S.C. 103 as being unpatentable over Frohock et al. (‘Frohock’ hereinafter) (Publication Number 20150242407) in view of Oberhofer et al. (‘Oberhofer’ hereinafter) (Publication Number 20170212953).

As per claim 1, Frohock teaches
A method of determining a further dataset to be joined with a source dataset comprising a plurality of data entries each identified by a respective key, the method comprising: (see abstract and background)
receiving an identifier of the source dataset; (receive/retrieved datasets, paragraphs [0036],[0038],[0042])
[…]
and generating an output based on the intersection weights for use in selecting, one of the plurality of possible further datasets to be joined with the source dataset. (confidence metrics presented to user interface for user to review various derived relationships and confidence metrics and select a relationship to base the data structure synthesis upon, paragraph [0027]; note that Oberhofer teaches intersection weights as shown below)
Frohock does not explicitly indicate “determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset”.
However, Oberhofer discloses “determining an intersection weight between the source dataset and each of a plurality of possible further datasets based on the number of common keys between the source dataset and each respective possible further dataset” (calculates similarity between datasets based on common attributes and weighting partial similarity scores, paragraphs [0027],[0055],[0063]; note that in the alternative Frohock also teaches calculating an RCM that shows relationships between attributes and values in datasets with weighted results, see paragraphs [0025],[0027],[0043],[0049],[0051]-[0052]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock and Oberhofer because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing a better understanding of datasets to determine if there is a potential of data consolidation reducing IT costs or to understand if the data assets are properly managed (Oberhofer, paragraph [0002]). This gives the user the advantage of more efficient use of expensive resources.
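
By way of illustration only, the quoted limitation of an intersection weight “based on the number of common keys” may be read as a count of the keys shared by the source dataset and each possible further dataset, with the resulting weights ordered for use in selection. The following is a minimal sketch under that assumed reading; the function names, dataset names, and sample keys are hypothetical and are not taken from Frohock, Oberhofer, or applicant's specification.

```python
def intersection_weight(source_keys, candidate_keys):
    """Assumed reading: the weight is the number of keys common to both datasets."""
    return len(set(source_keys) & set(candidate_keys))


def rank_candidates(source_keys, candidates):
    """Return (dataset name, weight) pairs, highest weight first.

    `candidates` maps a hypothetical dataset name to its collection of keys.
    Ties are broken alphabetically by name so the ordering is deterministic.
    """
    weights = {
        name: intersection_weight(source_keys, keys)
        for name, keys in candidates.items()
    }
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    source = {"k1", "k2", "k3"}
    further = {"x": {"k1", "k2"}, "y": {"k3", "k9"}, "z": {"k7"}}
    for name, weight in rank_candidates(source, further):
        print(name, weight)
```

Under this reading, the ordered list is the “output” from which a user, or an automated step taking the first entry, would select the dataset to be joined.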

As per claim 2, Frohock teaches
the output which is generated causes the possible further datasets to be presented to the user via a graphical user interface. (user interface and list of related datasets, paragraph [0038]; see also paragraph [0055])

As per claim 3, Frohock teaches
the possible further datasets are presented to the user on the GUI ranked according to their intersection weight with the source dataset. (user interface with ranked results, paragraph [0055])

As per claim 4, Frohock teaches
determining an intersection weight comprises accessing a data structure holding, for each pair of possible further datasets and the source dataset, a pre-calculated intersection weight. (paragraphs [0052]-[0053])

As per claim 5, Frohock teaches
determining an intersection weight comprises accessing the, for each of the plurality of datasets, data indicative of a set of keys comprised in the respective dataset, and calculating the intersection weight from the sets of keys. (weighted aggregate result, paragraphs [0051]-[0052])

As per claim 6, Frohock teaches
receiving user input selecting one of the datasets, based on the intersection weights. (user interface and selection, paragraphs [0027],[0055])

As per claim 7, Frohock teaches
selecting the dataset comprises automatically selecting the possible further dataset which has the highest respective intersection weight with the source dataset. (highest RCM, paragraphs [0025],[0038]-[0039])

As per claim 10,
Frohock does not explicitly indicate “receiving a filtering category and selecting from the plurality of possible further datasets a subset having data entries of a category which matches the filtering category”.
However, Oberhofer discloses “receiving a filtering category and selecting from the plurality of possible further datasets a subset having data entries of a category which matches the filtering category” (dataset and attribute classified for integration, paragraphs [0022]-[0023],[0055]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock and Oberhofer because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing a better understanding of datasets to determine if there is a potential of data consolidation reducing IT costs or to understand if the data assets are properly managed (Oberhofer, paragraph [0002]). This gives the user the advantage of more efficient use of expensive resources.

As per claim 11,
Frohock does not explicitly indicate “said determining is performed only for the subset of datasets”.
However, Oberhofer discloses “said determining is performed only for the subset of datasets” (dataset and attribute classified for integration, paragraphs [0022]-[0023],[0055]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock and Oberhofer because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing a better understanding of datasets to determine if there is a potential of data consolidation reducing IT costs or to understand if the data assets are properly managed (Oberhofer, paragraph [0002]). This gives the user the advantage of more efficient use of expensive resources.

As per claim 12,
Frohock does not explicitly indicate “the filtering category is received from a user”.
However, Oberhofer discloses “the filtering category is received from a user” (dataset and attribute classified for integration, paragraphs [0022]-[0023],[0055]; selected by user, paragraph [0052]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock and Oberhofer because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing a better understanding of datasets to determine if there is a potential of data consolidation reducing IT costs or to understand if the data assets are properly managed (Oberhofer, paragraph [0002]). This gives the user the advantage of more efficient use of expensive resources.

As per claim 13, Frohock teaches
selecting at least one of the possible further datasets. (user selects attributes of two or more datasets, paragraph [0038])

As per claim 14, Frohock teaches
applying a query to the source dataset and the at least one selected the possible further datasets. (join datasets, paragraph [0039], where join is typically used in a query to span different tables)

As per claim 15, Frohock teaches
the identifier of the source dataset is received from a human user, and wherein the generated output enables a step of selecting to be carried out by the human user. (user interface and selection, paragraphs [0027],[0055])

As per claim 16, Frohock teaches
the identifier of the source dataset is received from a human user, and wherein the generated output enables a step of selecting to be carried out automatically without input from a user. (user interface and selection, paragraphs [0027],[0055])

As per claim 17, Frohock teaches
the identifier of the source dataset is received from an autonomous agent supplying the query, and the step of selecting is carried automatically. (automatically synthesizes dataset, paragraph [0025],[0040])

As per claim 18, Frohock teaches
the dataset stores entries identified by a key of a first type; and wherein the intersection weight between the source dataset and at least one of the plurality of datasets storing entries identified by a key of the second type and not of the first type is determined using an intermediate dataset to convert the key of the first type to a key of the second type. (attribute type, paragraph [0022]; key or attribute transform using conversion, paragraph [0028],[0050])

As per claim 19,
This claim is rejected on grounds corresponding to the reasons given above for rejected claim 1 and is similarly rejected.

As per claim 20,
This claim is rejected on grounds corresponding to the reasons given above for rejected claim 1 and is similarly rejected.


Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Frohock et al. (‘Frohock’ hereinafter) (Publication Number 20150242407) in view of Oberhofer et al. (‘Oberhofer’ hereinafter) (Publication Number 20170212953) and further in view of Vatsalan et al. (‘Vatsalan’ hereinafter) (Vatsalan et al., "Privacy-preserving matching of similar patients," Journal of biomedical informatics 59 (2016): 285-298).

As per claim 8,
Neither Frohock nor Oberhofer explicitly indicates “said data indicative of the set of keys comprised in the respective dataset is a respective bloom filter generated from keys comprised in the respective dataset”.
However, Vatsalan discloses “said data indicative of the set of keys comprised in the respective dataset is a respective bloom filter generated from keys comprised in the respective dataset” (sections 2.2 and 2.2.3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock, Oberhofer and Vatsalan because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing efficient masking while achieving matching accuracy similar to that of matching actual unencoded patient records (Vatsalan, abstract). This gives the user the advantage of record matching that provides for privacy.

As per claim 9, Frohock teaches
the step of determining the intersection weight for a given possible further dataset (datasets with weighted results, paragraphs [0025],[0027],[0043],[0049],[0051]-[0052]).
Neither Frohock nor Oberhofer explicitly indicates “is performed by generating a source bloom filter or hyperloglog structure from keys comprised in the source dataset and comparing the source bloom filter with the respective bloom filter or hyperloglog structure of that dataset”.
However, Vatsalan discloses “is performed by generating a source bloom filter or hyperloglog structure from keys comprised in the source dataset and comparing the source bloom filter with the respective bloom filter or hyperloglog structure of that dataset” (sections 2.2 and 2.2.3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Frohock, Oberhofer and Vatsalan because using the steps claimed would have given those skilled in the art the tools to improve the invention by providing efficient masking while achieving matching accuracy similar to that of matching actual unencoded patient records (Vatsalan, abstract). This gives the user the advantage of record matching that provides for privacy.
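
By way of illustration only, generating a Bloom filter from the keys of each dataset and comparing the source filter with another dataset's filter could be sketched as follows. The filter size, the number of hash functions, the seeded SHA-256 hashing, and the Dice coefficient used for the comparison are all assumptions chosen for the sketch; they are not represented to be the specific construction of Vatsalan or of applicant's specification.

```python
import hashlib


def bloom_filter(keys, num_bits=256, num_hashes=3):
    """Encode a collection of keys as a Bloom filter held in a Python int.

    Each key sets up to `num_hashes` bit positions, derived from seeded
    SHA-256 digests (an assumed hashing scheme for this sketch).
    """
    bits = 0
    for key in keys:
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % num_bits
            bits |= 1 << position
    return bits


def dice_similarity(filter_a, filter_b):
    """Compare two filters: 2 * (common set bits) / (total set bits)."""
    ones_a = bin(filter_a).count("1")
    ones_b = bin(filter_b).count("1")
    if ones_a + ones_b == 0:
        return 0.0  # two empty filters: define the similarity as zero
    common = bin(filter_a & filter_b).count("1")
    return 2.0 * common / (ones_a + ones_b)


if __name__ == "__main__":
    source_filter = bloom_filter(["k1", "k2", "k3"])
    other_filter = bloom_filter(["k2", "k3", "k4"])
    print(dice_similarity(source_filter, other_filter))
```

Because distinct keys can hash to the same bit positions, a score obtained this way only approximates the true key overlap; the filters are compared without exchanging the underlying keys.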


Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon is considered pertinent to applicant's disclosure.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAY A MORRISON whose telephone number is (571) 272-7112. The examiner can normally be reached on Monday - Friday, 8:00 am - 4:00 pm ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo, can be reached on (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Jay A Morrison/
Primary Examiner, Art Unit 2198