DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 16 May 2022 has been entered.

Response to Amendment
Claims 1-15 were previously pending in this application.  The amendment filed 16 May 2022 has been entered and the following has occurred: Claims 1 & 14 have been amended.  Claims 13 & 15 have been cancelled.  No claims have been added.
Claims 1-12 & 14 remain pending in the application. 

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-12 & 14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e. an abstract idea) without significantly more.  
The claims recite subject matter within a statutory category as a process (claims 1-12) & machine (claim 14) which recite steps of:
retrieving the data set of patient data from multiple patients, 
the patient data comprising events related to a disease or a treatment of a disease, and time stamps related to the events
wherein some of the multiple patients in the data set of patient data each comprise two or more indirect identifiers rendering the respective patient identifiable via a concatenation of the two or more indirect identifiers
where there are less than a predefined value (k) of patients having a same concatenation of two or more indirect identifiers 
wherein an identifiable patient is an outlying patient;
determining at least one first indirect identifier representing a property of the data distribution of the time stamps;
determining at least one second indirect identifier representing a number of events regarding a respective patient;
determining, for all patients in the data set, the respective concatenations comprising the first indirect identifier and the second indirect identifier;
identifying, based on the determined concatenations, one or more outlying patients in the data set; and
removing the patient data of each identified outlying patient from the data set to generate an anonymized data set of patient data;
providing the anonymized data set of patient data.
These steps, as drafted and under the broadest reasonable interpretation, cover performance in the mind but for the recitation of generic computer components: retrieving data that comprises various events and time stamps/concatenations of indirect identifiers; determining, modifying, and removing various types of PII based on certain parameters/factors of received patient data and/or identified patient outliers based on indirect identifiers; and outputting the result of the modified document(s).  That is, other than reciting that the steps are performed by generic computer components, nothing in the claim elements precludes the steps from practically being performed in the mind.  For example, retrieving data that comprises various events, time stamps, and/or concatenations of indirect identifiers simply amounts to a person scanning a document associated with a patient which contains identifying information of the patient, such as clinical events, treatment dates, etc., but for the recitation of generic computer components.  Likewise, but for the recitation of generic computer components, determining a first indirect identifier representing a property of the data distribution of the time stamps and a second indirect identifier representing a number of events regarding a respective patient encompasses, in the context of this claim, a mental process of the user determining information that can potentially identify a patient based on certain parameters/factors of received patient data.  Similarly, the limitation of determining concatenations that contain the first and second identifiers, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind, such as a person identifying instances where either or both identifiers could arise in patient data and inadvertently allow for identification of the patient, but for the recitation of generic computer components.  
For example, but for the recitation of generic computer components, removing the patient data of one or more outlying patients in response to determining that a first or second indirect identifier is present encompasses, in the context of this claim, a mental process of the user manually deleting or removing a portion of an outlier patient data file, whether physical or electronic, based on the user determining that a first or second indirect identifier is present.  Furthermore, outputting the modified patient document could include presenting a hand-modified document from which PII has been removed or modified, or simply using a computer as a tool or aid for outputting or presenting the document electronically.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
Dependent claims recite additional subject matter which further narrows or defines the abstract idea embodied in the claims (such as claims 2-12, reciting particular aspects of determining identifiers based on certain mathematical formulas and/or aspects/parameters, determining time points/windows where certain events/identifiers occur, and deleting or replacing data that contain said determined identifiers, each of which may be performed in the mind but for recitation of generic computer components).  
This judicial exception is not integrated into a practical application.  In particular, the additional elements do not integrate the abstract idea into a practical application, other than the abstract idea per se, because the additional elements amount to no more than limitations which:
amount to mere instructions to apply an exception (such as recitation of a computer program product, a computer, a processor, a data interface, a database, amounts to invoking computers as a tool to perform the abstract idea, see applicant’s specification (pp. 12, ll. 24 - pp. 13, ll. 12), (p. 5, ll. 5-10), (pp. 14, ll. 3-14), (pp. 14, ll. 16-20), (pp. 14, ll. 17-23), respectively, see MPEP 2106.05(f))
add insignificant extra-solution activity to the abstract idea (such as recitation of receiving patient data which contains clinical events such as disease treatments, timestamps associated with said events, and PII within patient data, which amounts to mere data gathering; recitation of determining indirect identifiers or concatenations of indirect identifiers from patient data based on certain factors or parameters for purposes of deidentifying patient data, and identifying one or more outlying patients in a dataset based on said identifiers/concatenations, which amounts to selecting a particular data source or type of data to be manipulated; recitation of concatenating and/or removing portions of PII found in the clinical documentation and/or outputting the result of the anonymized document, which amounts to insignificant application, see MPEP 2106.05(g))
generally link the abstract idea to a particular technological environment or field of use (such as recitation of patient data/PII within patient data, anonymization or generation of an anonymized data set of patient data, events related to a disease or a treatment of a disease, see MPEP 2106.05(h))
Dependent claims recite additional subject matter which amount to limitations consistent with the additional elements in the independent claims (such as claims 2-12, which all recite further limiting aspects of the computer-implemented method, program, or system, additional limitations which amount to invoking computers as a tool to perform the abstract idea; claims 2-12, which recite limitations that relate to gathered patient data/PII information from within said patient data, additional limitations which add insignificant extra-solution activity to the abstract idea which amounts to mere data gathering; claims 2-12, which recite limitations that relate to manipulating certain aspects of gathered patient data or PII such as indirect identifiers, various mathematical values specific to the identifier, etc., additional limitations which add insignificant extra-solution activity to the abstract idea by selecting a particular data source or type of data to be manipulated; claims 2, 7, 8-9, 11, which recite various aspects of the data specifically relating to patients, disease treatment/prognosis, and/or certain fields to be implemented such as genomics, genetics, etc., additional limitations which generally link the abstract idea to a particular technological environment or field of use).  Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually.  There is no indication that the combination of elements improves the functioning of a computer or improves any other technology.  Their collective functions merely provide conventional computer implementation and do not impose a meaningful limit to integrate the abstract idea into a practical application.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to discussion of integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply an exception, add insignificant extra-solution activity to the abstract idea, and generally link the abstract idea to a particular technological environment or field of use.  Additionally, the additional limitations, other than the abstract idea per se, amount to no more than limitations which:
amount to elements that have been recognized as well-understood, routine, and conventional activity in particular fields (such as receiving data sets that contain clinical events such as disease treatments with associated timestamps for each of the events and/or PII from multiple patients presumably over a network or communications interface, e.g., receiving or transmitting data over a network, Symantec, MPEP 2106.05(d)(II)(i); concatenating PII received from the clinical documentation and determining an outlying patient based on comparison to a predefined value (k) patients having a same concatenation of indirect identifiers, e.g., performing repetitive calculations, Flook, MPEP 2106.05(d)(II)(ii); determining indirect identifiers and concatenations of indirect identifiers in patient data and subsequently removing the outlier patient data that contains said identifiers for purposes of anonymizing patient data, and outputting, replacing or updating patient clinical documentation with the anonymized/de-identified results e.g., electronic recordkeeping, Alice Corp., MPEP 2106.05(d)(II)(iii); storing computerized instructions for performance of the computerized method, storing patient data comprising events and time stamps related to the events, determining and storing indirect identifiers and/or concatenations of said indirect identifiers relating to the patient data, storing patient data sets/PII for multiple patients, storing instructions to output or display the results of the anonymization efforts, e.g., storing and retrieving information in memory, Versata Dev. Group, MPEP 2106.05(d)(II)(iv)).
Dependent claims recite additional subject matter which, as discussed above with respect to integration of the abstract idea into a practical application, amount to invoking computers as a tool to perform the abstract idea.  Dependent claims recite additional subject matter which amount to limitations consistent with the additional elements in the independent claims (such as claims 2-12, additional limitations which amount to elements that have been recognized as well-understood, routine, and conventional activity in particular fields, claims 2-12, which all require the receiving and/or transmitting of patient data or PII for performance of limitations presented in those claims, e.g., receiving or transmitting data over a network, Symantec, MPEP 2106.05(d)(II)(i); claims 2, 5-11, which contain limitations relating to comparing aspects of the patient data/PII to a threshold such as an event threshold, calculating lengths of time window(s)/periods, calculating breaks in time window(s)/periods, normalizing respective categories between a minimum and maximum value or via logarithmic functions, e.g., performing repetitive calculations, Flook, MPEP 2106.05(d)(II)(ii); claim 12, which contains limitations relating to updating or replacing timestamps in patient data records or performance of updating patient records via efforts for anonymizing patients’ data records, e.g., electronic recordkeeping, Alice Corp., MPEP 2106.05(d)(II)(iii); claims 2-12, which all recite various computerized functions or methods that have corresponding stored computerized instructions, storing thresholds, storing patient data sets, storing and retrieving identifiers, storing mathematical formulae/functions, storing metadata associated with patient data sets such as the number of events or set of events that took place, e.g., storing and retrieving information in memory, Versata Dev. Group, MPEP 2106.05(d)(II)(iv)).  
Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually.  There is no indication that the combination of elements improves the functioning of a computer or improves any other technology.  Their collective functions merely provide conventional computer implementation.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 10-12 & 14 are rejected under 35 U.S.C. 103 as being unpatentable over El Emam et al. (U.S. Patent Publication No. 20150339496) in view of Stevens et al. (U.S. Patent Publication No. 20080147554).

Claim 1 –
Regarding Claim 1, El Emam discloses a computer-implemented method (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam) utilizing an anonymization system for anonymization of a data set of patient data from multiple patients for providing a predefined anonymity property (It should be noted the “for anonymization of a data set…” limitation is interpreted as a purpose or intended use of the claimed invention, See MPEP 2111.02(II);  because the preamble merely states the purpose or intended use of the invention, rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction; see El Emam Par [0060]-[0062] which discloses the computerized method), wherein:
the property defines that a concatenation of all indirect identifiers of a patient enables identifying an outlying patient in the data set if there are less than a predefined value (k) patients having a same concatenation of indirect identifiers (this limitation is considered to be a whereby clause (See MPEP 2111.04(I)) in a method claim that expresses the intended result of a process step positively recited and does not necessarily limit the structure or specific performance of the method; therefore, any method or system that discloses concatenation/grouping of all indirect identifiers of a patient and comparing to a predefined value (k) should enable or allow the identifying of an outlying patient because there is no accompanying language that limits the structure/method otherwise;  in light of Applicant’s specification “concatenations comprise the above determined first indirect identifier and the second indirect identifier, and any further indirect identifiers. Optionally, various first, second and further indirect identifiers may be included in said concatenation, where such combination of indirect identifiers is considered to constitute a risk of identifying the individual”; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables;  for example, if the predefined value k=5, then a k-anonymized data set has at least 5 records (k>4) for each value combination of age and gender, meaning that at least 5 other records are similar to the k-anonymized data set, which effectively anonymizes that data set, as well as the other k data sets; Further, El Emam Par [0066] discloses determining a number of instances within one or more indirect identifiers defined within the dataset, the number of instances specifically utilized for a risk assessment being performed to assess different attack risks (410) such as if the desired risk threshold or predefined number of risk attacks is not exceeded (Yes at 412), the de-identified dates can be published following being randomized with the generalization (414) and the de-identified dates are produced (416), but if the threshold is exceeded, (No at 412), the dates can be further de-identified (e.g., using the next generalization hierarchy provided by a user) which is understood to constitute identifying less than a predefined value (k) patients having a same number or amount of indirect identifiers), 
the patient data comprising:
events related to a disease or a treatment of a disease (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.); and
time stamps related to the events (While not “time stamp” per se, See El Emam Par [0063] and Figs. 1-3 & 5-9 which disclose dates and intervals between dates associated with an occurrence such as a procedure, visit, date of birth, diagnosis, etc. of a subject and specifically displays the date of said occurrence, it is understood that a “time stamp” can include any date or time granularity unless Applicant further specifies said time stamp specifically including a specific time granularity (e.g., day, hour, minute, second, etc.));
the method comprising the steps of:
retrieving, from a database, the data set of patient data from multiple patients, the patient data comprising events related to a disease or a treatment of a disease, and time stamps related to the events (See El Emam Par [0004] & [0099] & Figs. 1-2 & 19-20 which disclose the use of timestamps and/or data sequences and identification of certain medical events for purposes of comparing identifiable information of a patient;  See El Emam Par [0063] which discloses that dates or timestamps could correspond to events associated with a disease or treatment of a disease such as financial transaction, doctors/clinical visits, etc.), wherein some of the multiple patients in the data set of patient data each comprise two or more indirect identifiers rendering the respective patient identifiable via a concatenation of the two or more indirect identifiers (While this limitation makes use of “concatenation of two or more indirect identifiers”, this is understood to include a string of two or more indirect identifiers that have been combined for purposes of determining identifiable information of each patient, therefore see El Emam Par [0058]-[0059] which discloses multiple quasi-identifiers and connected dates in a dataset of a patient and performing de-identification on said combined quasi-identifiers/dates in the dataset; see El Emam Par [0060] which discloses potentially identifying variables such as age and gender and the combination of age and gender being more potentially identifying of the patient; see El Emam Par [0099]-[0100] which specifically discloses multiple core dates that are concatenated or connected, which may make it possible to identify the patient when combined, versus having a singular date, and which therefore need to be date-shifted or anonymized in the context of the dates being connected.  However El Emam does not seem to specifically disclose the concatenation or direct combination of one or more strings of identifiable information), 
where there are less than a predefined value (k) of patients having a same concatenation of two or more indirect identifiers (See El Emam Par [0063]-[0070] which discloses the use of k-anonymization and general anonymization of medical data for multiple patients, that which utilizes a predefined value (k) of patients having the same indirect identifiers to reduce identification of a patient through said indirect identifier), 
wherein an identifiable patient is an outlying patient (See El Emam Par [0079]-[0083] which discloses using multiple patients’ data for defining an outlier patient and/or identifiability status based on indirect identifying information);
determining, by a processor of the anonymization system (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam), at least one first indirect identifier representing a property of the data distribution of the time stamps (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
determining, by the processor (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam), at least one second indirect identifier representing a number of events regarding a respective patient (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
determining, by the processor (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam), for all patients in the data set, the respective concatenations comprising the first indirect identifier and the second indirect identifier (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
identifying, by the processor (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam), based on the determined concatenations, one or more outlying patients in the data set (Per Applicant’s specification, outlying patient(s) constitute “For some predefined value k the k-anonymity property requires that each release of data must be such that every combination of values of quasi-identifiers can be indistinctly matched to at least k individuals.  So, the anonymity property defines that a concatenation of all indirect identifiers of a patient enables identifying an outlying patient in the data set if there are less than the predefined value k patients having a same concatenation of indirect identifiers” While El Emam does not disclose “outlying patients” or “concatenations” per se, El Emam Par [0033]-[0039] discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the grouping or dataset for each of the plurality of data entries including date events and connected dates in the dataset; Further, El Emam Par [0025] & [0060]-[0062] discloses datasets containing PII and possibly containing identifying information within said PII and further describes the use of k-anonymity which describes that a k-anonymized data set has the property that each record is similar to at least another predefined value, k-1 other patient records on the potentially identifiable variables/patients, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc., therefore, the records that do not meet this k-anonymity property, are considered “outliers” via Applicant’s description in the specification; thus, by El Emam disclosing the identification of each record either being similar to at least another k-1 other records on potentially identifying variables or the determination that the dataset does not satisfy said k-anonymity property by all data records not being similar to at least another k-1 other records, El Emam is understood to therefore determine said “outliers” in order to make the determination that the dataset does not satisfy said k-anonymity property;  See further El Emam Par [0061]-[0063] which discloses the system specifically determining if a node is found to be k-anonymous, and if a node is found not to be k-anonymous (thus identifying “outliers” per Applicant’s definition in Applicant’s specification)), and
removing, by the processor (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam), the patient data of each identified outlying patient from the data set to generate an anonymized data set of patient data (See El Emam Par [0063]-[0070] & Figs. 4 & 13 which describes the process of de-identifying potentially identifying indirect variables, thus constituting generating an anonymized data set of patient data, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc. by removing said indirect variable and editing the variable to instead contain a placeholder or anchor of sorts to show de-identified identifying information); and
providing the anonymized data set of patient data (See El Emam Par [0104] & Fig. 21 which discloses the publishing or providing of the de-identified data by generating a destination table corresponding to the respective source table that has been anonymized).
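
For illustration only, the method steps of Claim 1 as recited above can be sketched in Python.  This is a minimal sketch under stated assumptions and is not taken from Applicant's specification, El Emam, or Stevens: the function name remove_outlying_patients, the use of the span in days between the earliest and latest time stamp as the "property of the data distribution of the time stamps", and the "|" separator in the concatenation are all hypothetical choices made for this example.

```python
from collections import Counter
from datetime import date


def remove_outlying_patients(patients, k):
    """Drop patients whose concatenated indirect identifiers occur fewer than k times.

    patients: dict mapping a patient id to a list of (event, time_stamp) tuples,
              where time_stamp is a datetime.date (assumed input layout).
    Returns (kept_patients, outlier_ids).
    """
    concatenations = {}
    for patient_id, events in patients.items():
        stamps = [stamp for _, stamp in events]
        # First indirect identifier (assumption): span of the time stamps in days.
        span_days = (max(stamps) - min(stamps)).days if stamps else 0
        # Second indirect identifier: number of events for this patient.
        n_events = len(events)
        # Concatenation of the two indirect identifiers.
        concatenations[patient_id] = f"{span_days}|{n_events}"

    # Count how many patients share each concatenation.
    counts = Counter(concatenations.values())

    # A patient is outlying if fewer than k patients share the concatenation.
    outliers = {pid for pid, key in concatenations.items() if counts[key] < k}
    kept = {pid: ev for pid, ev in patients.items() if pid not in outliers}
    return kept, outliers
```

Under these assumptions and with k = 2, two patients who each have two events spanning ten days share a concatenation and are kept, while a patient with a single event shares a concatenation with no one and is removed.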

While it has been shown above that El Emam discloses most limitations and the use of k-anonymity, El Emam does not explicitly disclose “the property defines a concatenation of all indirect identifiers”.  Rather, El Emam discusses the use of k-anonymity, which describes an original data set containing indirectly identifying variables such as age, gender, dates of occurrences, etc., and El Emam Par [0034] describes “performing consolidation of a plurality of indirect identifiers and connected dates in the dataset”.  Therefore, El Emam describes the anonymization of an original data set containing indirectly identifying variables, but does not explicitly state that this consolidation is a “concatenation” of all indirect identifiers as claimed above.

However, Stevens specifically discloses a concatenation of potentially identifying information of the patient (See Stevens Par [0013], [0035], [0063]-[0064] and Fig. 2 which disclose a concatenation module that concatenates or sequences in a predetermined order the parts of PII being used to create an anonymous linking code, the concatenation module further orders the data in block 804 such that the encrypted first name is followed immediately by the encrypted last name, followed by the encrypted insurance policy and so on).  The disclosure of Stevens is directly applicable to the disclosure of El Emam because both disclosures share limitations and capabilities, namely, they are both directed towards the deidentification of healthcare and personal identifying data of patients.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the aspects of El Emam relating to the performance of consolidation of a plurality of date events in the indirect identifiers and connected dates in the dataset to further specifically include the concatenation of the identifying variables, as disclosed by Stevens, to allow the linking-together of portions of PII in a proper sequence for purposes of forming an anonymous linking code and/or transformation into deidentified data (See Stevens Par [0035], [0063]-[0064] and Fig. 2).
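
For illustration only, the combination discussed above (concatenating indirect identifiers in a fixed order, then removing records whose concatenation is shared by fewer than k patients) can be sketched as follows.  This is a minimal sketch and is not taken from El Emam, Stevens, or Applicant’s specification; the function names, the field names (`window_days`, `n_events`), the `|` separator, and the sample records are all hypothetical assumptions.

```python
from collections import Counter


def concatenate_identifiers(record, keys):
    # Join the indirect identifier values in a fixed, predetermined order.
    # The "|" separator is an arbitrary choice made for this sketch.
    return "|".join(str(record[key]) for key in keys)


def remove_outlying_patients(records, keys, k):
    # Count how many patients share each concatenation of indirect identifiers.
    counts = Counter(concatenate_identifiers(r, keys) for r in records)
    # A patient is treated as outlying when fewer than k patients share the
    # same concatenation; only non-outlying patients are kept.
    return [r for r in records if counts[concatenate_identifiers(r, keys)] >= k]


# Hypothetical sample data: two indirect identifiers per patient.
patients = [
    {"id": "p1", "window_days": 30, "n_events": 4},
    {"id": "p2", "window_days": 30, "n_events": 4},
    {"id": "p3", "window_days": 90, "n_events": 12},
]

anonymized = remove_outlying_patients(patients, ["window_days", "n_events"], k=2)
```

With k=2, the hypothetical patient `p3` has a concatenation shared by no other patient and is therefore removed, while `p1` and `p2` remain.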

Claim 2 –
Regarding Claim 2, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining whether the number of events regarding a respective patient is below a number threshold (N) (El Emam Par [0066] discloses determining a number of instances within one or more indirect identifiers defined within the dataset, the number of instances specifically utilized for a risk assessment being performed to assess different attack risks (410) such as if the desired risk threshold or predefined number of risk attacks is not exceeded (Yes at 412), the de-identified dates can be published following being randomized with the generalization (414) and the de-identified dates are produced (416), but if the threshold is exceeded, (No at 412), the dates can be further de-identified (e.g., using the next generalization hierarchy provided by a user), which therefore constitutes determining an outlying event category or “attack risk”), and, if so,
determining, as a third indirect identifier, the set of events regarding the respective patient (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including a set of date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.).

Claim 3 –
Regarding Claim 3, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
the set of events is an order list of events (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including a set of date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; See El Emam Par [0063] which further specifically discloses an “order” of said list of events being maintained).

Claim 4 –
Regarding Claim 4, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
the first indirect identifier represents a length of a time window covering all time stamps from an individual (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063]).

Claim 5 –
Regarding Claim 5, El Emam and Stevens disclose the method of Claim 4 in its entirety.  El Emam further discloses a method, wherein:
determining a number of breaks in the time window as a further indirect identifier, a break being a local minimum in the distribution of the events during the time window (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063];  The time intervals or periods between the anchor dates (which are expressed as a time before and/or after dates and times associated with a patient’s medical visits, i.e. constituting a time window)).

Claim 6 –
Regarding Claim 6, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining periods of a predetermined length in a sequence of events from an individual (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063];  The time intervals or periods between the anchor dates (anchor dates are expressed as a time before and/or after dates and times associated with a patient’s medical visits, i.e. constituting a time window)), and
determining a number of breaks in the periods as the first indirect identifier, a break being a local minimum in the distribution of the events during the periods (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063];  The time intervals or periods between the anchor dates (anchor dates are expressed as a time before and/or after dates and times associated with a patient’s medical visits, i.e. constituting a time window) and therefore the intervals or periods between the anchor dates would be considered “breaks in the distribution of events” under broadest reasonable interpretation).

Claim 7 –
Regarding Claim 7, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining, as the first indirect identifier, intervals of a predetermined length that have no events in respective sequences of events of respective patients (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063];  The time intervals or periods between the anchor dates (anchor dates are expressed as a time before and/or after dates and times associated with a patient’s medical visits, i.e. constituting a time window) and therefore the intervals or periods between the anchor dates would be considered “breaks in the distribution of events or intervals of length that have no events” under broadest reasonable interpretation).

Claim 10 –
Regarding Claim 10, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
using as the second indirect identifier a logarithmic function of the number of events regarding a respective individual (While not “logarithmic function” per se, see El Emam Par [0066] which discloses determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier, and even further defines a “lattice” being generated for a plurality of nodes, each node of the lattice defining an anonymization strategy by equivalence class generalization, the plurality of nodes arranged in rows providing k-anonymity by performing a recursive binary search of the lattice commencing from a left most node in a middle row of the lattice, each of the one or more generalization strategies being defined by nodes lowest in the respective generalization strategy within the lattice, each providing a least amount of equivalence class generalization of one or more quasi-identifiers and the associated record suppression value of the dataset;  this is done, as disclosed in Par [0066] and in U.S. Patent No. 8326849, which is incorporated into El Emam Par [0066] by reference, by computing the entropy information loss, which is a logarithmic function, as disclosed in U.S. Patent No. 8326849).
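
For illustration only, the claimed limitation of using a logarithmic function of the number of events as an indirect identifier can be sketched as follows.  This is a minimal sketch of the claim language itself and does not represent the entropy information loss measure of U.S. Patent No. 8326849; the function name, the choice of base 2, and the rounding down to an integer are hypothetical assumptions.

```python
import math


def log_event_bucket(n_events):
    # A logarithm is undefined for zero or negative counts.
    if n_events < 1:
        raise ValueError("n_events must be at least 1")
    # Assumption for this sketch: base-2 logarithm, rounded down, so that
    # patients with similar event counts share the same identifier value.
    return math.floor(math.log2(n_events))


# Hypothetical event counts for five patients.
buckets = [log_event_bucket(n) for n in (1, 3, 8, 15, 16)]
```

Under these assumptions, event counts of 8 and 15 map to the same value, so the identifier distinguishes orders of magnitude rather than exact counts.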

Claim 11 –
Regarding Claim 11, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining, across the data set, respective numbers of events in respective event categories regarding a respective disease or treatment (El Emam Par [0066] discloses determining a number of instances within one or more indirect identifiers defined within the dataset, the number of instances specifically utilized for a risk assessment being performed to assess different attack risks (410) such as if the desired risk threshold or predefined number of risk attacks is not exceeded (Yes at 412), the de-identified dates can be published following being randomized with the generalization (414) and the de-identified dates are produced (416), but if the threshold is exceeded, (No at 412), the dates can be further de-identified (e.g., using the next generalization hierarchy provided by a user), which therefore constitutes determining an outlying event category or “attack risk”),
determining at least one outlying event category where the respective number of events is less than an event threshold (E) (While not “event threshold” per se, See El Emam Par [0064]-[0066] which specifically discloses a risk assessment being performed to assess different attack risks (410) such as if the desired risk threshold is not exceeded (Yes at 412), the de-identified dates can be published following being randomized with the generalization (414) and the de-identified dates are produced (416), but if the threshold is exceeded, (No at 412), the dates can be further de-identified (e.g., using the next generalization hierarchy provided by a user), which therefore constitutes determining an outlying event category or “attack risk”, as El Emam Par [0066] discloses determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier), and
generalizing the outlying respective event category until the events end-up in an event category where the respective number of events is higher than the threshold (While not “event threshold” per se, See El Emam Par [0064]-[0066] which specifically discloses a risk assessment being performed to assess different attack risks (410) such as if the desired risk threshold or predefined number of risk attacks is not exceeded (Yes at 412), the de-identified dates can be published following being randomized with the generalization (414) and the de-identified dates are produced (416), but if the threshold is exceeded, (No at 412), the dates can be further de-identified (e.g., using the next generalization hierarchy provided by a user), which therefore constitutes determining an outlying event category or “attack risk”, as El Emam Par [0066] discloses determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier; as shown in Figs. 4 and 21 of El Emam, the performance of this determination is recursive or a loop and therefore would end the loop if the respective number of events or attack risks exceeds said risk threshold which reads on generalizing until the number of risk attacks or events is higher than the threshold).
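
For illustration only, the claimed steps of counting events per category, finding categories below an event threshold (E), and generalizing those categories can be sketched as follows.  This is a minimal sketch of the claim language and is not the risk-assessment loop of El Emam Fig. 4; the function name, the `parent` mapping (a hypothetical, acyclic generalization hierarchy from a category to its broader category), and the sample categories are all assumptions.

```python
from collections import Counter


def generalize_categories(events, parent, threshold):
    events = list(events)
    while True:
        # Count events per category across the whole data set.
        counts = Counter(events)
        # Outlying categories have fewer events than the threshold and can
        # still be generalized (they have a broader parent category).
        outlying = {c for c, n in counts.items() if n < threshold and c in parent}
        if not outlying:
            return events
        # Replace each outlying category with its broader parent category.
        events = [parent[c] if c in outlying else c for c in events]


# Hypothetical hierarchy and event categories.
parent = {"rare_chemo_A": "chemotherapy", "rare_chemo_B": "chemotherapy"}
events = ["rare_chemo_A", "rare_chemo_B", "rare_chemo_B",
          "surgery", "surgery", "surgery"]

generalized = generalize_categories(events, parent, threshold=3)
```

In this sketch a category with no broader parent is left unchanged even if it remains below the threshold; how such a residual category is handled is outside what the sketch assumes.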

Claim 12 –
Regarding Claim 12, El Emam and Stevens disclose the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
replacing time stamps representing dates by the time stamps representing intervals between the dates (While not “time stamps” per se, See El Emam Par [0063]-[0065] which discloses time intervals and/or anchors expressed as a time before and/or after dates and times associated with a patient’s medical visits, treatments, etc. as being a candidate for de-identification because any sequence dates represent personal information as disclosed in El Emam Par [0063];  The time intervals or periods between the anchor dates (anchor dates are expressed as a time before and/or after dates and times associated with a patient’s medical visits, i.e. constituting a time window) and therefore the intervals or periods between the anchor dates would be considered “breaks in the distribution of events or intervals of length that have no events” under broadest reasonable interpretation; See El Emam Par [0063]-[0070] & Figs. 4 & 13 which describes the process of de-identifying potentially identifying indirect variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc. by removing said indirect variable and editing the variable to instead contain a placeholder or anchor and interval of sorts to show de-identified identifying information).
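
For illustration only, the claimed replacement of date time stamps by intervals between the dates can be sketched as follows.  This is a minimal sketch of the claim language and is not the anchor-based date shifting of El Emam; the function name, the use of whole days as the interval unit, and the sample dates are hypothetical assumptions.

```python
from datetime import date


def dates_to_intervals(dates):
    # Sort the dates so that each interval is measured between consecutive events.
    ordered = sorted(dates)
    # Replace the absolute dates with the number of days between neighbours.
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical visit dates for one patient.
visits = [date(2020, 1, 1), date(2020, 1, 15), date(2020, 3, 1)]
intervals = dates_to_intervals(visits)
```

The resulting list keeps the spacing of the events but no longer contains any calendar date.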

Claim 14 –
Regarding Claim 14, El Emam discloses a system for anonymization of a data set of patient data from multiple patients for providing a predefined anonymity property, wherein
the property defines that a concatenation of all indirect identifiers of a patient enables identifying an outlying patient in the data set if there are less than a predefined value (k) patients having a same concatenation of indirect identifiers (this limitation is considered to be a whereby clause (See MPEP 2111.04(I)) in a method claim that expresses the intended result of a process step positively recited and does not necessarily limit the structure or specific performance of the method; therefore, any method or system that discloses concatenation/grouping of all indirect identifiers of a patient and comparing to a predefined value (k) should enable or allow the identifying of an outlying patient because there is no accompanying language that limits the structure/method otherwise;  Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables;  for example, if the predefined value k=5, then a k-anonymized data set has at least 5 records (k>4) for each value combination of age and gender, meaning there are at least 5 other records that are similar to the k-anonymized data set which effectively anonymizes that data set, as well as the other k data sets), 
the patient data comprising:
events related to a disease or a treatment of a disease (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.);
time stamps related to the events (While not “time stamp” per se, See El Emam Par [0063] and Figs. 1-3 & 5-9 which disclose dates and intervals between dates associated with an occurrence such as a procedure, visit, date of birth, diagnosis, etc. of a subject and specifically displays the date of said occurrence, it is understood that a “time stamp” can include any date or time granularity unless Applicant further specifies said time stamp specifically including a specific time granularity (e.g., day, hour, minute, second, etc.));
said system comprising:
a database comprising the data set of patient data from multiple patients (See El Emam Par [0004] & [0099] & Figs. 1-2 & 19-20 which disclose the use of timestamps and/or data sequences and identification of certain medical events for purposes of comparing identifiable information of a patient), the patient data comprising events related to a disease or a treatment of a disease and time stamps related to the events (See El Emam Par [0063] which discloses that dates or timestamps could correspond to events associated with a disease or treatment of a disease such as financial transactions, doctors/clinical visits, etc.) wherein some of the multiple patients in the data set of patient data each comprise two or more indirect identifiers rendering the respective patient identifiable via a concatenation of the two or more indirect identifiers (While this limitation makes use of “concatenation of two or more indirect identifiers”, this is understood to include a string of two or more indirect identifiers that have been combined for purposes of determining identifiable information of each patient, therefore see El Emam Par [0058]-[0059] which discloses multiple quasi-identifiers and connected dates in a dataset of a patient and performing de-identification on said combined quasi-identifiers/dates in the dataset; see El Emam Par [0060] which discloses potentially identifying variables such as age and gender and the indirect combination of age and gender being more potentially identifying of the patient; see El Emam Par [0099]-[0100] which specifically discloses multiple core dates that are concatenated or connected dates which, when combined, may identify the patient more readily than a singular date and therefore need to be date-shifted or anonymized in the context of the dates being connected.  However, El Emam does not seem to specifically disclose the concatenation or direct combination of one or more strings of identifiable information), 
where there are less than a predefined value (k) of patients having a same concatenation of two or more indirect identifiers (See El Emam Par [0063]-[0070] which discloses the use of k-anonymization and general anonymization of medical data for multiple patients, that which utilizes a predefined value (k) of patients having the same indirect identifiers to reduce identification of a patient through said indirect identifier), 
wherein an identifiable patient is an outlying patient (See El Emam Par [0079]-[0083] which discloses using multiple patients’ data for defining an outlier patient and/or identifiability status based on indirect identifying information);
a data interface configured to receive the patient data set from the database (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam; See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset), and 
a processor (See El Emam Par [0105] which discloses the use of a computer or server providing at least a processor, memory, and/or I/O interface for executing the process disclosed through El Emam) arranged to:
determine at least one first indirect identifier representing a property of the data distribution of the time stamps (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
determine at least one second indirect identifier representing a number of events regarding a respective patient (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
determine, for all patients in the data set, the respective concatenations comprising the first indirect identifier and the second indirect identifier (See El Emam Par [0033]-[0039] which discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset and as further specified in El Emam Par [0004], [0084] can possibly be related to a patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.; Therefore, see El Emam Par [0060]-[0062] which discloses datasets containing personal information and possibly containing potentially identifying information within said personal information and further describes the use of k-anonymity which describes an original data set containing indirectly identifying variables such as age and gender, for example, and the k-anonymized data set having the property that each record is similar to at least another predefined value, k-1 other records on the potentially identifying variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc.),
identify, based on the determined concatenations, one or more outlying patients in the data set (Per Applicant’s specification, outlying patient(s) constitute “For some predefined value k the k-anonymity property requires that each release of data must be such that every combination of values of quasi-identifiers can be indistinctly matched to at least k individuals.  So, the anonymity property defines that a concatenation of all indirect identifiers of a patient enables identifying an outlying patient in the data set if there are less than the predefined value k patients having a same concatenation of indirect identifiers”;  While El Emam does not disclose “outlying patients” per se, El Emam Par [0033]-[0039] discloses retrieving a dataset possibly containing personal identifying information, such as patient data as set forth in El Emam Abstract, and within said dataset, seeking and determining quasi-identifiers in the dataset for each of the plurality of data entries including date events and connected dates in the dataset, etc.; Further, El Emam Par [0025] & [0060]-[0062] discloses datasets containing PII and possibly containing identifying information within said PII and further describes the use of k-anonymity property which describes that a k-anonymized data set has the property that each record is similar to at least another predefined value, k-1 other patient records on the potentially identifiable variables/patients, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc., therefore, the records that do not meet this k-anonymity property, are considered “outliers” via Applicant’s description in the specification; thus, by El Emam disclosing the identification of each record either being similar to at least another k-1 other records on potentially identifying variables or the determination that each record does not satisfy said k-anonymity property by all data records not being similar to at least another k-1 other records, El Emam is understood to therefore determine said “outliers” in order to make the determination that the dataset does not satisfy said k-anonymity property;  See further El Emam Par [0061]-[0063] which discloses the system specifically determining if a node is found to be k-anonymous, and if a node is found not to be k-anonymous (thus identifying “outliers” per Applicant’s definition in Applicant’s specification)), and
remove the patient data of each identified outlying patient from the data set to generate an anonymized data set of patient data (See El Emam Par [0063]-[0070] & Figs. 4 & 13 which describes the process of de-identifying potentially identifying indirect variables, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc. by removing said indirect variable and editing the variable to instead contain a placeholder or anchor of sorts to show de-identified identifying information); and
provide the anonymized data set of patient data (See El Emam Par [0104] & Fig. 21 which discloses the publishing or providing of the de-identified data by generating a destination table corresponding to the respective source table that has been anonymized).

While it has been shown above that El Emam discloses most limitations, El Emam does not specifically disclose “the property defines a concatenation of all indirect identifiers”.  Rather, El Emam discusses the use of k-anonymity, which describes an original data set containing indirectly identifying variables such as age, gender, dates of occurrences, etc., and El Emam Par [0034] describes “performing consolidation of a plurality of indirect identifiers and connected dates in the dataset”.  Therefore, El Emam describes the anonymization of an original data set containing a group, link, or other combination of indirectly identifying variables, but does not explicitly state that this consolidation is a “concatenation” of all indirect identifiers as claimed above.

However, Stevens specifically discloses a concatenation of potentially identifying information of the patient (See Stevens Par [0013], [0035], [0063]-[0064] and Fig. 2 which disclose a concatenation module that concatenates or sequences in a predetermined order the parts of PII being used to create an anonymous linking code, the concatenation module further orders the data in block 804 such that the encrypted first name is followed immediately by the encrypted last name, followed by the encrypted insurance policy and so on).  The disclosure of Stevens is directly applicable to the disclosure of El Emam because both disclosures share limitations and capabilities, namely, they are both directed towards the deidentification of healthcare and personal identifying data of patients.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the aspects of El Emam relating to the performance of consolidation of a plurality of date events in the indirect identifiers and connected dates in the dataset to further specifically include the concatenation of the identifying variables, as disclosed by Stevens, to allow the linking-together of portions of PII in a proper sequence for purposes of forming an anonymous linking code and/or transformation into deidentified data (See Stevens Par [0035], [0063]-[0064] and Fig. 2).
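For illustration only, and not as a characterization of the actual implementation of El Emam, Stevens, or the instant application, the recited combination of concatenating indirect identifiers and removing patients who share a concatenation with fewer than k patients can be sketched as follows.  The field names (age_band, gender, region), the "|" separator, and the function names are hypothetical assumptions made for this sketch.

```python
from collections import Counter

def concatenation_key(patient, indirect_identifiers):
    # Join the values of the indirect identifiers in a fixed, predetermined order.
    # The "|" separator is an assumption made for this sketch.
    return "|".join(str(patient[name]) for name in indirect_identifiers)

def remove_outlying_patients(patients, indirect_identifiers, k):
    # A patient is treated as outlying when fewer than k patients
    # share the same concatenation of indirect identifiers.
    keys = [concatenation_key(p, indirect_identifiers) for p in patients]
    counts = Counter(keys)
    return [p for p, key in zip(patients, keys) if counts[key] >= k]

# Hypothetical example records (not taken from any cited reference).
patients = [
    {"id": 1, "age_band": "40-49", "gender": "F", "region": "North"},
    {"id": 2, "age_band": "40-49", "gender": "F", "region": "North"},
    {"id": 3, "age_band": "80-89", "gender": "M", "region": "South"},
]

anonymized = remove_outlying_patients(patients, ["age_band", "gender", "region"], k=2)
print([p["id"] for p in anonymized])  # [1, 2]
```

In this sketch, patient 3 is the only record with its concatenation and is therefore removed when k is 2, which corresponds to a record that fails the k-anonymity property.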

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over El Emam et al. in view of Stevens, further in view of Kakarla et al. (“Normalizing Data – Part 1 of AI series” - NPL).

Claim 8 –
Regarding Claim 8, El Emam discloses the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining of a number of categories (nr_categories) for values (x) of a respective indirect identifier that attacker may differentiate (While not “category” per se, under broadest reasonable interpretation, any determination of a number of classes, categories, etc. of a respective identifier is understood to read on this claim;  See El Emam Par [0066] which discloses determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier, and even further defines a “lattice” being generated for a plurality of nodes, each node of the lattice defining an anonymization strategy by equivalence class generalization, the plurality of nodes arranged in rows providing k-anonymity by performing a recursive binary search of the lattice commencing from a left most node in a middle row of the lattice, each of the one or more generalization strategies being defined by nodes lowest in the respective generalization strategy within the lattice, each providing a least amount of equivalence class generalization of one or more quasi-identifiers and the associated record suppression value of the dataset),
El Emam does not explicitly further disclose a method wherein:
normalizing a respective determined category to a normalized value (c) between a minimum value (value_min) and a maximum value (value_max):
c=round(((x−value_min)/(value_max−value_min))*nr_categories).
It should be noted that Examiner interprets the normalization of the respective category to be for purposes of anonymization of the data given the interpretation of the claim provided above in the 35 U.S.C. 112(b) section of this Office Action.  El Emam Par [0066] does disclose determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier, and even further defines a “lattice” being generated for a plurality of nodes, each node of the lattice defining an anonymization strategy by equivalence class generalization, the plurality of nodes arranged in rows providing k-anonymity.  El Emam Par [0066]-[0068] further describes generalizing strategies for the plurality of nodes, such as for conversion of values to a generalized sequence, number, or date for purposes of anonymization or providing k-anonymity.  However, El Emam is not explicit on the generalizing strategies for the plurality of nodes being normalized using min-max value normalization as is understood to be set forth in the above limitations.

Kakarla discloses the following limitations:
normalizing a respective determined category to a normalized value (c) between a minimum value (value_min) and a maximum value (value_max):
c=round(((x−value_min)/(value_max−value_min))*nr_categories) (See Kakarla “Why do we normalize?” and “Normalization Methods – Min Max Normalization” which discloses the previous formula aside from multiplying by the number of categories, however, it is understood that the basis of this equation is simply the min-max normalization as described by Kakarla, and as described in Kakarla, provides a conversion of the data to fall between the range of 0 to 1, and the remaining portion of the formula, such as multiplying by the number of categories, is simply optimization through routine experimentation (See MPEP 2144.05), because the Min-Max normalization could be multiplied by any arbitrary value and multiplying by the number of categories is not considered to be critical to the performance of the method, such as for purposes of future determinations within the system).
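For illustration only, the formula recited above can be sketched as follows.  This is a minimal sketch under stated assumptions and is not taken from Kakarla or from Applicant's specification; the function name minmax_category and the input validation are hypothetical.

```python
def minmax_category(x, value_min, value_max, nr_categories):
    # Min-max normalization maps x into the range 0..1.
    if value_max <= value_min:
        raise ValueError("value_max must be greater than value_min")
    scaled = (x - value_min) / (value_max - value_min)
    # Scaling by nr_categories and rounding yields an integer category c.
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return round(scaled * nr_categories)

# Hypothetical example: ages from 0 to 100 reduced to categories 0..5.
print(minmax_category(37, 0, 100, 5))  # 2
```

As written, the formula produces integer values from 0 through nr_categories inclusive, since the minimum value maps to 0 and the maximum value maps to nr_categories.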

The disclosures of El Emam/Stevens and Kakarla share limitations and capabilities, namely, they are all directed towards the treatment, normalization, and manipulation of data in varying forms.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the generalizing strategies for the plurality of nodes disclosed in El Emam/Stevens to specifically be normalized using min-max value normalization, as disclosed in Kakarla, because normalization brings any dataset into a comparable range and makes the dataset uniform (See Kakarla “Why do we Normalize?”), and, as expressed in El Emam, conversion of values to a more generalized/normalized sequence, number, or date serves the purposes of anonymization or providing k-anonymity.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over El Emam in view of Stevens, further in view of Feng et al. (“Log-Transformation and Its Implications for Data Analysis” – NPL).

Claim 9 –
Regarding Claim 9, El Emam discloses the method of Claim 1 in its entirety.  El Emam further discloses a method, wherein:
determining of a number of categories (nr_categories) for values (x) up to a maximum value (value_max) of a respective indirect identifier that attacker may differentiate (While not “category” per se, under broadest reasonable interpretation, any determination of a number of classes, categories, etc. of a respective identifier is understood to read on this claim;  See El Emam Par [0066]-[0068] which discloses determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier, and even further defines a “lattice” being generated for a plurality of nodes, each node of the lattice defining an anonymization strategy by equivalence class generalization, the plurality of nodes arranged in rows providing k-anonymity by performing a recursive binary search of the lattice commencing from a left most node in a middle row of the lattice, each of the one or more generalization strategies being defined by nodes lowest in the respective generalization strategy within the lattice, each providing a least amount of equivalence class generalization of one or more quasi-identifiers and the associated record suppression value of the dataset),
El Emam does not further disclose a method wherein:
normalizing a respective determined category to a log-normalized value (c):
c = round(log L(x)), wherein L is extracted from round (log L(value_max)) = nr_categories .

It should be noted that Examiner interprets the normalization of the respective category to be for purposes of anonymization of the data given the interpretation of the claim provided above in the 35 U.S.C. 112(b) section of this Office Action.  El Emam Par [0066] does disclose determining a number of equivalence classes for one or more indirect identifiers defined within the dataset, the equivalence class based upon ranges of values associated with each indirect identifier, and even further defines a “lattice” being generated for a plurality of nodes, each node of the lattice defining an anonymization strategy by equivalence class generalization, the plurality of nodes arranged in rows providing k-anonymity.  El Emam Par [0066]-[0068] further describes generalizing strategies for the plurality of nodes, such as for conversion of values to a generalized sequence, number, or date for purposes of anonymization or providing k-anonymity.  However, El Emam is not explicit on the generalizing strategies for the plurality of nodes being log-normalized as is understood to be set forth in the above limitations.

Feng discloses the following limitations:
normalizing a respective determined category to a log-normalized value (c):
c = round(log L(x)), wherein L is extracted from round (log L(value_max)) = nr_categories (See Feng “Log-Normal Transformation – Using the log transformation to make data conform to normality” & “Log-Normal Transformation – Using the log transformation to reduce variability of data” which disclose the application of log-normalization to data in order to simulate data that is uniformly distributed between integers 0 and 1).
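For illustration only, the log-normalization recited above can be sketched as follows.  This is a minimal sketch and is not taken from Feng or from Applicant's specification.  The recited condition round(log L(value_max)) = nr_categories does not determine the base L uniquely, so the sketch assumes the exact solution L = value_max^(1/nr_categories); the function names and input validation are hypothetical.

```python
import math

def log_base_for(value_max, nr_categories):
    # Assumption for this sketch: choose L so that log_L(value_max) equals
    # nr_categories exactly, i.e. L = value_max ** (1 / nr_categories).
    return value_max ** (1.0 / nr_categories)

def log_category(x, value_max, nr_categories):
    if x <= 0 or value_max <= 1 or nr_categories <= 0:
        raise ValueError("requires x > 0, value_max > 1, nr_categories > 0")
    base = log_base_for(value_max, nr_categories)
    # log_L(x) computed by change of base, then rounded to an integer category c.
    return round(math.log(x) / math.log(base))

# Hypothetical example: values up to 1000 reduced to categories 0..3 (L is about 10).
print(log_category(50, 1000, 3))  # 2
```

Under this assumption, the value 1 maps to category 0 and value_max maps to category nr_categories, with larger values grouped more coarsely than smaller values.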

The disclosures of El Emam/Stevens and Feng share limitations and capabilities, namely, they are all directed towards the treatment, normalization, and manipulation of data, and more specifically biologically relevant data, in varying forms.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the generalizing strategies for the plurality of nodes disclosed in El Emam/Stevens to specifically be normalized using log-normalization, as disclosed in Feng, to simulate data that is uniformly distributed between integers 0 and 1 (See Feng “Using the log transformation to make data conform to normality” & “Log-Normal Transformation – Using the log transformation to reduce variability of data”), and, as expressed in El Emam, conversion of values to a more generalized/normalized sequence, number, or date serves the purposes of anonymization or providing k-anonymity.

Response to Arguments
Applicant's arguments filed 16 May 2022 have been fully considered but they are not persuasive:
Regarding 35 U.S.C. 112(b) rejections of Claim 15, Applicant argues on pp. 7 of Arguments/Remarks that Claim 15 has been cancelled and therefore 35 U.S.C. 112 rejections should be withdrawn.  Examiner agrees with Applicant’s arguments.  Therefore, the 35 U.S.C. 112(b) rejections are withdrawn. 
Regarding 35 U.S.C. 101 rejections of Claims 1-12 & 14, Applicant argues on pp. 7-8 of Arguments/Remarks that the newly amended Claims are not directed to/do not recite an abstract idea.  More specifically, Applicant argues that the Claims are directed to a method of generating an anonymized data set of patient data and therefore the Claims do not represent an abstract idea.  Examiner respectfully disagrees with Applicant’s arguments.  In the instant set of Claims, the steps of determining, modifying, and removing various types of PII based on certain parameters/factors of received patient data, as drafted, under the broadest reasonable interpretation, include performance of the limitation in the mind but for recitation of generic computer components.  While additional elements are recited that are further considered in the Alice/Mayo framework presented in the 35 U.S.C. 101 rejection of this Office Action, the broadest reasonable interpretation of the anonymization efforts presented in the Claims amounts to receiving patient PII, analyzing/determining that there is a certain amount of identifiability of the patient given the PII, modifying or removing various portions of the PII so that the amount of identifiability is reduced, and outputting or displaying the result.  That is, other than reciting steps as performed by the generic computer components, nothing in the claim element precludes the step from practically being performed in the mind.  Furthermore, MPEP 2106.04(a)(2)(III)(C) states that a Claim that requires a computer or use of generic computer components can still recite a Mental Process.  For instance, the steps of receiving patient data, analyzing said patient data for any personally identifying information, modifying said information to reduce the identifiability of the patient, and outputting the modified record are all limitations that can be reasonably performed in the mind of a human, albeit with the aid of generic computerized components.  
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  As such, Claims 1-12 & 14 remain rejected under 35 U.S.C. 101.
Regarding 35 U.S.C. 101 rejections of Claims 1-12 & 14, Applicant argues on pp. 9-10 of Arguments/Remarks that the newly amended Claims are not directed to an abstract idea because the transformation of data about patients in a dataset comprising personally identifying information to an anonymized data set about patients represents an improvement in a computer or technological field.  More specifically, Applicant argues that because the medical system now comprises an anonymized data set, the data set can be used for research and other functionality and thus represents an improvement in the functioning of the medical system.  Examiner respectfully disagrees with Applicant’s Arguments.  Examiner notes that anonymizing data sets about patients could represent an improvement to the computer or technological field because the data set can now be more aligned with HIPAA requirements, i.e. that the data needs to be anonymized because the medical data could contain sensitive PII and would thus be a violation of the Health Insurance Portability and Accountability Act of 1996 (HIPAA) if the data was not anonymized prior to other sources utilizing the data for research, etc., but this is not made apparent by Applicant or Applicant’s specification.  Even further, it remains unclear how current systems that already operate within the confines of HIPAA, which has been enacted since 1996 and requires medical establishments to anonymize sensitive patient/PII data before research/analysis takes place, would be unable to anonymize said data.  This is further seen by the recitation of anonymization techniques found in cited prior art references.  Therefore, it is understood by Examiner that prior art systems that make use of PII or sensitive patient data already perform anonymization of said data, as seen in said cited prior art references, and thus anonymization alone does not represent an improvement to the technological field as Applicant argues.  
While Applicant’s specific form of anonymization may represent an improvement to anonymization systems, this is not argued by Applicant and not specifically apparent through Applicant’s specification.  Rather, Applicant argues that anonymization, in general, represents an improvement to the field of medical systems (See pp. 16 of Arguments/Remarks).  However, as explained above, prior art systems that make use of PII or sensitive patient data already perform anonymization of said data, as seen in cited prior art references, and thus anonymization alone does not represent an improvement to the technological field as Applicant argues without a clear nexus between a specific problem that is found in the field of medical/anonymization systems and a solution to said problem as described in the instant set of Claims.  While it is noted in Applicant’s arguments that old and conventional elements should still be given weight in determining whether the Claim in question recites a practical application of the abstract idea, Examiner maintains that the Claims do not amount to a practical application of the abstract idea because there is no clear technical improvement, such as an improvement to anonymization techniques that are already found in prior art systems, and the Claims recite limitations that amount to simply applying the abstract idea via mere instructions to implement the abstract idea on a computer.  As such, Claims 1-12 & 14 remain rejected under 35 U.S.C. 101.
Regarding 35 U.S.C. 101 rejections of Claims 1-12 & 14, Applicant argues on pp. 9 of Arguments/Remarks that the newly amended Claims apply and use the alleged abstract idea in a particular technological environment, and thus amount to a practical application of the abstract idea by limiting the use of the alleged abstract idea to the very specific identification and removal of identifying information.  Examiner respectfully disagrees with Applicant’s arguments.  The abstract idea may be confined to the identification and removal of identifying information/PII from data, as Applicant argues, but identification and removal of identifying information/PII from data does not specifically apply to one field of practice/technological environment.  That is, the anonymization and/or de-identification of potentially identifying information could be applied to several fields of practice.  While there are embodiments found in the Claims that somewhat limit these efforts to the field of medical and clinical documentation, i.e. regarding treatment of a disease or clinical events, this is not seen as sufficiently limiting the use of the alleged abstract idea considering how pervasive the field of medical and clinical documentation is.  That is, any generic medical/clinical documentation computer could perform or execute the steps found in the Claims, which leads to the interpretation of a drafting effort that is designed to monopolize generation of anonymized data, but simply applied to the vast field of medical and clinical documentation.  Therefore, the Claims do not sufficiently limit the use of the alleged abstract idea and do not amount to a practical application of the abstract idea.  As such, Claims 1-12 & 14 remain rejected under 35 U.S.C. 101.
Regarding 35 U.S.C. 101 rejections of Claims 1-15, Applicant argues on pp. 10-11 of Arguments/Remarks that the additional elements constitute an inventive concept and therefore amount to significantly more than any recited judicial exception or abstract idea.  More specifically, Applicant argues that the Claims recite a very specific and concrete physical system rather than an abstract mental process, or generically recited components.  Examiner respectfully disagrees with Applicant’s arguments.  While the additional elements found in the Claims may represent concrete, physical computerized components, said components are used for generic purposes as already found in prior art systems or described in Applicant’s specification (pp. 12, ll. 24 - pp. 13, ll. 12), (p. 5, ll. 5-10), (pp. 14, ll. 3-14), (pp. 14, ll. 17-23).  That is, prior art systems generally make use of a processor, a database, non-transitory storage mediums, network arrangements, program products, computers, data interfaces, etc.  Applicant is not using the aforementioned computer components in a special, unique, or novel manner, rather the components are simply applied for purposes of computer computation and anonymization of PII/data.  As explained above, prior art systems already operate within the confines of HIPAA, which has been enacted since 1996 and therefore can anonymize sensitive patient/PII data before research/analysis takes place.  Therefore, it is understood by Examiner that prior art systems that make use of PII or sensitive patient data already can perform anonymization of said data, as seen in said cited prior art references, and thus anonymization efforts represented well-understood, routine, and conventional activities in the prior art.  Furthermore, the generic computer components that make use of said anonymization efforts are not specific to this invention.  
That is, any generic medical/clinical documentation computer could perform or execute the steps found in the Claims and therefore leads to the interpretation of a drafting effort that is designed to monopolize well-understood, routine, and conventional aspects of the generation of anonymized data.  Therefore, the Claims do not recite a specific, ordered combination that constitute an inventive concept and do not amount to significantly more than the recited judicial exception or abstract idea.  As such, Claims 1-12 & 14 remain rejected under 35 U.S.C. 101.
Regarding 35 U.S.C. 103 rejections of Claims 1-7, 10-12 & 14, Applicant argues on pp. 11-13 of Arguments/Remarks that previously cited portions of El Emam and Stevens do not teach the newly amended limitations of the independent Claims 1 & 14.  That is, previously cited portions of El Emam and Stevens do not disclose a database comprising the data set of patient data from multiple patients, the patient data comprising events related to a disease or a treatment of a disease, wherein some of the multiple patients in the data set of patient data each comprise two or more indirect identifiers rendering the respective patient identifiable via a concatenation of the two or more indirect identifiers where there are less than a predefined value (k) of patients having a same concatenation of two or more indirect identifiers wherein an identifiable patient is an outlying patient and providing the anonymized data set of patient data.  Examiner agrees with Applicant’s arguments.  Specifically, Examiner concedes that previously cited portions of El Emam and Stevens do not suggest or render obvious the patient data comprising events related to a disease or treatment of disease with associated timestamps for each event, and the patient data comprising two or more indirect identifiers rendering the respective patient identifiable via a concatenation of the two or more indirect identifiers and providing the anonymized data set of patient data.  Therefore, the 35 U.S.C. 103 rejections of Claims 1-7 & 10-15 have been withdrawn.  However, upon further consideration, a new ground of rejection is made over El Emam and Stevens to specifically meet the newly amended limitations.  For instance, El Emam Par [0004] & [0099] & Figs. 1-2 & 19-20 disclose the use of timestamps and/or data sequences and identification of certain medical events for purposes of comparing identifiable information of a patient; El Emam Par [0063] discloses that dates or timestamps could correspond to events associated with a disease or treatment of a disease such as financial transaction, doctors/clinical visits, etc.; El Emam Par [0060] discloses potentially identifying variables such as age and gender and the indirect combination of age and gender being more potentially identifying of the patient; El Emam Par [0099]-[0100] specifically discloses multiple core dates that are concatenated or connected dates which, when combined, may identify the patient where a singular date would not, and which therefore need to be date-shifted or anonymized in the context of the dates being connected; El Emam Par [0104] & Fig. 21 discloses the publishing or providing of the de-identified data by generating a destination table corresponding to the respective source table that has been anonymized.  However, El Emam does not seem to specifically disclose the concatenation or direct combination of one or more strings of identifiable information.  Therefore, these embodiments that are disclosed by El Emam are then modified by Stevens Par [0013], [0035], [0063]-[0064] and Fig. 2 which disclose a concatenation module that concatenates or sequences in a predetermined order the parts of PII being used to create an anonymous linking code, the concatenation module further ordering the data such that the encrypted first name is followed immediately by the encrypted last name, followed by the encrypted insurance policy and so on, constituting concatenations of a plurality of indirect identifiers.  
By combining the two references, El Emam and Stevens effectively read on the limitations regarding concatenations of indirect identifying information and the manipulation of said concatenations for purposes of de-identification/anonymization as required by the newly amended independent Claims.  Therefore, the newly cited/applied portions of El Emam and Stevens still read on independent Claims 1 & 14.  As such, Claims 1 & 14 and Claims 2-7 & 10-12, which are dependent from independent Claim 1 remain rejected under 35 U.S.C. 103.
Regarding 35 U.S.C. 103 rejections of Claims 1-7 & 10-15, Applicant argues on pp. 14-15 of Arguments/Remarks that El Emam and Stevens do not disclose the individuals being identified as “outliers”.  Examiner respectfully disagrees with Applicant’s arguments.  In the previous Office Action, portions of El Emam were understood to read on the identification of “outliers” given the broadest reasonable interpretation of the term in light of Applicant’s specification.  That is, while El Emam does not disclose “outliers” per se, El Emam Par [0025] & [0060]-[0062] still discloses datasets containing PII and possibly containing identifying information within said PII and further describes the use of k-anonymity property which describes that a k-anonymized data set has the property that each record is similar to at least another predefined value, k-1 other patient records on the potentially identifiable variables/patients, including patient’s medical procedures, visits, longitudinal sequence such as date of birth, diagnosis of a particular disease, etc., therefore, the records that do not meet this k-anonymity property, are considered “outliers” via Applicant’s description in the specification; thus, by El Emam disclosing the identification of each record either being similar to at least another k-1 other records on potentially identifying variables or the determination that each record does not satisfy said k-anonymity property by all data records not being similar to at least another k-1 other records, El Emam is understood to therefore determine said “outliers” in order to make the determination that the dataset does not satisfy said k-anonymity property;  See further El Emam Par [0061]-[0063] which discloses the system specifically determining if a node is found to be k-anonymous, and if a node is found not to be k-anonymous (thus identifying “outliers” per Applicant’s definition in Applicant’s specification).  As such, Claims 1-7 & 10-15 remain rejected under 35 U.S.C. 103.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Eisenberger et al. (U.S. Patent Publication No. 2006/0168043) discloses a system for selectively removing, de-identifying, and/or anonymizing patient identifying information from various patient documents, messages, etc.;
Moore et al. (U.S. Patent Publication No. 2007/0106754) discloses a system for checking certain de-identification properties of patient data and further allows for the syndication of said data via de-identifying and/or anonymizing said data to ensure patient confidentiality without interfering with institution-based error review procedures;
Sweeney et al. (U.S. Patent Publication No. 2002/0169793) discloses a system for deidentifying potentially identifying information in a data record or data source such that the data record satisfies k-anonymity requirements;
Akinmeji et al. (U.S. Patent Publication No. 2018/0096102) discloses a system for redacting sensitive patient data and aggregating anonymization scores and responsive to a score equaling or exceeding a threshold, redacting data corresponding to data types whose scores equal or exceed the threshold;
Rajagopal et al. (U.S. Patent Publication No. 2018/0082020) discloses securing medical records found within a medical record database comprising defining certain identifying data of the patient into data fields, masking said fields through obfuscation or anonymization techniques and displaying the updated data fields;
Antonatos et al. (U.S. Patent Publication No. 2018/0025179) discloses a system for identifying data within a data stream and applying k-anonymization/l-diversity to the identifying data to further anonymize files/data;
Takahashi et al. (U.S. Patent Publication No. 2018/0012039) discloses an anonymization system for anonymizing personally identifying data via extensively utilizing k-anonymity index values as a criterion for ensuring anonymization conformity;
Scaiano et al. (U.S. Patent Publication No. 2017/0177907) discloses a system for reducing re-identification risk of a data set such as patient medical records that contain quasi-identifiers and/or direct identifiers and calculating probabilities of re-identification given said identifiers.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNTER J RASNIC whose telephone number is 571-270-5801.  The examiner can normally be reached on M-F 7am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool.  To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VICTORIA P. AUGUSTINE can be reached on 313-446-4858.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/H.R./Examiner, Art Unit 3626    07/15/22
/JONATHAN DURANT/Primary Examiner, Art Unit 3619