Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections

Claim 14 is objected to because of the following informality:
Claim 14 recites, in line 4, “identify attribute included in the stream of data;”. Examiner suggests that “attribute” be replaced with “an attribute”.

Appropriate correction is required.



Claim Rejections - 35 USC § 112
        The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a)  IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same,  and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 2, 9, and 15 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 2, 9, and 15 recite “the attribute including an impression of a portion of data”. The specification does not disclose anything about “an impression of a portion of data”.
Dependent claims 10-13 are also rejected for inheriting the deficiencies of the base claim.
          The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

         
Claims 2, 9, and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.
Claims 2, 9, and 15 recite “the attribute including an impression of a portion of data”. Because the specification does not disclose anything about “an impression of a portion of data”, it is unclear to the examiner how to interpret this limitation. For purposes of examination in view of the art below, the phrase “an impression of a portion of data” is interpreted as “masked data”.



Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1 and 14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Burdick (US 2004/0107203).


Regarding claim 1, Burdick discloses: A computer-implemented method comprising: ingesting a stream of data that corresponds to a client; (Burdick, [0048], The collection of records input (corresponding to “ingesting”) to the input component 201 may come from one or more sources. Input sources may be static (i.e., data marts, data warehouses, databases, flat files, etc.) or dynamic (data streams, output of an extraction-transform-load operation, etc.); [0038] Each record contains information about a real world entity (corresponding to a “client”).)
identifying an attribute included in the stream of data; (Burdick, [0025] identifies groups of records that have "similar" values in different records for the same field; [0038], line 2- Each record can be divided into fields, each field describing an attribute of the entity; [0027] As viewed in FIG. 6, parsing may intelligently break a text string into the correct data fields (Note: Examiner interprets a “field” as corresponding to an “attribute”).)
processing the attribute in a data profiling process (Burdick, Fig. 5, item 402 “Processing Layer”), the data profiling process including: retrieving a set of validation rules (Burdick, Fig. 5, item 502b, “Correct/Valid. Rules”) and a set of standardization rules (Burdick, Fig. 5, item 503b “Stand./Norm. Rules”) that correspond to the attribute; (Burdick, Fig. 5, item 402 “Processing Layer”, item 403 “Rules Layer”, item 501b “Parsing Rules”, item 502b “Correct/Valid. Rules” and item 503b “Stand./Norm. Rules”; [0056] A processing layer 402 of the automated learning component 203 performs the cleansing process on the record collection by implementing predefined algorithms. Each step of the cleansing process is controlled by a set of rules defined in a rules layer 403 (i.e., defining the proper execution for each step); [0038], line 2- Each record can be divided into fields, each field describing an attribute of the entity; [0027] As viewed in FIG. 6, parsing may intelligently break a text string into the correct data fields. Typically, the data is not found in an easily readable format and a significant amount of decoding needs to be done to determine which piece of text corresponds to what particular data field.)
comparing the attribute with the set of validation rules to validate information included in the attribute;  (Burdick, [0063] Once the string is parsed into the appropriate fields, a correction/validation module 502 determines whether the field values are in the proper range and/or the field values are valid) 
responsive to determining that the information included in the attribute is validated according to the set of validation rules, (Burdick, Fig. 5, item 502b “Correct/Valid. Rules” and item 503b “Stand./Norm. Rules”; [0068] Each layer 402, 403, 404 in the automated learning component 203 has a separate section to support each of the steps, as illustrated in FIG. 5. Generally, the processing layer 402 will execute the steps in the order presented, with the output of the previous step becoming input to the subsequent step; Fig. 1 “Standardization/ Validation/ Correction can handle?”; [0029] Once the string is parsed into the appropriate fields, the validation step, as viewed in FIG. 7, may check the field values for proper range and/or validity. Thus, a "truth" criteria must be provided as input to this step for each field. [0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7).)
modifying the attribute into a standardized format according to the set of standardization rules; (Burdick, Fig. 5, item 503b; [0063], line 6- The correction/validation module 502 further updates the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7); [0064] a standardization rules section 503b of the rules layer 403; [0064] A standardization module 503 arranges the data in a consistent manner and/or a preferred format in order for the data to be compared against data from other sources.)
processing the modified attribute through a set of rules engines; (Burdick, Fig. 5, item 504b “Clustering Rules”, item 505b “Matching Rules”, item 506b “Merging Rules” (Note: Examiner interprets “Clustering Rules”, “Matching Rules” and “Merging Rules” as corresponding to “a set of rules engines”); [0068] Each layer 402, 403, 404 in the automated learning component 203 has a separate section to support each of the steps, as illustrated in FIG. 5. Generally, the processing layer 402 will execute the steps in the order presented, with the output of the previous step becoming input to the subsequent step; [0031] As viewed in FIG. 8, the standardization step may arrange the data in a consistent manner and/or a preferred format in order for it to be compared against data from other sources. The preferred format for the data should be provided as input to this step. [0053] An output evaluation module 305 of the pre-processing component 202 evaluates the output of the three other functional modules 302, 303, 304. If the output is determined to be satisfactory, the output is passed to the automated learning component 203 via an output module 306; [0054], line 12- If the output needs to be modified as determined by the output evaluation module 305, the single-source module 302, the information generating module 303, and the planning module 304 may be run again.)
and outputting the processed attribute to a network-accessible server system. (Burdick, [0047] An output destinations component 205 outputs the results of the data cleansing process to one or more different destinations (i.e., a variety of data mining applications). The results include the cleansed record collection and information about how these results were obtained by the system 200.)
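
For illustration only, the order of operations recited in claim 1 (validate an attribute against its rules and, only when validation succeeds, standardize it) can be sketched as below. This is a minimal sketch under stated assumptions: the rule tables, attribute names, and function names are hypothetical and are taken neither from Burdick nor from the application, and the rules-engine and server-output steps are omitted.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical rule tables keyed by attribute name (illustrative assumptions only).
VALIDATION_RULES: Dict[str, List[Callable[[str], bool]]] = {
    "zip": [lambda v: re.fullmatch(r"\d{5}", v.strip()) is not None],
    "city": [lambda v: v.strip() != ""],
}
STANDARDIZATION_RULES: Dict[str, List[Callable[[str], str]]] = {
    "zip": [str.strip],
    "city": [str.strip, str.title],
}


def profile_attribute(name: str, value: str) -> Optional[str]:
    """Validate the attribute; standardize it only if every validation rule passes."""
    if not all(rule(value) for rule in VALIDATION_RULES.get(name, [])):
        return None  # validation failed, so no standardization is attempted
    for rule in STANDARDIZATION_RULES.get(name, []):
        value = rule(value)
    return value


def process_stream(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Ingest records and keep only the attributes that were validated and standardized."""
    output = []
    for record in records:
        cleaned = {}
        for name, value in record.items():
            result = profile_attribute(name, value)
            if result is not None:
                cleaned[name] = result
        output.append(cleaned)
    return output
```

In this sketch an attribute that fails validation is simply dropped; neither the claim nor the cited passages dictate that choice.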

    Claim 14 corresponds to claim 1, and is rejected accordingly.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 8, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Wang (US 2016/0283735).
      Regarding claim 2, Burdick discloses all of the features with respect to claim 1 as outlined above. Burdick does not clearly disclose: wherein the attribute comprises an impression of a portion of data included in the stream of data that prevents transmission of information included in the stream of data from a client node maintaining the stream of data.
   However, Wang discloses:
wherein the attribute comprises an impression of a portion of data included in the stream of data that prevents transmission of information included in the stream of data from a client node maintaining the stream of data. (Wang, [0012] In one aspect there is provided a method for generating a classification model using original data that is sensitive or private to a data owner. The method comprises: receiving, from one or more first entities, a masked data set, each data set from an entity having masked data corresponding to the original sensitive data, the masked data set further including a masked feature label set for use in classifying the masked data contents; [0067], the methods herein may be run on a computer, or any equipment that is designed for data acquisition, transferring, sharing and storage. Such equipment can integrate the method herein to prevent privacy leakage while maintaining the usability of the data.)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Burdick with the teaching of Wang in order to share sensitive data in a privacy-preserving and model-preserving manner while guaranteeing that the shared data is safe, i.e., that the shared data cannot be used to recover the original (sensitive) data. (Wang, [0008]-[0009])
    Claim 15 corresponds to claim 2, and is rejected accordingly.

Regarding claim 8, Burdick discloses all of the features with respect to claim 1 as outlined above. Burdick does not clearly disclose: retrieving client-specific configuration information that includes a listing of labels, wherein each label in the listing of labels provides a client-specific indication of a type of information included in the stream of data; and identifying a first label included in the listing of labels that is indicative of information included in the attribute, wherein the set of validation rules and the set of standardization rules correspond to the first label.
   However, Wang discloses:
retrieving client-specific configuration information that includes a listing of labels, wherein each label in the listing of labels provides a client-specific indication of a type of information included in the stream of data;  
(Wang, [0013], line 3- accessing, from a computing device associated with a first entity, one or more records having original data sensitive to a data owner; generating an original data matrix of original data content including sensitive features and a corresponding feature label set for use in classifying the feature data; generating a random feature matrix sharing the same subspace as the sensitive features of original data matrix; [0035] the generating of a matrix data feature set C and labels set d provides a data encryption function in which the original sensitive data could never be obtained);

and identifying a first label included in the listing of labels that is indicative of information included in the attribute, wherein the set of validation rules and the set of standardization rules correspond to the first label.  (Wang,  abstract, line 4- masked data set having masked data corresponding to the original sensitive data, and further including a masked feature label set for use in classifying the masked data contents; [0012], line 8-forming a shared data collection of the masked data and the masked feature label sets received from the first entities) 
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Burdick with the teaching of Wang in order to share sensitive data in a privacy-preserving and model-preserving manner while guaranteeing that the shared data is safe, i.e., that the shared data cannot be used to recover the original (sensitive) data. (Wang, [0008]-[0009])
         Claim 20 corresponds to claim 8, and is rejected accordingly.


Claims 3 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Patel (US 11,294,906 B2).

Regarding claim 3, Burdick discloses all of the features with respect to claim 1 as outlined above. Claim 3 further recites: wherein processing the modified attribute through the set of rules engines further comprises: responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names; (Burdick, Fig. 1, e.g. 11th row, dirty data Name1 = "J. Smith", Name2 = "Smith J."; [0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7). The correction step may use a recognized source of correct data such as a dictionary or a table of correct known values; Fig. 8, First Name: Jim → First Name: James)
and  147837792.1-21- Docket No. 133499-8005.US00 responsive to determining that the attribute is indicative of an address, processing the modified attribute through an address library engine that adds the attribute to a library of addresses associated with the client.  
(Burdick, Fig.7, “city: Pittsburgh" ;[0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7). The correction step may use a recognized source of correct data such as a dictionary or a table of correct known values;) 
  However, Burdick does not clearly disclose:
responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names; and responsive to determining that the attribute is indicative of an address, processing the modified attribute through an address library engine that adds the attribute to a library of addresses associated with the client.
 However, Patel discloses:
responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names;  (Patel, column 2, line 35-the user may provide a search request including search strings for an entity name ( e.g., "Acme" "Widgets,") etc., search strings for the entity address (e.g., "Kansas City," "Reed Street."), etc; column 4, line 61- At optional operation 204, the database management system normalizes the sets of search strings. For example, synonyms and stop words may be filtered from the sets of search strings. Synonyms are words that have the same meaning; column 7, line 8- a model is generated relating search strings provided with historical search requests to the corresponding record field value of the record field or fields returned in the response, which may be approved names on the list of approved names 106;)
and responsive to determining that the attribute is indicative of an address, processing the modified attribute through an address library engine that adds the attribute to a library of addresses associated with the client. (Patel, column 2, line 35-the user may provide a search request including search strings for an entity name ( e.g., "Acme" "Widgets,") etc., search strings for the entity address (e.g., "Kansas City," "Reed Street."), etc; column 4, line 61- At optional operation 204, the database management system normalizes the sets of search strings. For example, synonyms and stop words may be filtered from the sets of search strings. Synonyms are words that have the same meaning; column 6, line 59- At operation 302, the database management system 102 compares column search strings to a list of approved names 106. The list of approved names 106 is a list of record field values that have been manually and/or automatically approved; column 7, line 19- A training file including the list of approved names may be provided to the neural network. Input features can include values for various record fields such as, for example, raw supplier name, street address, city, postal code, state, country, etc. An output of the neural network is a record identifier, where a unique identifier describes each record in the relevant table; column 14, line 8- identifying a best-fit name from an approved list of names using the first set of strings; determining that a first string of the first set of strings matches a second string of the best-fit name; and adding the first string to the set of first column keywords… receiving, from a user, an indication that a second string of the first set of strings corresponds to a name associated with the first column of the database table; and adding the name to the approved list of names.)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Burdick with the teaching of Patel in order to provide suitable search results (Patel, column 2, line 42) and also to identify matching strings between the set of search strings and the best-fit name (Patel, column 7, line 28).

  Claim 16 corresponds to claim 3, and is rejected accordingly.


Claims 4-5 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Poduri (US 2021/0004998).

     Regarding claim 4, Burdick discloses all of the features with respect to claim 1 as outlined above. Burdick does not clearly disclose:
comparing a number of instances of the attribute relative to other attributes in the stream of data;  and generating a usage rank for the attribute, the usage rank based on the number of instances of the attribute in the stream of data, wherein the usage rank is indicative of a number of insights that are capable of being derived from the attribute.  
   However, Poduri discloses: 
comparing a number of instances of the attribute relative to other attributes in the stream of data;  (Poduri, [0052]  a score 116 is a measure of a particular insight's relevancy, importance, and/or value, in comparison to other insights 114.) 
and generating a usage rank for the attribute, the usage rank based on the number of instances of the attribute in the stream of data, wherein the usage rank is indicative of a number of insights that are capable of being derived from the attribute. (Poduri, [0151] One or more embodiments include determining a global score based various factors using the local score and the selected scoring algorithm (Operation 410). The various factors for scoring a particular insight include, for example: (a) a number of insights associated with a same attribute value as the particular insight; and (b) a total number of insights in the set of insights identified at Operation 204. [0152] As an example, an insight engine may compute a ratio of (a) a number of insights associated with a same attribute value as the insight to be scored to (b) a total number of insights in the set of insights identified at Operation 204. The global score may be a sum of the local score and the computed ratio; [0153] the insight engine may apply the weight to the global score to determine a weighted global score; [0047] In one or more embodiments, a score 116 is a measure of a particular insight's relevancy, importance, and/or value. The scores 116 of the insights 114 are used for comparing the relative relevancy, importance, and/or values of the insights 114.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 
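
For illustration only, the scoring arithmetic quoted from Poduri in the mapping of claim 4 (a global score equal to the local score plus the ratio of insights sharing an attribute value to all insights, optionally multiplied by a weight) can be written out as follows. This is a minimal sketch; the function and parameter names are hypothetical and do not appear in Poduri.

```python
def global_score(local_score: float, same_value_insights: int, total_insights: int) -> float:
    """Sum of the local score and the ratio of same-attribute-value insights to all insights."""
    if total_insights <= 0:
        raise ValueError("total_insights must be positive")
    return local_score + same_value_insights / total_insights


def weighted_global_score(weight: float, local_score: float,
                          same_value_insights: int, total_insights: int) -> float:
    """Apply a weight to the global score, as in the quoted 'weighted global score'."""
    return weight * global_score(local_score, same_value_insights, total_insights)
```

For example, a local score of 0.5 with 2 of 8 insights sharing the attribute value gives a global score of 0.75, and a weight of 2.0 gives a weighted global score of 1.5.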


     Regarding claim 5, Burdick discloses all of the features with respect to claim 1 as outlined above. Burdick does not clearly disclose: 
identifying a series of features associated with the attribute that are identified relative to other attributes in the stream of data; and deriving a value score for the attribute based on an aggregation of the series of features. 
However, Poduri discloses:
identifying a series of features associated with the attribute that are identified relative to other attributes in the stream of data; 
and deriving a value score for the attribute based on an aggregation of the series of features.  (Poduri, [0038] An example of an insight algorithm is  an aggregation algorithm. An aggregation algorithm, applied to a particular metric and a particular attribute value, computes a ratio of (a) a sum of the particular metric for communications, associated with the particular attribute value, with a particular node to (b) a sum of the particular metric for all recorded communications with the particular node. [0095] each of a set of insights may be associated with a same attribute but a different attribute value. The set of insights may be generated by an aggregation algorithm; [0122] The insight engine computes new scores for the insights. The new scores may be determined based on different weights. A weight may be determined based on a previously presented primary anomaly. [0124] Additionally or alternatively, the insight engine determines a weight, to be applied to a global score associated with a particular insight, based on a particular attribute value associated with the particular insight [0154], line 4 -an insight that specifies a percentage in which communications associated with a particular attribute value contributed to a particular metric may be generated by an aggregation algorithm.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 

  Regarding claim 17, Burdick discloses all of the features with respect to claim 14 as outlined above. Burdick does not clearly disclose:
compare a number of instances of the attribute relative to other attributes in the stream of data; generate a usage rank for the attribute, the usage rank based on the number of instances of the attribute in the stream of data, wherein the usage rank is indicative of a number of insights that are capable of being derived from the attribute; identify a series of features associated with the attribute that are identified relative to other attributes in the stream of data, the series of features used to identify a value score for the attribute.
     However, Poduri discloses: 
compare a number of instances of the attribute relative to other attributes in the stream of data;  (Poduri, [0052]  a score 116 is a measure of a particular insight's relevancy, importance, and/or value, in comparison to other insights 114.)
generate a usage rank for the attribute, the usage rank based on the number of instances of the attribute in the stream of data, wherein the usage rank is indicative of a number of insights that are capable of being derived from the attribute; (Poduri, [0151] One or more embodiments include determining a global score based various factors using the local score and the selected scoring algorithm (Operation 410). The various factors for scoring a particular insight include, for example: (a) a number of insights associated with a same attribute value as the particular insight; and (b) a total number of insights in the set of insights identified at Operation 204. [0152] As an example, an insight engine may compute a ratio of (a) a number of insights associated with a same attribute value as the insight to be scored to (b) a total number of insights in the set of insights identified at Operation 204. The global score may be a sum of the local score and the computed ratio; [0153] the insight engine may apply the weight to the global score to determine a weighted global score; [0047] In one or more embodiments, a score 116 is a measure of a particular insight's relevancy, importance, and/or value. The scores 116 of the insights 114 are used for comparing the relative relevancy, importance, and/or values of the insights 114.)
identify a series of features associated with the attribute that are identified relative to other attributes in the stream of data, the series of features used to identify a value score for the attribute.  (Poduri, [0038] An example of an insight algorithm is  an aggregation algorithm. An aggregation algorithm, applied to a particular metric and a particular attribute value, computes a ratio of (a) a sum of the particular metric for communications, associated with the particular attribute value, with a particular node to (b) a sum of the particular metric for all recorded communications with the particular node. [0095] each of a set of insights may be associated with a same attribute but a different attribute value. The set of insights may be generated by an aggregation algorithm; [0122] The insight engine computes new scores for the insights. The new scores may be determined based on different weights. A weight may be determined based on a previously presented primary anomaly. [0124] Additionally or alternatively, the insight engine determines a weight, to be applied to a global score associated with a particular insight, based on a particular attribute value associated with the particular insight [0154], line 4 -an insight that specifies a percentage in which communications associated with a particular attribute value contributed to a particular metric may be generated by an aggregation algorithm.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 


Claims 6, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Poduri (US 2021/0004998) in view of Miller (US 2018/0293327) in further view of Kabra (US 2019/0377715).

  Regarding claim 6, Burdick in view of Poduri discloses all of the features with respect to claim 5 as outlined above. Burdick does not clearly disclose: 
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the stream of data and the modified attribute modified according to the set of standardization rules; processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in a portion of data in the stream of data that corresponds to the attribute; processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the stream of data; aggregating the derived quality feature, the derived availability feature, and the derived cardinality feature of the attribute to generate the value score for the attribute.
     However, Poduri discloses:
aggregating the derived quality feature, the derived availability feature, and the derived cardinality feature of the attribute to generate the value score for the attribute. (Poduri, [0038] An example of an insight algorithm is an aggregation algorithm. An aggregation algorithm, applied to a particular metric and a particular attribute value, computes a ratio of (a) a sum of the particular metric for communications, associated with the particular attribute value, with a particular node to (b) a sum of the particular metric for all recorded communications with the particular node. [0095] each of a set of insights may be associated with a same attribute but a different attribute value. The set of insights may be generated by an aggregation algorithm; [0122] The insight engine computes new scores for the insights. The new scores may be determined based on different weights. A weight may be determined based on a previously presented primary anomaly. [0124] Additionally or alternatively, the insight engine determines a weight, to be applied to a global score associated with a particular insight, based on a particular attribute value associated with the particular insight; [0154], line 4 -an insight that specifies a percentage in which communications associated with a particular attribute value contributed to a particular metric may be generated by an aggregation algorithm.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 
  However, Burdick in view of Poduri does not clearly disclose: 
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises:  processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the stream of data and the modified attribute modified according to the set of standardization rules; processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in a portion of data in the stream of data that corresponds to the attribute; processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the stream of data;
   However, Miller discloses: 
processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in a portion of data in the stream of data that corresponds to the attribute; (Miller, [0275] The display table section 1322 can provide detailed information corresponding to various fields or field values that relate to the selected result group 1315. In the illustrated embodiment of FIG. 13A, the display table section 1322 includes a field name column 1324, a type column 1326, a match column 1328, a uniqueness column 1330, a null values column 1332, and a top value column 1334; [0276] The null values column 1332 can identify the percentage of events in the result group 1315 that include a null value for the field identified in the field name column 1324.)
processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the stream of data;  (Miller, [0275], The display table section 1322 can provide detailed information corresponding to various fields or field values that relate to the selected result group 1315. In the illustrated embodiment of FIG. 13A, the display table section 1322 includes a field name column 1324, a type column 1326, a match column 1328, a uniqueness column 1330, a null values column 1332, and a top value column 1334. [0267], line 10-The uniqueness column 1330 can indicate the quantity of unique entries that have the field identified in the field name column 1324.) 
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).
   However, Burdick in view of Poduri in view of Miller does not clearly disclose:
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises:  processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the stream of data and the modified attribute modified according to the set of standardization rules;
   However, Kabra discloses: 
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the stream of data and the modified attribute modified according to the set of standardization rules; (Kabra, [0017] For example, during data profiling, a score may be generated for each column that will indicate how “dirty” the column is in terms of standardization. In another words, a metric is generated that will indicate the confidence that the quality of data would improve if a standardization process would be applied on them; [0003], line 7- the calculated data standardization score reflecting whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. [0054] According to one embodiment, the similarity algorithm comprises at least one of edit distance and Levenshtein edit distance algorithms.) 
    Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri in view of Miller with the teaching of Kabra to determine a data standardization score for an attribute of a dataset which reflects whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. (Kabra, abstract)

   Claim 18 corresponds to claim 6, and is rejected accordingly.


     Regarding claim 11, Burdick in view of Poduri in view of Miller discloses all of the features with respect to claim 9 as outlined above. Burdick does not clearly disclose: 
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the dataset and the modified attribute modified according to the set of standardization rules; processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in the portion of data in the dataset that corresponds to the attribute; processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the dataset; aggregating the derived quality feature, availability feature, and cardinality feature of the attribute to generate the value score for the attribute.
However, Poduri discloses:
aggregating the derived quality feature, availability feature, and cardinality feature of the attribute to generate the value score for the attribute. (Poduri, [0038] An example of an insight algorithm is an aggregation algorithm. An aggregation algorithm, applied to a particular metric and a particular attribute value, computes a ratio of (a) a sum of the particular metric for communications, associated with the particular attribute value, with a particular node to (b) a sum of the particular metric for all recorded communications with the particular node; [0095] each of a set of insights may be associated with a same attribute but a different attribute value. The set of insights may be generated by an aggregation algorithm; [0122] The insight engine computes new scores for the insights. The new scores may be determined based on different weights. A weight may be determined based on a previously presented primary anomaly; [0124] Additionally or alternatively, the insight engine determines a weight, to be applied to a global score associated with a particular insight, based on a particular attribute value associated with the particular insight; [0154], line 4 - an insight that specifies a percentage in which communications associated with a particular attribute value contributed to a particular metric may be generated by an aggregation algorithm.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 
  However, Burdick in view of Poduri does not clearly disclose: wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the dataset and the modified attribute modified according to the set of standardization rules; processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in the portion of data in the dataset that corresponds to the attribute; processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the dataset;
  However, Miller discloses:  
processing the attribute to derive an availability feature of the attribute, the availability feature indicative of a number of null entries in the portion of data in the dataset that corresponds to the attribute; (Miller, [0275] The display table section 1322 can provide detailed information corresponding to various fields or field values that relate to the selected result group 1315. In the illustrated embodiment of FIG. 13A, the display table section 1322 includes a field name column 1324, a type column 1326, a match column 1328, a uniqueness column 1330, a null values column 1332, and a top value column 1334; [0276] The null values column 1332 can identify the percentage of events in the result group 1315 that include a null value for the field identified in the field name column 1324.)
processing the attribute to derive a cardinality feature of the attribute, the cardinality feature indicative of a difference of the attribute relative to other attributes in the dataset;  (Miller, [0275], The display table section 1322 can provide detailed information corresponding to various fields or field values that relate to the selected result group 1315. In the illustrated embodiment of FIG. 13A, the display table section 1322 includes a field name column 1324, a type column 1326, a match column 1328, a uniqueness column 1330, a null values column 1332, and a top value column 1334. [0267], line 10-The uniqueness column 1330 can indicate the quantity of unique entries that have the field identified in the field name column 1324.) 
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).
   However, Burdick in view of Poduri in view of Miller does not clearly disclose: wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the dataset and the modified attribute modified according to the set of standardization rules;
However, Kabra discloses:
wherein deriving the value score for the attribute based on the aggregation of the series of features further comprises: processing the attribute to derive a quality feature of the attribute, the quality feature identifies a number of differences between the attribute as identified in the dataset and the modified attribute modified according to the set of standardization rules; (Kabra, [0017] For example, during data profiling, a score may be generated for each column that will indicate how “dirty” the column is in terms of standardization. In another words, a metric is generated that will indicate the confidence that the quality of data would improve if a standardization process would be applied on them; [0003], line 7- the calculated data standardization score reflecting whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. [0054] According to one embodiment, the similarity algorithm comprises at least one of edit distance and Levenshtein edit distance algorithms.)
    Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri in view of Miller with the teaching of Kabra to determine a data standardization score for an attribute of a dataset which reflects whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. (Kabra, abstract)


  Claims 7, 9 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Poduri (US 2021/0004998) in view of Miller (US 2018/0293327).

      Regarding claim 7, Burdick in view of Poduri discloses all of the features with respect to claim 5 as outlined above. Burdick in view of Poduri does not clearly disclose: 
wherein comparing the attribute with the set of validation rules to validate information included in the attribute further comprises:  determining whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value.  
 However, Miller discloses: 
wherein comparing the attribute with the set of validation rules to validate information included in the attribute further comprises: determining whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value. (Miller, [0276] The null values column 1332 can identify the percentage of events in the result group 1315 that include a null value for the field identified in the field name column 1324; [0278] Further, various numerical values within the table, such as the percentages of events that match and/or have no (null) corresponding match, may also be provided graphically next to the numerical value, such as via a bar graph.)
    Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).
   

     Regarding claim 9, Burdick discloses:   A method performed by a computing node to generate a modified attribute of a dataset, the method comprising: ingesting a dataset from a client node that corresponds to a client; (Burdick,  [0048], The collection of records input  (corresponding to “ingesting”) to the input component 201 may come from one or more sources. Input sources may be static (i.e., data marts, data warehouses, databases, flat files, etc.) or dynamic ( data streams, output of an extraction-transform-load operation, etc.); [0038] Each record contains information about a real world entity (corresponding to a “client”) .) 
identifying an attribute from the dataset, (Burdick, [0025] identifies groups of records that have "similar" values in different records for the same field; [0038], line 2- Each record can be divided into fields, each field describing an attribute of the entity; [0027] As viewed in FIG. 6, parsing may intelligently break a text string into the correct data fields.)
retrieving a set of validation rules (Burdick, Fig. 5, item 502b , “Correct/Valid. Rules”) and a set of standardization rules  (Burdick, Fig. 5, item 503b “Stand./Norm. Rules) that correspond to the attribute; (Burdick, Fig. 5, item 402 “Processing Layer”, item 403 “Rules Layer”, item 501b “Parsing Rules”,  item 502b “Correct/Valid. Rules” and item 503b “Stand./Norm. Rules” ; [0056] A processing layer 402 of the automated learning component 203 performs the cleansing process on the record collection by implementing predefined algorithms. Each step of the cleansing process is controlled by a set of rules defined in a rules layer 403 (i.e., defining the proper execution for each step); [0038], line 2- Each record can be divided into fields, each field describing an attribute of the entity;  [0027] As viewed in FIG. 6, parsing may intelligently break a text string into the correct data fields. Typically, the data is not found in an easily readable format and a significant amount of decoding needs to be done to determine which piece of text corresponds to what particular data field;)
 comparing the attribute with the set of validation rules to validate information included in the attribute;  (Burdick, [0063] Once the string is parsed into the appropriate fields, a correction/validation module 502 determines whether the field values are in the proper range and/or the field values are valid) 
responsive to determining that the information included in the attribute is validated according to the set of validation rules, (Burdick, Fig. 5, item 502b “Correct/Valid. Rules” and item 503b “Stand./Norm. Rules”; [0068] Each layer 402, 403, 404 in the automated learning component 203 has a separate section to support each of the steps, as illustrated in FIG. 5. Generally, the processing layer 402 will execute the steps in the order presented, with the output of the previous step becoming input to the subsequent step; Fig. 1, “Standardization/Validation/Correction can handle?”; [0029] Once the string is parsed into the appropriate fields, the validation step, as viewed in FIG. 7, may check the field values for proper range and/or validity. Thus, a "truth" criteria must be provided as input to this step for each field; [0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7))
modifying the attribute into a standardized format according to the set of standardization rules; (Burdick, Fig. 5, item 503b;[0063], line 6- The correction/validation module 502 further updates the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7); [0064] a standardization rules section 503b of the rules layer 403 [0064] A standardization module 503 arranges the data in a consistent manner and/or a preferred format in order for the data to be compared against data from other sources)
processing the modified attribute through a set of rules engines; (Burdick, Fig. 5, item 504b “Clustering Rules”, item 505b “Matching Rules”, item 506b “Merging Rules” (Note: Examiner interprets “Clustering Rules”, “Matching Rules” and “Merging Rules” as corresponding to “a set of rules engines”); [0068] Each layer 402, 403, 404 in the automated learning component 203 has a separate section to support each of the steps, as illustrated in FIG. 5. Generally, the processing layer 402 will execute the steps in the order presented, with the output of the previous step becoming input to the subsequent step; [0031] As viewed in FIG. 8, the standardization step may arrange the data in a consistent manner and/or a preferred format in order for it to be compared against data from other sources. The preferred format for the data should be provided as input to this step; [0053] An output evaluation module 305 of the pre-processing component 202 evaluates the output of the three other functional modules 302, 303, 304. If the output is determined to be satisfactory, the output is passed to the automated learning component 203 via an output module 306; [0054], line 12- If the output needs to be modified as determined by the output evaluation module 305, the single-source module 302, the information generating module 303, and the planning module 304 may be run again.)
and outputting the processed attribute to a network-accessible server system. (Burdick, [0047] An output destinations component 205 outputs the results of the data cleansing process to one or more different destinations (i.e., a variety of data mining applications). The results include the cleansed record collection and information about how these results were obtained by the system 200 )
  However, Burdick does not clearly disclose:
the attribute including an impression of a portion of data in the dataset; comparing a number of instances of the attribute relative to other attributes in the dataset; generating a usage rank for the attribute based on the number of instances of the attribute in the dataset; identifying a series of features associated with the attribute that are identified relative to other attributes in the dataset; deriving a value score for the attribute based on an aggregation of the series of features;
 However, Poduri discloses: 
comparing a number of instances of the attribute relative to other attributes in the dataset;  (Poduri, [0052]  a score 116 is a measure of a particular insight's relevancy, importance, and/or value, in comparison to other insights 114.) 
generating a usage rank for the attribute based on the number of instances of the attribute in the dataset; (Poduri, [0151] One or more embodiments include determining a global score based various factors using the local score and the selected scoring algorithm (Operation 410). The various factors for scoring a particular insight include, for example: (a) a number of insights associated with a same attribute value as the particular insight; and (b) a total number of insights in the set of insights identified at Operation 204; [0152] As an example, an insight engine may compute a ratio of (a) a number of insights associated with a same attribute value as the insight to be scored to (b) a total number of insights in the set of insights identified at Operation 204. The global score may be a sum of the local score and the computed ratio; [0153] the insight engine may apply the weight to the global score to determine a weighted global score; [0047] In one or more embodiments, a score 116 is a measure of a particular insight's relevancy, importance, and/or value. The scores 116 of the insights 114 are used for comparing the relative relevancy, importance, and/or values of the insights 114.)
identifying a series of features associated with the attribute that are identified relative to other attributes in the dataset; deriving a value score for the attribute based on an aggregation of the series of features; (Poduri, [0038] An example of an insight algorithm is an aggregation algorithm. An aggregation algorithm, applied to a particular metric and a particular attribute value, computes a ratio of (a) a sum of the particular metric for communications, associated with the particular attribute value, with a particular node to (b) a sum of the particular metric for all recorded communications with the particular node; [0095] each of a set of insights may be associated with a same attribute but a different attribute value. The set of insights may be generated by an aggregation algorithm; [0122] The insight engine computes new scores for the insights. The new scores may be determined based on different weights. A weight may be determined based on a previously presented primary anomaly; [0124] Additionally or alternatively, the insight engine determines a weight, to be applied to a global score associated with a particular insight, based on a particular attribute value associated with the particular insight; [0154], line 4 - an insight that specifies a percentage in which communications associated with a particular attribute value contributed to a particular metric may be generated by an aggregation algorithm.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Poduri to generate a set of insights based on the set of metrics and to select an anomaly based on a context and also to determine a new context for selecting another primary anomaly. Hence, a series of primary anomalies may be selected, each primary anomaly being related to each other. (Poduri, abstract) 
However, Burdick in view of Poduri does not clearly disclose: 
the attribute including an impression of a portion of data in the dataset;
  However, Miller discloses: 
the attribute including an impression of a portion of data in the dataset; (Miller, [0109], line 6- masking a portion of an event (e.g., masking a credit card number))
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).


  Regarding claim 12, Burdick in view of Poduri in view of Miller discloses all of the features with respect to claim 9 as outlined above. Burdick in view of Poduri does not clearly disclose: wherein comparing the attribute with the set of validation rules to validate information included in the attribute further comprises: determining whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value.  
   However, Miller discloses: 
wherein comparing the attribute with the set of validation rules to validate information included in the attribute further comprises: determining whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value. (Miller, [0276] The null values column 1332 can identify the percentage of events in the result group 1315 that include a null value for the field identified in the field name column 1324; [0278] Further, various numerical values within the table, such as the percentages of events that match and/or have no (null) corresponding match, may also be provided graphically next to the numerical value, such as via a bar graph.)
   Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).

 
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Poduri (US 2021/0004998) in view of Miller (US 2018/0293327) in further view of Patel (US 11,294,906 B2).

   Regarding claim 10, Burdick in view of Poduri in view of Miller discloses all of the features with respect to claim 9 as outlined above. Claim 10 further recites: 
wherein processing the modified attribute through the set of rules engines further comprises: responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names; (Burdick, Fig. 1, e.g., 11th row, dirty data: Name1 = "J. Smith", Name2 = "Smith J."; [0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7). The correction step may use a recognized source of correct data such as a dictionary or a table of correct known values; Fig. 8, First Name: Jim → First Name: James)
and responsive to determining that the attribute is indicative of an address, processing the modified attribute through an address library engine that adds the attribute to a library of addresses associated with the client.  (Burdick, Fig.7, “city: Pittsburgh" ;[0030] The correction step may update the existing field value to reflect a specific truth value (i.e., correcting the spelling of "Pittsburgh" in FIG. 7). The correction step may use a recognized source of correct data such as a dictionary or a table of correct known values;) 
  However, Burdick in view of Poduri in view of Miller does not clearly disclose:
wherein processing the modified attribute through the set of rules engines further comprises: responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names;
   However, Patel discloses:
responsive to determining that the attribute is indicative of a name, processing the modified attribute through a name engine that associates the attribute with associated names included in a listing of associated names; (Patel, column 2, line 35-the user may provide a search request including search strings for an entity name ( e.g., "Acme" "Widgets,") etc., search strings for the entity address (e.g., "Kansas City," "Reed Street."), etc; column 4, line 61- At optional operation 204, the database management system normalizes the sets of search strings. For example, synonyms and stop words may be filtered from the sets of search strings. Synonyms are words that have the same meaning; column 7, line 8- a model is generated relating search strings provided with historical search requests to the corresponding record field value of the record field or fields returned in the response, which may be approved names on the list of approved names 106;)
and responsive to determining that the attribute is indicative of an address, processing the modified attribute through an address library engine that adds the attribute to a library of addresses associated with the client. (Patel, column 2, line 35- the user may provide a search request including search strings for an entity name (e.g., "Acme" "Widgets,") etc., search strings for the entity address (e.g., "Kansas City," "Reed Street."), etc; column 4, line 61- At optional operation 204, the database management system normalizes the sets of search strings. For example, synonyms and stop words may be filtered from the sets of search strings. Synonyms are words that have the same meaning; column 6, line 59- At operation 302, the database management system 102 compares column search strings to a list of approved names 106. The list of approved names 106 is a list of record field values that have been manually and/or automatically approved; column 7, line 19- A training file including the list of approved names may be provided to the neural network. Input features can include values for various record fields such as, for example, raw supplier name, street address, city, postal code, state, country, etc. An output of the neural network is a record identifier, where a unique identifier describes each record in the relevant table; column 14, line 8- identifying a best-fit name from an approved list of names using the first set of strings; determining that a first string of the first set of strings matches a second string of the best-fit name; and adding the first string to the set of first column keywords… receiving, from a user, an indication that a second string of the first set of strings corresponds to a name associated with the first column of the database table; and adding the name to the approved list of names.)
  Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri in view of Miller with the teaching of Patel to provide suitable search results (Patel, column 2, line 42) and also to identify matching strings between the set of search strings and the best-fit name (Patel, column 7, line 28).

  Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Poduri (US 2021/0004998) in view of Miller (US 2018/0293327) in further view of Wang (US 2016/0283735).

Regarding claim 13, Burdick in view of Poduri in view of Miller discloses all of the features with respect to claim 9 as outlined above. Burdick in view of Poduri in view of Miller does not clearly disclose: 
retrieving client-specific configuration information that includes a listing of labels, wherein each label in the listing of labels provides a client-specific indication of a type of information included in the dataset; and identifying a first label included in the listing of labels that is indicative of information included in the attribute, wherein the set of validation rules and the set of standardization rules correspond to the first label.  
   However, Wang discloses:
retrieving client-specific configuration information that includes a listing of labels, wherein each label in the listing of labels provides a client-specific indication of a type of information included in the dataset; (Wang, [0013], line 3- accessing, from a computing device associated with a first entity, one or more records having original data sensitive to a data owner; generating an original data matrix of original data content including sensitive features and a corresponding feature label set for use in classifying the feature data; generating a random feature matrix sharing the same subspace as the sensitive features of original data matrix; [0035] the generating of a matrix data feature set C and labels set d provides a data encryption function in which the original sensitive data could never be obtained);
and identifying a first label included in the listing of labels that is indicative of information included in the attribute, wherein the set of validation rules and the set of standardization rules correspond to the first label. (Wang, abstract, line 4- masked data set having masked data corresponding to the original sensitive data, and further including a masked feature label set for use in classifying the masked data contents; [0012], line 8- forming a shared data collection of the masked data and the masked feature label sets received from the first entities)
   Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick in view of Poduri in view of Miller with the teaching of Wang to share sensitive data in a privacy-preserving and model-preserving manner and to guarantee that the shared data is safe, i.e., that the shared data cannot be used to recover the original (sensitive) data (Wang, [0008]-[0009]).

  Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Burdick (US 2004/0107203) in view of Miller (US 2018/0293327).


    Regarding claim 19, Burdick discloses all of the features with respect to claim 14 as outlined above. Burdick does not clearly disclose: wherein said compare the attribute with the set of validation rules to validate information included in the attribute further comprises: determine whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value.  
    However, Miller discloses:
 wherein said compare the attribute with the set of validation rules to validate information included in the attribute further comprises: determine whether the attribute includes a null value that is identified in the set of validation rules, wherein the attribute is validated responsive to determining that the attribute does not include the null value. (Miller, [0276] The null values column 1332 can identify the percentage of events in the result group 1315 that include a null value for the field identified in the field name column 1324. [0278] Further, various numerical values within the table, such as the percentages of events that match and/or have no (null) corresponding match, may also be provided graphically next to the numerical value, such as via a bar graph.)
   Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Burdick with the teaching of Miller to extract pre-specified data items and to facilitate efficient retrieval and analysis of those data items at search time (Miller, [0057]), and also to provide one or more tools for searching and analyzing large sets of data to locate data of interest (Miller, [0002]).




Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Faezeh Forouharnejad, whose telephone number is (571) 270-7416. The examiner can normally be reached Monday through Friday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at (571) 270-3750. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



/F.F. /
Examiner, Art Unit 2166

/MARK D FEATHERSTONE/
Supervisory Patent Examiner, Art Unit 2166