Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
With respect to the remarks filed on 02/14/2022, claims 1-7 are pending.
Response to Arguments
Applicant's arguments filed 02/14/2022 have been fully considered but they are not persuasive. The following is the Applicant’s argument:

[Image media_image1.png: excerpt of Applicant’s argument]

Examiner does not agree with Applicant’s argument, since Examiner has addressed the new limitations in the rejection below.

[Image media_image2.png: excerpt of Applicant’s argument]

[Image media_image3.png: excerpt of Applicant’s argument]

	Examiner does not agree with Applicant’s argument, since Szczepanik and Breeden et al. disclose the newly amended limitations (see rejection below).

[Image media_image4.png: excerpt of Applicant’s argument]

	Examiner does not agree with Applicant’s argument, since Szczepanik discloses computing expected data outputs of a data pipeline based on the metadata, the value distribution, the data types, and the data schema of the unprocessed data (“The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs. During the training phase, the machine learning module 114 may learn the correct outputs by analyzing and describing well known data and information, that may be stored by the knowledge base 110, which may be used as a reference describing data types and attribute” (0086), where the correct outputs are the expected data outputs of the claimed invention; and “Machine learning that is unsupervised may not be “told” the right answer the way supervised learning algorithms do. Instead, during unsupervised learning, the algorithm may explore the data to find a common structure between the files being explored. Embodiments of an unsupervised learning algorithm can identify common attributes of metadata between each of the files streamed to or stored by the raw data storage 117. Examples of unsupervised machine learning may include self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition.” (0087), where the self-organizing map is the schema of the claimed invention). Further, Applicant argues that Szczepanik fails to mention a value distribution of any kind. Examiner does not agree: the value distribution can be any value, and Examiner asserts that Szczepanik’s disclosure that the algorithm can “identify common attributes of metadata between each of the files streamed” (0087) can be the value distribution of the claimed invention.
The claims do not define what the value distribution is, and nothing in the claims distinguishes the claimed value distribution from any other value distribution, such as attributes. Further, the claim recites “identifying a schema for unprocessed data” at line 7, so any schema that can be used for unprocessed data meets the limitation (i.e., “one or more associated operational databases 123 comprising database engines 119 capable of applying a particular schema to the files of the raw data storage 117. Schemas of the operational databases 123 implemented by the database engines 119 may control how files from the raw data storage 117 may be processed upon subsequently being queried by a user or administrator of the data lake system 101” (0071); a database engine 119 capable of applying a particular schema is the same as identifying a schema for the unprocessed data as in the claimed invention). Moreover, Szczepanik discloses more than the claim requires, such as the data schema of the unprocessed data (“Machine learning that is unsupervised may not be “told” the right answer the way supervised learning algorithms do. Instead, during unsupervised learning, the algorithm may explore the data to find a common structure between the files being explored. Embodiments of an unsupervised learning algorithm can identify common attributes of metadata between each of the files streamed to or stored by the raw data storage 117. Examples of unsupervised machine learning may include self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition.” (0087), where the self-organizing map is the schema of the claimed invention). Therefore, the Applicant’s arguments are not persuasive.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Szczepanik et al. (U.S. Pub. 2020/0174966 A1) in view of Breeden et al. (U.S. Pat. 11,238,048 B1).
With respect to claim 1, Szczepanik et al. discloses a method to facilitate data monitoring in a computing system, the method comprising:
receiving a call from a data pipeline to ingest unprocessed data intended for the data pipeline from one or more data input streams;
ingesting the unprocessed data from the one or more data input streams (i.e., “the files ingested into the data lake system 101 and being stored by the raw data storage 117, may be managed using a flat file architecture to track and maintain each of the files without having to apply a structure or schema to the raw data storage 117 at the time of file ingestion” (0064) and “The data of each file may be transmitted to the data lake systems 101 in discrete data packets or by streaming the file data over network 150 and storing the streaming files to the raw data storage 117” (0065)); 
generating metadata using the unprocessed data, determining a value distribution of the unprocessed data, checking data types of the unprocessed data, and identifying a schema for the unprocessed data (i.e., “More specifically, embodiments of the data lake system 101 operating within a computing environment 100, 180, 190, 200, 280, 350, may perform the functions associated with ingesting at least one streaming file from one or more data streams 203, 205, 207 into the data lake, storing the streaming files in an unprocessed, native format, scanning metadata associated with the streaming files (either embedded within the file or as a separate metadata file), analyzing the metadata, categorizing the content of the streaming files using the metadata and one or more machine learning techniques, and generating a list of the files entering or being stored by the data lake system 101” (0061), where “analyzing the metadata” is generating metadata using the unprocessed data as in the claimed invention, “categorizing the content of the streaming files” is the value distribution of the unprocessed data of the claimed invention, and “a list of the files entering” is the checking of data types of the claimed invention; and “each file being stored by the raw data storage 117 can be assigned a unique identifier. In some instances, each file entering the data lake system 101 may also be tagged with a set of metadata tags further describing the type of data being stored to the raw data storage 117 as well as the content of the file being ingested” (0064), where Examiner asserts that describing the type of data is the data type of the claimed invention; and “one or more associated operational databases 123 comprising database engines 119 capable of applying a particular schema to the files of the raw data storage 117. Schemas of the operational databases 123 implemented by the database engines 119 may control how files from the raw data storage 117 may be processed upon subsequently being queried by a user or administrator of the data lake system 101” (0071), where a database engine 119 capable of applying a particular schema is the same as identifying a schema for the unprocessed data as in the claimed invention);
computing one or more expected data outputs of the data pipeline based on the metadata, the value distribution, the data types and the data schema of the unprocessed data (i.e., “Embodiments of the reasoning engine may rank the records of the past data lakes based upon how closely the categorization of data matches with the current data lake, the expected performance of the database engine 119 and/or the frequency of using the categorized data identified in step 509” (0112), or step 507, which generates a file list identifying each file; and “The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs. During the training phase, the machine learning module 114 may learn the correct outputs by analyzing and describing well known data and information, that may be stored by the knowledge base 110, which may be used as a reference describing data types and attribute” (0086), where the correct outputs are the expected data outputs of the claimed invention; and “Machine learning that is unsupervised may not be “told” the right answer the way supervised learning algorithms do. Instead, during unsupervised learning, the algorithm may explore the data to find a common structure between the files being explored. Embodiments of an unsupervised learning algorithm can identify common attributes of metadata between each of the files streamed to or stored by the raw data storage 117. Examples of unsupervised machine learning may include self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition.” (0087), where the self-organizing map is the schema of the claimed invention);
receiving a call from the data pipeline to ingest processed data generated by the data pipeline from one or more data output streams, wherein the processed data comprises one or more actual data outputs (i.e., “The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs. During the training phase, the machine learning module 114 may learn the correct outputs by analyzing and describing well known data and information, that may be stored by the knowledge base 110, which may be used as a reference describing data types and attribute” (0086));
ingesting processed data from one or more data output streams (i.e., “The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs” (0086)); 
comparing the one or more expected data outputs with the one or more actual data outputs of the processed data and responsively determining that at least one of the one or more actual data outputs does not align with the one or more expected data outputs (i.e., “The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs” (0086), or Fig. 5C, step 559, which checks for an existing data lake identifier, where a “no” result means “does not align” as in the claimed invention);
generating an alert signifying that at least one of the one or more expected data outputs does not align with the one or more actual data outputs (i.e., “The algorithm may learn by comparing the actual output with the correct outputs in order to find errors. The machine learning module 114 may modify the model of data according to the correct outputs to refine the decision making of the machine learning module 114, improving the accuracy of the automated decision making of the machine learning module 114 to provide the correct inputs” (0086)); and
sending the alert to a client (i.e., “the reporting engine 125 of the data lake system 101 may send a report, notification or error alerting a user or administrator of the data lake system 101 that an operational database 123 or database engine 119 could not be found that matches the management requirements of the file types or data categories being received or stored” (0093)).  
Szczepanik et al. processes the streamed unprocessed data but does not explicitly disclose receiving a call from a data pipeline to ingest unprocessed data intended for the data pipeline from one or more data input streams. However, Breeden et al. discloses receiving a call from a data pipeline to ingest unprocessed data intended for the data pipeline from one or more data input streams (i.e., “a monitoring component 112 may be configured to collect device performance information by monitoring one or more client device operations, or by making calls to an operating system and/or one or more other applications executing on a client device 102 for performance information” (col. 13, lines 32-38) or “the processing pipeline 3550 can be considered as a tree or directed graph of nodes, where data flows in a specific direction from node to node along the interconnections… If any nodes in the pipeline were configured with DSL expressions, the application managing the user interface on the user device can call back to the GUI pipeline creator 3420 to convert the DSL expressions into sub-ASTs and to merge them back into the full AST prior to executing a preview or activating the pipeline” (col. 144, lines 30-55) and Fig. 10).
It would have been obvious for a person of ordinary skill in the art, before the effective filing date of the claimed invention, to use a pipeline to process the stream in order to provide a different way to transmit, process, and analyze the raw data, and to quickly search and analyze large sets of raw machine data or visually identify data; such use for the stated purpose has been well known in the art, as evidenced by the teaching of Breeden et al.
	

i.e., “the reporting engine 125 of the data lake system 101 may send a report, notification or error alerting a user or administrator of the data lake system 101 that an operational database 123 or database engine 119 could not be found that matches the management requirements of the file types or data categories being received or stored. In some embodiments, the data lake system 101 may request human input from the user of administrator to resolve the error in identifying a suitable operational database 123 or database engine 119” (0093)).  
With respect to claim 6, Szczepanik et al. discloses further comprising generating a metadata confidence level wherein the metadata confidence level indicates at least an accuracy of the metadata (i.e., “The statistical analysis of the different types of data or data categorizations may be compared to the types of data being stored with historically provisioned data lakes, in order to calculate at a level of confidence (confidence interval) that one of the historically provisioned data lakes, more likely than not, has been provisioned with one or more operational databases 123 that successfully managed or organized the same categories of data stored and/or streamed to the newly registered data lake. For example, the most closely matched historical data lake may be considered the closest match within a 99% confidence interval (CI), a 95% CI, 90% CI, 85% CI, 75% CI, etc.” (0113)).  
Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Szczepanik et al. (U.S. Pub. 2020/0174966 A1) in view of Breeden et al. (U.S. Pat. 11,238,048 B1), and further in view of Goldentouch (U.S. Pub. 2011/0082848 A1).
With respect to claim 4, Szczepanik and Breeden et al. disclose all limitations recited in claim 1. However, Goldentouch discloses further comprising tracking (i.e., “Monitor and send alerts. The user may activate alert mechanism, which sends alerts upon some preset events, including changes of selected objects, new search results, analysis committed by other group member or other suitable events” (0250) and “Get search results, including monitoring the search provider, verifying that a set of results is returned, analyzing the format of the set of results, getting the raw results or performing similar operations” (0263)). It would have been obvious for a person of ordinary skill in the art, before the effective filing date of the claimed invention, to include Goldentouch’s feature in order to obtain accurate and correct expected results; such use for the stated purpose has been well known in the art, as evidenced by the teaching of Goldentouch.
With respect to claim 5, Goldentouch discloses the method of claim 4, further comprising detecting, in real time, changes to object records of the unprocessed data and notifying the client of changes to the object records of the unprocessed data (i.e., “Monitor and send alerts. The user may activate alert mechanism, which sends alerts upon some preset events, including changes of selected objects, new search results, analysis committed by other group member or other suitable events” (0250)).
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Szczepanik et al. (U.S. Pub. 2020/0174966 A1) in view of Breeden et al. (U.S. Pat. 11,238,048 B1), and further in view of Colley et al. (U.S. Pub. 2021/00906994 A1).
With respect to claim 7, Szczepanik and Breeden et al. disclose all limitations recited in claim 1 except for wherein the data output stream originates from an extract/transform/load (ETL) orchestrated environment. However, Colley et al. discloses wherein the data output stream originates from an extract/transform/load (ETL) orchestrated environment (i.e., “respectively, an event reporting bus 316, system micro-services 186, various data lake APIs 332, 334 and 336, an ETL module 338, data lake query and analytics modules 346 and 348, respectively, an ETL platform 360 as well as data marts database 190” (0994), 0998). It would have been obvious for a person of ordinary skill in the art, before the effective filing date of the claimed invention, to include Colley et al.’s feature in order to have an optimized model format for replicating the output data to a different system; such use for the stated purpose has been well known in the art, as evidenced by the teaching of Colley et al. (0998).
Allowable Subject Matter
Claim 3 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, since the prior art of record and considered pertinent to the applicant’s disclosure does not teach or suggest the claimed feature wherein comparing the one or more expected data outputs with the one or more actual data outputs comprises: generating output metadata for the one or more actual data outputs of the processed data; determining an output value distribution of the one or more actual data outputs of the processed data; checking output data types of the one or more actual data outputs of the processed data; and identifying an output data schema for the one or more actual data outputs of the processed data; comparing the metadata, the value distribution, the data types, and the schema for the one or more expected data outputs with the output metadata, the output value distribution, the output data types, and the output data schema for the one or more actual data outputs; and determining that at least one of the output metadata, the output value distribution, the output data types, and the output data schema for the one or more actual data 

Close reference:
U.S. 2019/0028557 teaches monitoring ingested data and generating metadata (006-009).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG T VY whose telephone number is (571)272-1954. The examiner can normally be reached M-F 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi can be reached on (571)272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HUNG T VY/
Primary Examiner, Art Unit 2163
March 21, 2022