DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
Applicant’s Information Disclosure Statements, filed 05/11/2022 (2), 06/07/2022, 08/03/2022, 08/18/2022, 08/24/2022, and 09/28/2022, have been received, entered into the record, and considered. See attached form PTO-1449.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 11347482. Although the claims at issue are not identical, they are not patentably distinct from each other because claims 1-20 of U.S. Patent No. 11347482 contain every element of claims 1-20 of the instant application. 


A claim in the instant application compared to a claim in U.S. Patent No. 11347482

Instant Application: A method for use with a data integration or other computing environment comprising:
U.S. Patent No. 11347482: A method for use with a data integration or other computing environment comprising:

Instant Application: (no corresponding limitation)
U.S. Patent No. 11347482: providing, at a computer including a processor, a graphical user interface for creation of a data flow associated with a software application, including a specification of: one or more sources of data, and a data target, wherein each source of data comprises one or more datasets having attributes, semantics, and relationships with other datasets, and wherein an event coordinator receives notifications of the data received from the one or more sources of data, and state transactions associated with the data;

Instant Application: receiving, from a knowledge source that stores profile information and other metadata associated with one or more data sources comprising datasets, a metadata associated with processing a data flow associated with the one or more data sources;
U.S. Patent No. 11347482: receiving, from a knowledge source that stores profile information and other metadata associated with data sources, datasets, and entities, a metadata associated with processing the data flow associated with the one or more sources of data and the data target;

Instant Application: ingesting data from the one or more sources of data, and
U.S. Patent No. 11347482: ingesting data from the one or more sources of data, via an edge layer, and providing the data to a scalable input/output layer that provides access to the data structured as topics;

Instant Application: writing ingested data to a data repository operating as a data lake, for use by an input/output layer that provides access to the data structured as topics; and as the data is received from the one or more sources of data: identifying portions of the ingested data;
U.S. Patent No. 11347482: writing ingested data to a data repository operating as a data lake, for use by a compute layer executing data flow applications; as the data is received from the one or more sources of data, and used by downstream data flow applications, identifying temporal slices the ingested data;

Instant Application: receiving from the knowledge source a metadata associated with the portions of the ingested data; and
U.S. Patent No. 11347482: accessing the knowledge source to obtain metadata about the ingested data represented by the temporal slices; and

Instant Application: writing the portions of the ingested data to the data lake, and associating a lineage tracking information therewith for use by one or more data flow applications.
U.S. Patent No. 11347482: managing the data represented by the temporal slices, including writing the temporal slices to the data lake, and, for each temporal slice, updating a lineage tracking information descriptive of the lineage of the temporal slice.

Instant Application: Claims 2, 9, and 16
U.S. Patent No. 11347482: Claims 2, 9, and 16

Instant Application: Claims 3, 10, and 17
U.S. Patent No. 11347482: Claims 3, 10, and 17

Instant Application: Claims 4, 11, and 18
U.S. Patent No. 11347482: Claims 4, 11, and 18

Instant Application: Claims 5, 12, and 19
U.S. Patent No. 11347482: Claims 5, 12, and 19

Instant Application: Claims 6, 13, and 20
U.S. Patent No. 11347482: Claims 6, 13, and 20

Instant Application: Claims 7 and 14
U.S. Patent No. 11347482: Claims 7 and 14

Instant Application: Claims 8 and 15
U.S. Patent No. 11347482: Claims 8 and 15


A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s).

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 7-10, and 14-17 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bar-Or; Amir et al. (“Bar-Or”) US 20170177309 A1.

Regarding claim 1, Bar-Or teaches a method for use with a data integration or other computing environment, as Signal Hub integrates data from a variety of sources, which enables the process of signal creation and utilization by business users and systems. Signal Hub provides a layer of maintained and refreshed intelligence (e.g., Signals) on top of the raw data that serves as a repository for scientists (e.g., data scientists) and developers (e.g., application developers) to execute analytics [0056], the method comprising:
receiving, from a knowledge source that stores profile information and other metadata associated with one or more data sources comprising datasets, a metadata associated with processing a data flow associated with the one or more data sources as the Workbench 70 is used by data scientists for profiling and schema discovery of unfamiliar data sources. Signal Hub provides tools that can discover schema (e.g., data types and column names) from a flat file or a database table. It also has built-in profiling tools, which automatically compute various statistics on each column of the data such as missing values, distribution parameters, frequent items, and more. These built-in tools accelerate the initial data load and quality checks ([0075 and 0076]);
The Workbench 306 could include a workflow to process signals that includes loading 330, data ingestion and preparation 332, descriptive signal generation 336, use case building 338, and sending 340 [0089];
Multiple features of the Knowledge Center facilitate accessing and consuming intelligence. The first is its filtering and searching capabilities. When signals are created, they are tagged based on metadata and organized around a taxonomy [0099].
 ingesting data from the one or more sources of data, and writing ingested data to a data repository operating as a data lake, for use by an input/output layer that provides access to the data structured as topics as Signal Hub 304 allows companies to absorb information from various data sources 302 to be able to address many types of problems. More specifically, Signal Hub 304 can ingest both internal and external data as well as structured and unstructured data [0086]. 
Raw data is stored in the raw data database 258 of the Hadoop Data Lake 256… The results of step 268 are then stored in the model information and parameters database 270. In step 272, the model execution module 272 of the Hadoop/Yarn and Signal Hub 254 processes signals and model input data 266 and/or model information and parameters data 270. The results of step 272 are then stored in the model output database 274 [0083].
The Signal Hub Server automates the processing of inputs to outputs. Because of its data flow architecture, it has a speed advantage [0084].
FIG. 4C is a screenshot of an event pattern matching feature of the system of the present disclosure. The system allows users to determine whether a specified sequence of events occurred in the data and then submit a query to retrieve information about the matched data. For example, in FIG. 4C, for the raw input data shown, a user can (1) define an event; (2) create a pattern matcher; and (3) query the pattern matcher to return the output as shown [0071]; and 
as the data is received from the one or more sources of data:
identifying portions of the ingested data as the user interface of the Workbench could include components such as a tree view 72, an analytic code development window 74, and a supplementary display portion 76. The tree view 72 displays each collection of raw data files (e.g., indicated by “Col” 73a) ([0073 and 0072]);
receiving from the knowledge source a metadata associated with the portions of the ingested data as the Signal Hub platform 600 further includes a main view portion 604. The main view portion 604 diagrammatically displays data sources 606 (e.g., business inputs), descriptive signals 608 (e.g., grouped and organized by metadata), and predictive signals 610 ([0102]); and
writing the portions of the ingested data to the data lake as raw data is stored in the raw data database 258 of the Hadoop Data Lake 256. In step 260, Hadoop/Yarn and Signal Hub 254 process the raw data 258 with ETL (extract, transform, and load) modules, data quality management modules, and standardization modules. The results of step 260 are then stored in a staging database 262 of the Hadoop Data Lake [0083], and associating a lineage tracking information therewith for use by one or more data flow applications as the lineage is used to understand the transformation from raw data to descriptive signals and predictive signals (e.g., how is the number of trips required to move to the next loyalty tier signal generated and which models consume it). As shown, when the definition level diagram button 652 is activated, the Signal Hub platform 600 displays the lineage of a particular signal, which includes what data is being pulled, and what models the signal is being used in. Once a signal of interest is identified, users can gain a deeper understanding of the signal by exploring its lineage from the raw data through all transformations, providing insight into how a particular Signal was created and what the value truly represents ([0110 and 0101] and Fig. 23A).

Regarding claims 2, 9, and 16, Bar-Or further teaches wherein the one or more sources of data are HUBs, and the knowledge source is a system HUB as Signal Hub integrates data from a variety of sources, which enables the process of signal creation and utilization by business users and systems. Signal Hub provides a layer of maintained and refreshed intelligence (e.g., Signals) on top of the raw data that serves as a repository for scientists (e.g., data scientists) and developers (e.g., application developers) to execute analytics [0056].

Regarding claims 3, 10, and 17, Bar-Or further teaches wherein metadata received through the interface is stored in the system HUB to be accessed by the system for processing a data flow as the main view portion 604 diagrammatically displays data sources 606 (e.g., business inputs), descriptive signals 608 (e.g., grouped and organized by metadata), and predictive signals 610 [0102]. The Signal Hub platform can display model description, metadata, input signal, output column, etc. all in one centralized page for each model. FIG. 21E also illustrates a user interface screen generated by the system for commenting signals using the Knowledge Center 600 generated by the system [0108].

Regarding claims 7 and 14, Bar-Or further teaches wherein the method is performed in a cloud or cloud-based computing environment as the Hadoop system can manage resources (e.g., split workload and/or automatically optimize how and where computation is performed). For example, the system could be fully or partially executed on Hadoop, a cloud-based implementation, or a stand-alone implementation on a single computer [0058].

Regarding claim 8, the claim recites a system with limitations similar to those of claim 1 and is therefore rejected under the same rationale as noted above for claim 1.

Regarding claim 15, the claim recites a non-transitory computer readable storage medium with limitations similar to those of claim 1 and is therefore rejected under the same rationale as noted above for claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4-6, 11-13, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bar-Or; Amir et al. (“Bar-Or”) US 20170177309 A1 as applied to claims 1, 8, and 15 in view of Overman; Stephen et al. (“Overman”) US 20110047056 A1.
Regarding claims 4, 11, and 18, Bar-Or further teaches that the system can automatically detect the high level lineage between the data, signal and use-cases when hovering over specific items. The system can also allow a user to further drill down specific data, signal and use-cases by predefined metadata which can also allow a user to view the high level lineage as well ([0101 and 0110]).
Bar-Or does not explicitly teach wherein a data reconstruction and lineage tracking information includes provenance and lineage information.
Overman, however, teaches wherein a data reconstruction and lineage tracking information includes provenance and lineage information as data provenance becomes important when expectations do not match outcomes. Data provenance provides the means to track possible causes of the discrepancy by allowing an analyst or auditor to reconstruct the events that took place in the system [0235].
Data Provenance refers to the history of data including its origin, key events that occur over the course of its lifecycle, and other traceability related information associated with its creation, processing, and archiving. It is the essential ingredient that ensures that users of data (for whom the data may or may not have been originally intended) understand the background of the data. This includes concepts such as, What (sequence of resource lifetime events), Who generated the event (Person Or Organization), Where the event came from (location), How the event transformed the resource, the assumptions made in generating it, and the processes used to modify it, When the event occurred (started/ended), Quality measure (used as a general quality assessment to assist in assessing this information, within the DATA policy governance) and Genealogy (defines sources used to create a resource) [0254].
Data, Quality: The lineage can be used via policy to estimate data quality and data reliability based on the (Who, Where) source of the information and the process (What, How) used to transform the information. The level of detail in the Data Provenance will determine the extent to which the quality of the data can be estimated. This information can be used to help the user of the data determine authenticity and avoid spurious data sources. Since a "trusted data information exchange" governed by policy provides a certified semantic knowledge of the Data Provenance, it is possible to automatically evaluate it based on Quality metrics that are defined and provide a "quality score" [0255].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Overman’s teaching would have allowed Bar-Or’s system to facilitate the estimation of data quality by including data provenance and lineage information for reconstructing the history of the data.

Regarding claims 5, 12, and 19, Bar-Or further teaches that the Signal Hub Server has multiple capabilities to automate server management. It can detect data changes within raw file collections and then trigger a chain of processing jobs to update existing Signals with the relevant data changes without transactional system support [0084].
The system can automatically detect the high level lineage between the data, signal and use-cases when hovering over specific items. The system can also allow a user to further drill down specific data, signal and use-cases by predefined metadata which can also allow a user to view the high level lineage as well [0101].
Bar-Or does not explicitly teach wherein at subsequent times, the data reconstruction and lineage tracking information is updated to include updated provenance and lineage information.
Overman, however, teaches wherein at subsequent times, the data reconstruction and lineage tracking information is updated to include updated provenance and lineage information as FIG. 49 shows an example of a Document Update Graph that illustrates the relationships of the What, When, Who, How, Where and Quality of a document being updated ([0289, 0290, and 0292]).


Regarding claims 6, 13, and 20, Bar-Or does not explicitly teach wherein the system provides a graphical user interface that can indicate a lifecycle of data flow based on lineage tracking, including where the data has been processed.
Overman, however, teaches wherein the system provides a graphical user interface that can indicate a lifecycle of data flow based on lineage tracking, including where the data has been processed as events are based on the information lifecycle of data and with a lifecycle of events: creation, storage, review, approval, verification, access, archiving, and deletion. Events are collectively described as: [0109]
Where--location where an event happens [0110]
When--the time when an event occurs [0111]
Who--the people or organizations involved in data creation and transformation [0112]
How--documents actions upon the data. These actions are labeled as data processes. It describes the details of how data has been created or transformed [0113]. 
Which--describes the instruments or software applications used in creating or processing the data [0114].
Why--decision-making rationale of actions [0115].

Data provenance is the historical recording and querying of information lifecycle data with a life cycle of events. We conceptualize data provenance as consisting of five interconnected elements including when, where, who, how and why ([0147 and 0181] and Fig. 26).

FIG. 2 shows how these events are categorized as information lifecycle, intellectual rights and archive. It is from the What that drives all operations for Record and Delete actions acting upon Data Provenance. Events are associated with message requests invoking the CCA policy. The Information Lifecycle events are solid concepts [0264].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Overman’s teaching would have allowed Bar-Or’s system to provide a big-picture view of the different stages of system development.

Conclusion
The prior art made of record in PTO-892 and not relied upon is considered pertinent to applicant's disclosure.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESLIE WONG whose telephone number is (571)272-4120. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish K. Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LESLIE WONG/Primary Examiner, Art Unit 2164