DETAILED ACTION

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/02/2021 has been entered.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 21, 22, 24, 31 – 33, 35, and 38 – 40 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Allen et al (US 2016/0148114).
As to claim 21, Allen et al teaches a method for documents data (paragraph [0047]...inputs from various sources including input over a network, a corpus of electronic documents or other data, data from a content creator, information from one or more content users, and other such inputs from other possible sources of input. Data storage devices store the corpus of data. A content creator creates content in a document for use as part of a corpus of data with the QA system. The document may include any file, text, article, or source of data for use in the QA system) ingestion in a question answering (QA) system (paragraph [0004]...implementing a Question and Answer (QA) system pipeline), the method comprising:
ingesting a plurality of input documents (paragraph [0113]...reading and processing historical data in a corpus of information (step 610)) by a full sub-pipeline (elements 610, 620, 630 of figure 6) of an ingestion pipeline (paragraph [0112]...QA system pipeline 510)  to extract first data (paragraph [0113]...through processing of the historical data, features or attributes, and temporal characteristics associated with the features or attributes, are identified and used to generate one or more training cases (step 620)) for building a knowledge base (paragraph [0047]...body of knowledge about the domain) of the QA system; 
applying, by the ingestion pipeline, a filtering rule to the plurality of input documents to generate a filtered documents subset of the plurality of input documents (paragraph [0114]...one or more filter criteria are applied to the historical data based on training objectives (e.g., source filtering, results filtering, time frame filtering, etc.) so as to generate a filtered subset of the historical data (step 630)); 
ingesting the filtered documents subset of the plurality of input documents by an exclusive sub-pipeline (elements 640, 650 of figure 6) of the ingestion pipeline to extract second data (paragraph [0114]...relevant attributes, actions taken, and dates of actionable event are extracted from the filtered subset of historical data (step 640)); and 
filtering, using the second data, undesirable data from the knowledge base generated by the full sub-pipeline (paragraph [0114]...the correlation of relevant attributes with the action taken and the date of the actionable event is then used to create a new training answer key entry with the correct answer for that answer key entry being the action taken (step 650)).
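For illustration only, the four steps recited in claim 21 can be sketched as follows. This is a minimal sketch and is not drawn from Allen et al or from the application: the function names, the word-set "first data", and the "outdated" marker used as the filtering rule are all hypothetical assumptions chosen to make the data flow visible.

```python
# Illustrative sketch of the claim 21 steps; every name here is hypothetical.

def full_sub_pipeline(documents):
    """Ingest every input document and extract 'first data' (here: word sets)."""
    return {doc_id: set(text.lower().split()) for doc_id, text in documents.items()}


def filtering_rule(doc_id, text):
    """Assumed rule: select documents whose text is marked as outdated."""
    return "outdated" in text.lower()


def exclusive_sub_pipeline(filtered_documents):
    """Ingest only the filtered subset and extract 'second data' (here: doc ids)."""
    return set(filtered_documents)


def ingest(documents):
    # Step 1: the full sub-pipeline builds the knowledge base from all documents.
    knowledge_base = full_sub_pipeline(documents)
    # Step 2: the filtering rule yields a filtered subset of the input documents.
    filtered = {d: t for d, t in documents.items() if filtering_rule(d, t)}
    # Step 3: the exclusive sub-pipeline ingests only that subset.
    second_data = exclusive_sub_pipeline(filtered)
    # Step 4: the second data is used to filter undesirable entries
    # out of the knowledge base generated by the full sub-pipeline.
    return {d: v for d, v in knowledge_base.items() if d not in second_data}


docs = {"a": "Current treatment guidance", "b": "Outdated treatment guidance"}
kb = ingest(docs)
```

In this sketch the exclusive sub-pipeline sees only the subset selected by the rule, and its output is what removes entries from the knowledge base built by the full sub-pipeline.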

As to claim 22, Allen et al teaches the method, further comprising ingesting a plurality of incremental documents (paragraph [0017]...in the healthcare industry, data in patient records continues to increase over time as the patient's treatments and care by the patient's various physicians continues) by an incremental sub-pipeline (elements 640, 650, 660 of figure 6) of the ingestion pipeline to extract third data for building the knowledge base (paragraph [0114]...the correlation of relevant attributes with the action taken and the date of the actionable event is then used to create a new training answer key entry with the correct answer for that answer key entry being the action taken (step 650). The operation then determines if there is any more filtered historical data to be processed (step 660). If so, the operation returns to step 640 and additional relevant attributes, actions taken, and dates of actionable events are extracted and used to generate additional answer key entries. If there is no more filtered historical data to process, the operation terminates), wherein the plurality of incremental documents are new documents that are not part of the plurality of input documents (paragraph [0018]...in the healthcare industry, new discoveries are being made on a regular basis which may change or invalidate previous treatments or patient recommendations that, at a previous time, were considered to be the correct treatments or recommendations. That is, as understanding of an area increases, previous conclusions based on a different set of knowledge may become obsolete. In the context of a QA system, this may lead to a QA system that was trained on a previous training data set and/or previous answer key giving incorrect answers to questions at a later time when the corpus of data, representing current knowledge in a particular domain, has expanded and answers to training questions have changed over time. However, at the time that the QA system was trained, the answers generated (which are now obsolete) may have been the correct answers for the knowledge at the time).

As to claim 24, Allen et al teaches the method, wherein the filtered documents subset of the plurality of input documents (paragraph [0114]...one or more filter criteria are applied to the historical data based on training objectives (e.g., source filtering, results filtering, time frame filtering, etc.) so as to generate a filtered subset of the historical data (step 630)) comprises a number of documents below a predetermined threshold (paragraph [0025]...applying the second filter criterion mentioned above, temporal characteristics of the data in the patient medical records may be compared to a current date/time, and one or more selection thresholds, so as to select a sub-portion of the patient medical records to be used to generate the training answer key. The thresholds may be set so as to specify the historical time frame of interest for training the QA system, e.g., only the most contemporary data is utilized) of the plurality of input documents for processing by the exclusive sub-pipeline (elements 640, 650 of figure 6) of the ingestion pipeline (paragraph [0112]...QA system pipeline 510).

Claim 32 has similar limitations as claim 21. Therefore, the claim is rejected for the same reasons as above. 

Claim 33 has similar limitations as claim 22. Therefore, the claim is rejected for the same reasons as above. 

As to claim 35, Allen et al teaches the system, wherein the at least one processor is further configured to execute the instructions (paragraph [0006]...one or more processors and a memory coupled to the one or more processors. The memory may comprise instructions which, when executed by the one or more processors, cause the one or more processors to perform various ones of, and combinations of, the operations outlined above with regard to the method illustrative embodiment) to generate the filtering rule based on the undesirable data from the knowledge base (paragraph [0023]...historical data is filtered based on training objectives which specify one or more filter criteria to be applied to the historical data in the corpus so as to generate a filtered corpus of information (or training corpus) upon which training is to be performed. These filter criteria may take many different forms. For example, one filter criterion may be to select historical data associated with specific sources of data in the corpus or specific types of sources of data in the corpus. Another filter criterion may be to select historical data that is more contemporary, i.e. not older than a particular time period. Another filter criterion may be to select historical data with a particular level of confidence associated with the data).

Claim 38 has similar limitations as claim 21. Therefore, the claim is rejected for the same reasons as above. 

Claim 39 has similar limitations as claim 22. Therefore, the claim is rejected for the same reasons as above. 

Claim 40 has similar limitations as claim 35. Therefore, the claim is rejected for the same reasons as above. 


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 23, 25 – 27, 34, and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Allen et al (US 2016/0148114) in view of Watts et al (US 2020/0082015).
As to claim 23, Allen et al teaches the method of claim 21, wherein ingesting the plurality of input documents (paragraph [0113]...reading and processing historical data in a corpus of information (step 610)) by the full sub-pipeline (elements 610, 620, 630 of figure 6)  and ingesting the filtered documents subset of the plurality of input documents (paragraph [0114]...one or more filter criteria are applied to the historical data based on training objectives (e.g., source filtering, results filtering, time frame filtering, etc.) so as to generate a filtered subset of the historical data (step 630)) by the exclusive sub-pipeline (elements 640, 650 of figure 6).
Allen et al fails to explicitly show/teach that the full sub-pipeline and the exclusive sub-pipeline operate concurrently and in parallel. 
However, Watts et al teaches that the full sub-pipeline (235-1 of figure 2) and the exclusive sub-pipeline (235-3 of figure 1) operate concurrently and in parallel (paragraph [0034]... the search pipeline (e.g., the locality operator group) may distribute as much of a query as possible to ensure that the indexers 132 can do as much initial processing in parallel as possible; paragraph [0041]...optimal performance may be achieved in use cases that put as many parallel search modules 162 as possible upstream of the first collapsing search module 162, as this approach may decrease pressure on the communication pipe and allow for greater parallelism in the data analytics platform 100).
Therefore, it would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention, for Allen et al’s full sub-pipeline 

As to claim 25, Watts et al teaches, concurrently and in parallel (paragraph [0034]... the search pipeline (e.g., the locality operator group) may distribute as much of a query as possible to ensure that the indexers 132 can do as much initial processing in parallel as possible; paragraph [0041]...optimal performance may be achieved in use cases that put as many parallel search modules 162 as possible upstream of the first collapsing search module 162, as this approach may decrease pressure on the communication pipe and allow for greater parallelism in the data analytics platform 100) to ingesting the plurality of input documents by the full sub-pipeline (631A of figure 6B) of an ingestion pipeline, ingesting the plurality of input documents by a second full sub-pipeline (631B of figure 6B) of an ingestion pipeline to extract third data for building the knowledge base of the QA system (paragraph [0039]...search query).
It would have been obvious to ingest the plurality of input documents by a second full sub-pipeline of an ingestion pipeline, concurrently and in parallel to ingesting the plurality of input documents by the full sub-pipeline of an ingestion pipeline, to extract third data for building the knowledge base of the QA system, for the same reasons as above.
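For illustration only, concurrent ingestion of the same input documents by two full sub-pipelines can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Allen et al, Watts et al, or the application: the function names and the extracted data are hypothetical, and a thread pool is used simply as one readily available way to run two tasks concurrently (whether they also execute in parallel depends on the runtime).

```python
from concurrent.futures import ThreadPoolExecutor


def full_sub_pipeline(documents):
    """Hypothetical first full sub-pipeline: extract 'first data' (token counts)."""
    return [len(text.split()) for text in documents]


def second_full_sub_pipeline(documents):
    """Hypothetical second full sub-pipeline: extract 'third data' (upper-cased text)."""
    return [text.upper() for text in documents]


def ingest_concurrently(documents):
    # Both sub-pipelines receive the same plurality of input documents
    # and are submitted together so that they run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(full_sub_pipeline, documents)
        third = pool.submit(second_full_sub_pipeline, documents)
        return first.result(), third.result()
```

Neither sub-pipeline consumes the other's output in this sketch, which is what allows both to be submitted at once.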

As to claim 26, Watts et al teaches the first data from the full sub-pipeline (631A of figure 6B) is input data for both a second full sub-pipeline (631B of figure 6B) of an ingestion pipeline and a third full sub-pipeline (632B of figure 6B) of an ingestion pipeline (630A and 630B of figure 6B).


As to claim 27, Watts et al teaches extending an abstract sub-pipeline (paragraph [0006]... each module operating only on the data groupings that the module is capable of operating on, passing on all other data down the pipeline. The final operator group named "renderer" may then receive the processed output and down select the data to only that which is needed in order to visualize or otherwise convey a fused and final output to the user. Accordingly, the unified, linear, and concurrent processing methodology set forth herein may allow for simplified abstraction of data processing while increasing the capability of a processing pipeline without dramatically increasing the complexity as presented to the user ; paragraph [0026]... The linear pipeline processing framework may define a data fusion pipeline assembly mechanism according to an abstracted query language without requiring data location, context, extraction, and/or normalization to be explicitly defined) to create both the full sub-pipeline (elements 610, 620, 630 of figure 6)  and the exclusive sub-pipeline (235-3 of figure 1) of the ingestion pipeline.
It would have been obvious to extend an abstract sub-pipeline to create both the full sub-pipeline and the exclusive sub-pipeline of the ingestion pipeline, for the same reasons as above.

Claim 34 has similar limitations as claim 23. Therefore, the claim is rejected for the same reasons as above. 

Claim 36 has similar limitations as claim 27. Therefore, the claim is rejected for the same reasons as above. 

Claims 28 – 31 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Allen et al (US 2016/0148114) in view of Swann et al (US 2018/0077183).
As to claims 28 – 31, Allen et al teaches a sub-pipeline (elements 610, 620, 630 of figure 6) of the ingestion pipeline (paragraph [0112]...QA system pipeline 510).
Allen et al fails to explicitly show/teach modifying a sub-pipeline of the ingestion pipeline without disrupting the documents data ingestion of other sub-pipelines of the ingestion pipeline.
However, Swann et al teaches modifying (paragraph [0015]...pipeline manager 120) a sub-pipeline of the ingestion pipeline without disrupting the documents data ingestion of other sub-pipelines of the ingestion pipeline (paragraph [0017]...the pipeline manager 120 is a specialized process 130 or plurality of processes 130 that are run on an individual server 110 and may have additional processes added or removed from the plurality of processes 130 modularly. The pipeline manager 120 is discussed in greater detail in regard to FIG. 2. The pipeline manager 120 provides an orchestrator and a health management service to manage the flow of data within the pipeline and correct or address any hangs, program crashes, or exceptions thrown by modules within the pipeline; paragraph [0041]... At OPERATION 360 a new event object is created using the identifying information received in the event and the other information to create a baseline for the event object that may be updated when a subsequent event with the same identifying information is seen in the future. In various aspects, the event object is added to an associated dictionary for future lookup via the identifying information (e.g., as a key value or values for the object). As will be appreciated, the new event objects may be created in response to the event processor 250 never having seen an event with the given identifying information, or a previously created event object having been removed or deleted from the audit event persistor 270. For example, an event object may only be persisted for a set period of time (e.g., n minutes, n days), the audit event ingestor 280 may remove/delete event objects when consuming them, or the pipeline manager 120 may remove/delete objects in response to a process 130 associated with those event objects terminating).
Therefore, it would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention, for Allen et al to modify a sub-pipeline of the ingestion pipeline without disrupting the documents data ingestion of other sub-pipelines of the ingestion pipeline, as in Swann et al, for the purpose of achieving optimal performance by pipeline streamlining.

Claim 37 has similar limitations as claim 28. Therefore, the claim is rejected for the same reasons as above. 

Response to Arguments
Applicant's arguments filed 12/02/2021 have been fully considered but they are not persuasive. 
The applicant argues that the following highlighted limitations are not shown by Allen et al:
A method for documents data ingestion in a question answering (QA) system, the method comprising: ingesting a plurality of input documents by a full sub-pipeline of an ingestion pipeline to extract first data for building a knowledge base of the QA system; applying, by the ingestion pipeline, a filtering rule to the plurality of input documents to generate a filtered documents subset of the plurality of input documents; ingesting the filtered documents subset of the plurality of input documents by an exclusive sub-pipeline of the ingestion pipeline to extract second data; and filtering, using the second data, undesirable data from the knowledge base generated by the full sub-pipeline. 

Allen et al clearly teaches the limitation “for building a knowledge base (paragraph [0047]...body of knowledge about the domain) of the QA system.” The QA system has a knowledge base that is continuously being built and updated. Allen et al teaches in 
Allen et al clearly teaches the limitation “ingesting the filtered documents subset of the plurality of input documents by an exclusive sub-pipeline (elements 640, 650 of figure 6) of the ingestion pipeline to extract second data.” The applicant doesn’t claim what makes a pipeline an exclusive pipeline. In other words, what makes an exclusive pipeline different from any other pipeline? Under the broadest reasonable interpretation, Allen et al clearly shows an exclusive pipeline.
Allen et al clearly teaches the limitation “filtering, using the second data, undesirable data from the knowledge base generated by the full sub-pipeline” (paragraph [0114]...the correlation of relevant attributes with the action taken and the date of the actionable event is then used to create a new training answer key entry with the correct answer for that answer key entry being the action taken (step 650)). What does the applicant consider “undesirable data”? The applicant doesn’t claim what they interpret as undesirable. Allen et al, paragraph [0112], teaches filtering out candidate answers that are based on information not available within the historical context of the training case specified by the reference date. Under the broadest reasonable interpretation, Allen et al clearly shows undesired data being filtered out.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRANDON S COLE whose telephone number is (571)270-5075. The examiner can normally be reached Mon - Fri 7:30am - 5pm EST (Alternate Fridays Off).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez, can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.