DETAILED ACTION
This action is in response to the initial filing of Application No. 16/781590 on 02/04/2020. Claims 1 – 20 are still pending in this application, with claims 1, 15 and 19 being independent.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 3, 6 (with claim 7), 8 (with claim 9), 11 and 12 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, in view of the prior art failing to teach or suggest in reasonable combination the limitations recited in claims 3, 6, 8, 11 and 12. For example, Burns et al. (EP 1 107 169 A2) discloses limitations recited in claim 3, except for identifying sections by identifying phrases that are abbreviations. Additionally, the prior art cited below, Lan et al., discloses a 3-layer QA architecture, yet fails to teach or suggest the 3-layer QA architecture having hidden states of size 384 or 192.
Claims 15 – 18 are allowed. The following is a statement of reasons for the indication of allowable subject matter: After further search and consideration of the prior art, claim 15 is determined to comprise allowable subject matter in view of the prior art failing to teach or suggest in reasonable combination the limitations recited in claim 15, i.e., building a tree-structure using the subject from the user query and the answer span identified by the MC model.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 10, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Semnani et al. (“Domain-Specific Question Answering at Scale for Conversational Systems”) (“Semnani”) in view of Kim et al. (US 2021/0149994) (“Kim”), and further in view of Lan et al. (“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”) (“Lan”), and further in view of Kuczmarski et al. (US 2020/0320984) (“Kuczmarski”).
For claims 1 and 19, Semnani discloses an automatic question answering system (Abstract), comprising: functionality to: receive a plurality of documents (webpages of Snowflake), wherein the plurality of documents include one or more sections (documents comprise sections) and one or more sub-sections (paragraphs) are retrieved (we crawled the publicly available webpages of Snowflake Computing) (3. Data, 3.1 Corpora, pg. 2), wherein the plurality of documents pertain to a specific domain (domain-specific corpus for IT infrastructure, 3. Data, 3.1 Corpora, pg. 2); generate one or more candidate answers using key entities (key words) extracted from each of the sub-sections (We train a BERT-base model to detect/extract keywords {potential answers to the queries that will be generated} in paragraphs, 4.2 Query generation, pg. 4 and 5); automatically generate questions based on each of the candidate answers (We then use the trained LSTM-based model to generate questions from the Snowflake corpus and extracted keywords, 4.2 Query generation, pg. 5); train a machine comprehension 
Yet, Semnani fails to teach the following: the functionality further comprises at least one processor and at least a non-transitory processor-readable medium storing machine-readable instructions that cause the processor to perform the automatic question answering; the MC model includes a 3-layer Question Answer (QA) architecture with a predetermined number of hidden neurons; the term vector model scores indicate relevance of each of the set of context and 
However, Kim discloses a device and method for machine reading comprehension (Abstract), wherein the device comprises at least one processor ([0079]) and at least a non-transitory processor-readable medium storing machine-readable instructions that cause the processor to ([0080] [0081]): receive a query from a user (Fig. 9, S910; [0071]) and score a set of contexts (passage) using a term vector model score (TF-IDF) (Fig. 9, S940; [0037 – 0043] [0073]), wherein the score indicates relevance of each set of contexts to the query of the user (a TF-IDF value is determined based on a user question, Fig. 5; [0037 – 0043]) and questions to the query of the user (scores for the plurality of passages are determined if a user query is dissimilar to questions with an established best answer, Fig. 9, S930; [0033 – 0036] [0072] [0073]).
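For illustration only, the term vector model scoring relied upon above may be sketched as follows. The sketch is not taken from Kim or Semnani; the function names, the smoothed IDF formula and the sample passages are assumptions chosen to show how a TF-IDF score can indicate the relevance of each context to a user query.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_scores(query, passages):
    """Return one relevance score per passage (hypothetical helper)."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    scores = []
    for doc in docs:
        total = sum(doc.values()) or 1
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for d in docs if term in d)
            if df == 0:
                continue  # term appears in no passage
            idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF (assumed variant)
            score += (doc[term] / total) * idf
        scores.append(score)
    return scores


passages = [
    "Snowflake stores data in virtual warehouses.",
    "The cat sat on the mat.",
    "A warehouse can be resized at any time.",
]
scores = tfidf_scores("How do I resize a warehouse?", passages)
best = max(range(len(passages)), key=lambda i: scores[i])
```

Under these assumptions the highest-scoring passage would be the one forwarded to the MC model as context; a production system would more likely use an established library implementation.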
Additionally, Lan discloses a language representation model (ALBERT) (Abstract) which is functionally similar to BERT (Introduction and 3. The Elements of ALBERT, pg. 1, 2, 4 and 5) comprising a 3-layer network parameter with a predetermined number of hidden neurons (3.2 Model Setup and 4.7 Effect of Network Depth and Width, pg. 5, 6, 8 and 9).
Furthermore, Kuczmarski discloses a method for facilitating end-to-end communications (Abstract), wherein a natural language output is extracted unaltered from documents and provided as is since it is already in complete sentence form or composed as a complete sentence from responsive content when the responsive content does not include a complete sentence form ([0053]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Semnani’s invention in the same way Kim’s invention has been 
Moreover, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to substitute the MC model disclosed by the combination of Semnani and Kim with Lan’s MC model to achieve the predictable results of detecting a start and end point position of an answer span (Semnani, 4.1.4 Reader, pg. 4) for the purpose of adapting and enabling a question answering system to search for answers in a broad range of specialized domains including IT infrastructure (Semnani, Abstract) while addressing the problems of increasing a model size when pretraining natural language representations to improve performance on downstream tasks, e.g. GPU/TPU memory limitations, longer training times, and unexpected model degradation (Lan, Abstract).
Additionally, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Semnani, Kim and Lan in the same way that Kuczmarski’s invention has been improved to achieve the following predictable results for the purpose of increasing user satisfaction by providing responses in complete sentences: provide the answer span to the user as a reply to the query via an output user interface if the answer span forms a complete sentence, else provide a response generated from the answer span as the reply to the query.
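For illustration only, the reply logic recited above (return the answer span as is when it forms a complete sentence, otherwise generate a response from it) may be sketched as follows. The completeness heuristic, the reply template and both function names are assumptions and are not drawn from Kuczmarski or from the application.

```python
def is_complete_sentence(span):
    # Assumed heuristic: at least three words, a capitalized first
    # character, and terminal punctuation.
    text = span.strip()
    return len(text.split()) >= 3 and text[0].isupper() and text[-1] in ".!?"


def build_reply(query, answer_span):
    """Return the span unaltered if complete, else wrap it in a sentence."""
    if is_complete_sentence(answer_span):
        return answer_span.strip()
    fragment = answer_span.strip().rstrip(".,;")
    # Hypothetical template; a real system might use a generation model here.
    return f'In answer to "{query.strip()}": {fragment}.'


print(build_reply("What is the default port?", "The default port is 443."))
print(build_reply("What is the default port?", "port 443"))
```

The first call returns the span unchanged; the second composes a full-sentence response around the fragment.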


For claim 14, Semnani further discloses wherein the processor is to: train the MC model on a generic dataset in addition to the candidate answers and the questions (Semnani, 5.2. Fine-tuning the reader, pg.6).

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Semnani et al. (“Domain-Specific Question Answering at Scale for Conversational Systems”) (“Semnani”) in view of Kim et al. (US 2021/0149994) (“Kim”), and further in view of Lan et al. (“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”) (“Lan”), and further in view of Kuczmarski et al. (US 2020/0320984) (“Kuczmarski”), and further in view of Phillips et al. (US 2016/0314104) (“Phillips”).
For claim 2, the combination of Semnani, Kim, Lan and Kuczmarski further discloses parsing a plurality of documents and identifying sections of the plurality of documents (Semnani, 3.1 Corpora, pg. 2). Yet, the combination of Semnani, Kim, Lan and Kuczmarski fails to teach that the parsing and section identification further comprises the following:
produce a stream of text by parsing each of the plurality of documents; extract metadata of the plurality of documents wherein the metadata includes one or more of titles, list of sections, list of figures and tables and a list of references; and identify each of the sections in the list of sections from the plurality of documents.
	However, Phillips discloses a method for extracting text from unstructured documents (Abstract) comprising the following: producing a stream of text by parsing a document (PDF 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Semnani, Kim, Lan and Kuczmarski in the same way that Phillips’ invention has been improved to achieve the following predictable results for the purpose of efficiently and accurately extracting text from unstructured documents (Semnani, 3.1 Corpora, pg. 2) (Phillips, [0001 – 0008]): the parsing and section identification further comprises producing a stream of text by parsing each of the plurality of documents; extracting metadata of the plurality of documents wherein the metadata includes one or more of titles, list of sections, list of figures and tables and a list of references; and identifying each of the sections in the list of sections from the plurality of documents.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Semnani et al. (“Domain-Specific Question Answering at Scale for Conversational Systems”) (“Semnani”) in view of Kim et al. (US 2021/0149994) (“Kim”), and further in view of Lan et al. (“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”) (“Lan”), and further in view of Kuczmarski et al. (US 2020/0320984) (“Kuczmarski”), and further in view of Phillips et al. (US .
For claim 4, the combination of Semnani, Kim, Lan, Kuczmarski and Phillips further discloses categorizing the sections into one of a body text (Phillips, main body of flowing lines, [0089] [0092 – 0100]), table text (Phillips, lines of words are included in the bounding rectangle of the table, [0076] [0077] [0085] [0086] [0089]), and image text (Phillips, lines of words are included in the bounding rectangle of the figure, [0076] [0077] [0088] [0089]). Yet, the combination of Semnani, Kim, Lan, Kuczmarski and Phillips fails to teach the following: wherein the processor is to further categorize each of the sections into one of a caption, wherein the body text can include one or more paragraphs and the categorization occurs based at least on a font style, font size and text alignment.
However, Chen discloses a method for the automatic conversion of static documents into dynamic documents (Abstract), wherein sections of a document are categorized into captions and body text including one or more paragraphs (pg. 11 line 24 – pg. 12 line 4), and the categorization occurs based at least on a font style, font size (character size) and text alignment (justification) (pg. 12 lines 21 – column 15 line 5; column 16 lines 1 – 5).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Semnani, Kim, Lan, Kuczmarski and Phillips in the same way that Chen’s invention has been improved to achieve the following predictable results for the purpose of efficiently and accurately extracting text from unstructured documents (Semnani, 3.1 Corpora, pg. 2) (Phillips, [0001 – 0008]): the processor is to further categorize each of the sections into one of a caption, wherein the body text can include .

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Semnani et al. (“Domain-Specific Question Answering at Scale for Conversational Systems”) (“Semnani”) in view of Kim et al. (US 2021/0149994) (“Kim”), and further in view of Lan et al. (“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”) (“Lan”), and further in view of Kuczmarski et al. (US 2020/0320984) (“Kuczmarski”), and further in view of Devlin et al. (“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”) (“Devlin”).
For claim 5, the combination of Semnani, Kim, Lan and Kuczmarski fails to teach, wherein to extract the key entities the processor is to further split textual content in the sub-sections into word tokens.
However, Devlin discloses a language representation model called BERT which accepts word tokens as inputs (Figure 2; 3. BERT, Model Architecture and Input/Output Representations).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Semnani, Kim, Lan and Kuczmarski with Devlin’s teachings so that the BERT-based model which extracts key entities (Semnani, 4.2 Query generation, pg. 4 and 5) further requires the textual content in the sub-sections (paragraphs) to be split into word tokens for the purpose of adapting and enabling a question answering system to search for answers in a broad range of specialized domains including IT infrastructure (Semnani, Abstract).
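For illustration only, splitting the textual content of a sub-section into word tokens may be sketched as follows. The regular expression and the function name are assumptions; BERT as described by Devlin uses WordPiece sub-word tokenization, for which this simple word-level split is only a stand-in.

```python
import re


def split_into_word_tokens(paragraph):
    # Runs of word characters become tokens; each remaining
    # non-space character (punctuation) becomes its own token.
    return re.findall(r"\w+|[^\w\s]", paragraph)


tokens = split_into_word_tokens("Snowflake's warehouses auto-scale.")
print(tokens)
```

The resulting token list is the form of input a keyword-extraction model would consume before identifying key entities in the paragraph.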

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Semnani et al. (“Domain-Specific Question Answering at Scale for Conversational Systems”) (“Semnani”) in view of Kim et al. (US 2021/0149994) (“Kim”), and further in view of Lan et al. (“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”) (“Lan”), and further in view of Kuczmarski et al. (US 2020/0320984) (“Kuczmarski”), and further in view of Wang et al. (“QG-Net: A Data-Driven Question Generation Model For Educational Content”) (“Wang”).
For claim 20, the combination of Semnani, Kim, Lan and Kuczmarski further discloses an LSTM model to generate the questions (Semnani, 4.2 Query Generation, pg. 4 and 5), yet fails to teach wherein a Seq2Seq model is employed for automatically generating the questions based on the candidate answers.
However, Wang discloses an LSTM question generation model (QG-Net) which is a sequence to sequence model (The QG-Net Model).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Semnani, Kim, Lan and Kuczmarski with Wang’s teachings so that the LSTM model is a sequence to sequence model for the purpose of adapting and enabling a question answering system to search for answers in a broad range of specialized domains including IT infrastructure (Semnani, Abstract).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SONIA L GAY/
Primary Examiner, Art Unit 2657