DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 8 and 10 recite the limitation "the selected sentence" in line 1. There is insufficient antecedent basis for this limitation in the claim. Claim 9 depends on claim 8 and is therefore rejected for the same reason.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-6, 8, 10-17 and 19 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Non-Patent Literature “BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding” by Devlin et al. (“Devlin”).
As to claims 1, 11-12, Devlin discloses a method, a computer readable program and a system for natural language processing [pages 1-14], comprising: pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model [Abstract, section 3, 3.1,  Fig. 2], using a span selection training data set that associates a masked word with a passage [sections 3.3.1, 3.3.2, 3.4]; and performing a natural language processing task using the span selection pretrained machine learning model [Abstract, section 3.3.2].  
As to claims 2, 13, Devlin discloses pretraining the machine learning model using one or more pretraining tasks selected from the group consisting of multi-wordpiece cloze and next sentence prediction [sections 3.3.1-3.3.2].  
As to claims 3, 14, Devlin discloses generating elements of the span selection training data set by selecting a sentence in a text corpus and masking a portion of the selected sentence to remove one or more words from the selected sentence [sections 3.3.1-3.3.2].  
As to claims 4, 15, Devlin discloses wherein generating the span selection training data set further includes selecting a plurality of passages from the text corpus that are similar to the masked sentence [sections 3.3.1-3.3.2, 3.4].
As to claims 5, 16, Devlin discloses wherein generating the span selection training data set further includes selecting a most similar passage that includes the masked portion from the plurality of passages [sections 3.3.1-3.3.2, 3.4, 4.1].  
As to claims 6, 17, Devlin discloses wherein generating the span selection training data set further includes pairing the masked sentence with the selected passage as one element of the span selection training data set [sections 3.2, 3.4, 4.1].  
As to claims 8, 19, Devlin discloses wherein masking the portion of the selected sentence comprises masking multiple words of the selected sentence [section 3.3.2, 3.4].
As to claim 10, Devlin discloses wherein masking the portion of the selected sentence comprises replacing the portion of the selected sentence with a placeholder that indicates a location where one or more words are removed [sections 3.3.1-3.3.2].
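For illustration only, the masking limitation addressed above (removing one or more words from a selected sentence and leaving a placeholder at their location) can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Devlin's implementation: the function name `mask_span`, the whitespace tokenization, and the `[BLANK]` placeholder string are hypothetical choices made for the example.

```python
def mask_span(sentence, start, length, placeholder="[BLANK]"):
    """Replace `length` consecutive words, beginning at word index `start`,
    with a single placeholder.

    Returns (masked_sentence, removed_text). Tokenization is a plain
    whitespace split, which is an assumption made for this sketch.
    """
    words = sentence.split()
    if length < 1 or start < 0 or start + length > len(words):
        raise ValueError("span falls outside the sentence")
    # The removed words become the answer to be recovered later.
    removed = " ".join(words[start:start + length])
    # The placeholder marks the location where the words were removed.
    masked = " ".join(words[:start] + [placeholder] + words[start + length:])
    return masked, removed


masked, answer = mask_span("The quick brown fox jumps over the lazy dog", 1, 2)
print(masked)  # The [BLANK] fox jumps over the lazy dog
print(answer)  # quick brown
```

The pair returned by the function corresponds to a masked sentence and the removed words; whether any reference builds its training examples in this exact manner is not asserted by the sketch.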
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
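A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.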

Claims 7 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Devlin (as applied above) in view of Foreign Patent Publication No. CN-110414004 to Yang (“Yang”).  
As to claims 7, 18, Devlin discloses the method of claim 5 and the system of claim 16 [see rejections of claims 5 and 16].
Devlin does not expressly disclose measuring a similarity between the masked sentence and each passage from the selected plurality of passages that include the masked portion using a similarity metric that is selected from the group consisting of term frequency inverse document frequency metrics, latent semantic indexing metrics, and neural network information retrieval.   
In the same or similar field of invention, Yang discloses measuring a similarity between the masked sentence and each passage from the selected plurality of passages that include the masked portion using a similarity metric that is selected from the group consisting of term frequency inverse document frequency metrics, latent semantic indexing metrics, and neural network information retrieval [Yang page 5 lines 40-50, page 4 lines 5-21, page 8 lines 1-6, page 9 lines 22-30].
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Devlin to have a feature of measuring a similarity between the masked sentence and each passage from the selected plurality of passages that include the masked portion using a similarity metric that is selected from the group consisting of term frequency inverse document frequency metrics, latent semantic indexing metrics, and neural network information retrieval as taught by Yang. The suggestion/motivation for doing so is set forth in Yang [Yang page 2 lines 1-2].
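For illustration only, one of the recited similarity metrics (term frequency inverse document frequency) can be sketched as a cosine similarity between a masked sentence and each candidate passage that contains the removed words. This is a sketch under stated assumptions and is not taken from Yang, Devlin, or the application: the function names, the whitespace tokenizer, the smoothed IDF formula, and the sample text are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace split; an assumption made for this sketch.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that terms found in every document keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_passage(masked_sentence, answer, passages):
    """Among passages containing `answer`, return (passage, score) for the one
    most similar to the masked sentence, or None if no passage contains it."""
    candidates = [p for p in passages if answer.lower() in p.lower()]
    if not candidates:
        return None
    vectors = tfidf_vectors([masked_sentence] + candidates)
    query = vectors[0]
    scores = [cosine(query, v) for v in vectors[1:]]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


passages = [
    "a quick brown fox jumps over a sleeping dog",
    "quick brown bread recipes are popular",
    "the lazy dog sleeps all day",
]
result = most_similar_passage("the [BLANK] fox jumps over the lazy dog",
                              "quick brown", passages)
print(result[0])  # a quick brown fox jumps over a sleeping dog
```

The third passage is excluded before scoring because it does not contain the removed words, and the second passage shares no terms with the masked sentence, so the first passage is selected.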
Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Devlin (as applied above) in view of Foreign Patent Publication No. CN-110414004 to ARDHANARI et al. (“Ardhanari”).  
As to claim 9, Devlin discloses the method of claim 8 [see rejections of claim 8].
Devlin does not expressly disclose wherein the multiple words are consecutive and have a total number of characters in a predetermined length range.   
In the same or similar field of invention, Ardhanari discloses wherein the multiple words are consecutive and have a total number of characters in a predetermined length range [Ardhanari paragraph 0185, Figs. 11-12].
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Devlin to have a feature of wherein the multiple words are consecutive and have a total number of characters in a predetermined length range as taught by Ardhanari. The suggestion/motivation would have been to provide a system and a method to analyze biomedical (and other types of) data by computational processes under the constraint of maintaining the privacy of the individual patient or consumer. Such a system and methods will consequently be of great commercial, social and scientific benefit to society [Ardhanari paragraph 0005].
As to claim 20, Devlin discloses a placeholder that indicates a location where one or more words are removed [sections 3.3.1-3.3.2]. Further, Ardhanari discloses wherein the multiple words are consecutive and have a total number of characters in a predetermined length range [Ardhanari paragraph 0185, Figs. 11-12]. In addition, the same motivation applies as in the rejection of claim 9.
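For illustration only, the limitation that the multiple masked words are consecutive and have a total number of characters in a predetermined length range can be sketched as follows. This is a sketch under stated assumptions and is not taken from Ardhanari, Devlin, or the application: the function name, the whitespace tokenization, and the choice to count the single spaces between words toward the character total are hypothetical.

```python
def spans_in_length_range(sentence, min_chars, max_chars):
    """Return (start, end) word-index pairs, end exclusive, for every run of
    two or more consecutive words whose joined text is between min_chars and
    max_chars characters long, inclusive.

    Assumption for this sketch: the character count includes the single
    spaces that join the words.
    """
    words = sentence.split()
    spans = []
    for start in range(len(words)):
        for end in range(start + 2, len(words) + 1):
            n_chars = len(" ".join(words[start:end]))
            if n_chars > max_chars:
                break  # adding more words only makes the span longer
            if n_chars >= min_chars:
                spans.append((start, end))
    return spans


candidates = spans_in_length_range("the quick brown fox", 9, 11)
print(candidates)  # [(0, 2), (1, 3), (2, 4)]
```

Each returned pair identifies a candidate span of consecutive words (here "the quick", "quick brown", and "brown fox") that satisfies the length range and could then be replaced by a placeholder.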
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTIM G SHAH whose telephone number is (571)270-5214. The examiner can normally be reached Mon-Fri 7:30am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANTIM G SHAH/Primary Examiner, Art Unit 2652