DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicants’ amendment filed on 5/6/22 has been entered. Claims 1, 3, 9-12, 14 and 20 have been amended. Claims 8 and 19 have been canceled. No new claims have been added. Claims 1-7, 9-18 and 20 remain pending in this application, with claims 1, 11 and 12 being independent.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3, 11-12 and 14 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Non-Patent Literature “SpanBERT: Improving Pre-Training by Representing and Predicting Spans” by Joshi et al. (“Joshi”).
As to claims 1, 11-12, Joshi discloses a method, a computer readable program and a system for natural language processing [pages 1-12], comprising: pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model [Abstract, section 1,  Fig. 1], using a span selection training data set that associates a masked multi-word term with a passage [Abstract, sections 1, 3.1, Fig. 1]; and performing a natural language processing task using the span selection pretrained machine learning model [Abstract, section 1: “task such as question answering”, sections 4.1, 5.1].  
As to claims 3, 14, Joshi discloses generating elements of the span selection training data set by selecting a sentence in a text corpus and masking a portion of the selected sentence to remove multiple words from the selected sentence [Abstract, sections 1, 3.1, Fig. 1].  
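For illustration only, the limitation addressed for claims 3 and 14 (selecting a sentence and masking a portion of it so that multiple words are removed) can be sketched as follows. This is a minimal sketch and not the method of Joshi or of the application; the function name `mask_span`, the `[BLANK]` marker, and the example sentence are hypothetical.

```python
def mask_span(sentence: str, start: int, length: int, marker: str = "[BLANK]"):
    """Remove `length` consecutive words beginning at word index `start`.

    Returns the masked sentence and the removed multi-word term.
    The marker string is an assumption made for this sketch.
    """
    words = sentence.split()
    # A multi-word term needs at least two words that lie inside the sentence.
    if length < 2 or start < 0 or start + length > len(words):
        raise ValueError("span must cover multiple words inside the sentence")
    removed = " ".join(words[start:start + length])
    masked = " ".join(words[:start] + [marker] + words[start + length:])
    return masked, removed


# Hypothetical usage: mask the two-word term starting at word index 1.
masked, answer = mask_span("The Eiffel Tower is located in Paris", 1, 2)
```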
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 4-6, 10, 13 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Non-Patent Literature “SpanBERT: Improving Pre-Training by Representing and Predicting Spans” by Joshi et al. (“Joshi”) in view of Non-Patent Literature “BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding” by Devlin et al. (“Devlin”).
As to claims 2, 13, Joshi discloses the method of claim 1 and the system of claim 12 (see rejection of claims 1 and 12). Joshi does not expressly disclose pretraining the machine learning model using one or more pretraining tasks selected from the group consisting of multi-wordpiece cloze and next sentence prediction. Joshi does, however, provide background on Devlin’s BERT, which discloses pretraining the machine learning model using one or more pretraining tasks selected from the group consisting of multi-wordpiece cloze and next sentence prediction [page 2, section 2].
In the same or similar field of invention, Devlin discloses pretraining the machine learning model using one or more pretraining tasks selected from the group consisting of multi-wordpiece cloze and next sentence prediction [Devlin 3.3.1-3.3.2].
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Joshi to include the feature of pretraining the machine learning model using one or more pretraining tasks selected from the group consisting of multi-wordpiece cloze and next sentence prediction, as taught by Devlin. The suggestion/motivation would have been to provide a pre-trained BERT representation that can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks [Devlin Abstract].
As to claims 4, 15, Joshi discloses the method of claim 3 (see rejection of claim 3). Devlin discloses wherein generating the span selection training data set further includes selecting a plurality of passages from the text corpus that are similar to the masked sentence [Devlin sections 3.3.1-3.3.2, 3.4]. In addition, the same motivation is used as in the rejection of claims 2 and 13.
As to claims 5, 16, Devlin discloses wherein generating the span selection training data set further includes selecting a most similar passage that includes the masked portion from the plurality of passages [Devlin sections 3.3.1-3.3.2, 3.4, 4.1]. In addition, the same motivation is used as in the rejection of claims 2 and 13.
As to claims 6, 17, Devlin discloses wherein generating the span selection training data set further includes pairing the masked sentence with the selected passage as one element of the span selection training data set [Devlin sections 3.2, 3.4, 4.1]. In addition, the same motivation is used as in the rejection of claims 2 and 13.
As to claim 10, Devlin discloses wherein masking the portion of the selected sentence comprises replacing the portion of the selected sentence with a placeholder that indicates a location where one or more words are removed [Devlin sections 3.3.1-3.3.2]. In addition, the same motivation is used as in the rejection of claim 2.
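For illustration only, the limitations addressed for claims 4-6 and 15-17 (selecting passages similar to the masked sentence, choosing the most similar passage that includes the masked portion, and pairing the two as one training element) can be sketched as follows. This is a minimal sketch under stated assumptions and not the method of Devlin, Joshi, or the application: the word-overlap (Jaccard) similarity, the function names `jaccard` and `build_example`, and the dictionary keys are all hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; an assumed stand-in for any retrieval score."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def build_example(masked_sentence: str, answer: str, passages: list):
    """Pair a masked sentence with the most similar passage containing the answer.

    Returns None when no passage contains the removed term.
    """
    # Keep only passages that include the masked (removed) portion.
    candidates = [p for p in passages if answer.lower() in p.lower()]
    if not candidates:
        return None
    # Choose the candidate most similar to the masked sentence.
    best = max(candidates, key=lambda p: jaccard(masked_sentence, p))
    return {"query": masked_sentence, "passage": best, "answer": answer}
```

In this sketch, a passage that does not contain the removed term is never selected, however similar it is to the masked sentence.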
Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Joshi (as applied above) in view of U.S. Patent Application Publication No. 2021/0248268 to ARDHANARI et al. (“Ardhanari”).  
As to claim 9, Joshi discloses the method of claim 3 (see rejection of claim 3).
Joshi does not expressly disclose wherein the multiple words are consecutive and have a total number of characters in a predetermined length range.   
In the same or similar field of invention, Ardhanari discloses wherein the multiple words are consecutive and have a total number of characters in a predetermined length range [Ardhanari paragraph 0185, Figs. 11-12].
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Joshi to include the feature wherein the multiple words are consecutive and have a total number of characters in a predetermined length range, as taught by Ardhanari. The suggestion/motivation would have been to provide a system and a method to analyze biomedical (and other types of) data by computational processes under the constraint of maintaining the privacy of the individual patient or consumer. Such a system and methods will consequently be of great commercial, social and scientific benefit to society [Ardhanari paragraph 0005].
As to claim 20, Joshi discloses indicating a location where one or more words are removed [Fig. 1, also see section 2]. Further, Ardhanari discloses wherein the multiple words are consecutive and have a total number of characters in a predetermined length range [Ardhanari paragraph 0185, Figs. 11-12]. In addition, the same motivation is used as in the rejection of claim 9.
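For illustration only, the limitation addressed for claims 9 and 20 (multiple consecutive words whose total number of characters falls in a predetermined length range) can be sketched as follows. This is a minimal sketch and not the method of Ardhanari, Joshi, or the application. The function name `spans_in_length_range` is hypothetical, and counting the single spaces between words toward the total is an assumption of this sketch.

```python
def spans_in_length_range(sentence: str, min_chars: int, max_chars: int,
                          min_words: int = 2):
    """List runs of consecutive words whose character count is in range.

    Each result is (start_index, end_index_exclusive, text). The count
    includes the single spaces between words (an assumption of this sketch).
    """
    words = sentence.split()
    spans = []
    for i in range(len(words)):
        # Only consider runs of at least `min_words` consecutive words.
        for j in range(i + min_words, len(words) + 1):
            text = " ".join(words[i:j])
            if min_chars <= len(text) <= max_chars:
                spans.append((i, j, text))
    return spans
```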
Allowable Subject Matter
Claims 7 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-7, 9-18 and 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTIM G SHAH whose telephone number is (571)270-5214. The examiner can normally be reached Mon-Fri 7:30am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANTIM G SHAH/Primary Examiner, Art Unit 2652