Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Drawings
Color photographs and color drawings are not accepted in utility applications unless a petition filed under 37 CFR 1.84(a)(2) is granted. Any such petition must be accompanied by the appropriate fee set forth in 37 CFR 1.17(h), one set of color drawings or color photographs, as appropriate, if submitted via EFS-Web or three sets of color drawings or color photographs, as appropriate, if not submitted via EFS-Web, and, unless already present, an amendment to include the following language as the first paragraph of the brief description of the drawings section of the specification:
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Color photographs will be accepted if the conditions for accepting color drawings and black and white photographs have been satisfied. See 37 CFR 1.84(b)(2).

The examiner notes that the petition filed on 9/11/2020 was dismissed.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1, 2, and 4-6 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of copending Application No. 17/018,372 (hereinafter ‘372) in view of Chen et al., “Big Self-Supervised Models are Strong Semi-Supervised Learners,” 17 June 2020.
This is a provisional nonstatutory double patenting rejection.

Re claim 1, claim 1 of ‘372 discloses a computer-implemented method, the method comprising: obtaining a training image (see obtaining step); performing a plurality of first augmentation operations on the training image to obtain a first augmented image (see performing a plurality step); separate from performing the plurality of first augmentation operations, performing a plurality of second augmentation operations on the training image to obtain a second augmented image (see separate from step); respectively processing, with a base encoder neural network, the first augmented image and the second augmented image to respectively generate a first intermediate representation for the first augmented image and a second intermediate representation for the second augmented image (see respectively processing with base encoder step);

respectively processing, with a projection head neural network, the first intermediate representation and the second intermediate representation to respectively generate a first projected representation for the first augmented image and a second projected representation for the second augmented image (see respectively processing with the projection head step); evaluating a loss function that evaluates a difference between the first projected representation and the second projected representation (see evaluating step); modifying one or more values of one or more parameters of one or both of the base encoder neural network and the projection head neural network based at least in part on the loss function (see modifying step).
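For illustration only, the forward pass and loss evaluation recited in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of ‘372 or of the present application: the augmentations (flip and brightness jitter), the layer sizes, the random weights, and the cosine-based `agreement_loss` are all hypothetical choices. A SimCLR-style contrastive loss would also compare against views of other images, and the parameter update (the modifying step) is omitted here.

```python
import math
import random

def augment(image, rng):
    """Hypothetical augmentation chain: random flip, then brightness jitter."""
    out = list(image)
    if rng.random() < 0.5:
        out.reverse()
    scale = rng.uniform(0.8, 1.2)
    return [p * scale for p in out]

def random_matrix(out_dim, in_dim, rng):
    """Weight matrix stored as out_dim rows of in_dim values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def linear(x, weights):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

def forward(image, encoder_w, head_w1, head_w2):
    """Return (intermediate representation, projected representation)."""
    h = relu(linear(image, encoder_w))             # base encoder output
    z = linear(relu(linear(h, head_w1)), head_w2)  # projection head output
    return h, z

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def agreement_loss(z1, z2):
    """Assumed loss in [0, 2]: 0 when the two projections point the same way."""
    return 1.0 - cosine_similarity(z1, z2)

rng = random.Random(0)
image = [0.1, 0.5, 0.9, 0.3]  # toy stand-in for a training image
params = (random_matrix(6, 4, rng), random_matrix(5, 6, rng), random_matrix(3, 5, rng))
view1 = augment(image, rng)   # first augmented image
view2 = augment(image, rng)   # second augmented image, produced separately
_, z1 = forward(view1, *params)
_, z2 = forward(view2, *params)
loss = agreement_loss(z1, z2)
```

In a real training loop the loss value would drive a gradient-based update of the encoder and projection head weights.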




Claim 1 of ‘372 does not expressly disclose:
performing semi-supervised contrastive learning of visual representations, 
training images in a set of one or more unlabeled training images
projection head neural network comprising a plurality of layers 
after said modifying, generating an image classification model from the base encoder neural network and the projection head neural network, the image classification model comprising some but not all of the plurality of layers of the projection head neural network; and performing fine-tuning of the image classification model based on a set of labeled images.

Chen discloses 
performing semi-supervised contrastive learning of visual representations (see abstract), 
training images in a set of one or more unlabeled training images (see abstract)
projection head neural network comprising a plurality of layers (see figure 3; note that the projection head has multiple layers),
after said modifying, generating an image classification model from the base encoder neural network and the projection head neural network, the image classification model comprising some but not all of the plurality of layers of the projection head neural network (see page 4 section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR); 
and performing fine-tuning of the image classification model based on a set of labeled images (see page 4 section entitled fine tuning).
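For illustration only, the idea quoted from Chen of keeping part of the projection head when building the classification model can be sketched as follows. This is a hedged sketch, not Chen's implementation: `build_classifier`, `random_matrix`, the layer sizes, and the choice of `keep=1` are hypothetical, and each layer is represented only by a weight matrix.

```python
import random

def random_matrix(out_dim, in_dim, rng):
    """Weight matrix stored as out_dim rows of in_dim values."""
    return [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def build_classifier(encoder_layers, projection_head_layers, keep, num_classes, rng):
    """Stack the encoder, the first `keep` projection head layers, and a new
    classification layer. `keep` must select some but not all head layers."""
    if not 0 < keep < len(projection_head_layers):
        raise ValueError("keep must select some but not all projection head layers")
    kept = projection_head_layers[:keep]
    in_dim = len(kept[-1])  # row count of the last kept layer is its output width
    classification_layer = random_matrix(num_classes, in_dim, rng)
    return encoder_layers + kept + [classification_layer]

rng = random.Random(0)
encoder = [random_matrix(8, 4, rng)]  # toy one-layer base encoder
head = [random_matrix(8, 8, rng),     # projection head: first layer
        random_matrix(8, 8, rng),     # hidden layer
        random_matrix(3, 8, rng)]     # output layer
model = build_classifier(encoder, head, keep=1, num_classes=10, rng=rng)
```

The kept layers are shared by reference, so fine-tuning the resulting model on labeled images would continue from the pretrained weights rather than from a fresh initialization.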

The motivation to combine is that using a big (deep and wide) neural network for self-supervised pretraining and fine-tuning greatly improves accuracy. Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Chen and ‘372 to reach the aforementioned advantage.


Re claim 2, Chen further discloses after performing the fine-tuning, performing distillation training using the set of unlabeled training images, wherein the distillation training distills the image classification model to a student model comprising a relatively smaller number of parameters relative to the image classification model (see page 4, section entitled “Self-training / knowledge distillation via unlabeled examples”; note that a smaller architecture is used; see also section 3.3). The motivation to combine is that using a big (deep and wide) neural network for self-supervised pretraining and fine-tuning greatly improves accuracy. Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Chen and ‘372 to reach the aforementioned advantage.
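For illustration only, distillation on an unlabeled example is commonly expressed as a cross-entropy between the teacher's softened predictions and the student's predictions. The sketch below assumes that formulation; the function names, the toy logits, and the temperature value are hypothetical and are not taken from the application.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-probabilities at the given temperature."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - log_total for v in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy of student predictions against teacher soft labels.
    No ground-truth label is needed, so unlabeled images can be used."""
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_logits, temperature)]
    student_log_probs = log_softmax(student_logits, temperature)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))

teacher = [2.0, 0.0, -1.0]           # toy logits from the fine-tuned model
agreeing_student = [2.0, 0.0, -1.0]  # student that matches the teacher
opposed_student = [-1.0, 0.0, 2.0]   # student that disagrees with the teacher
low = distillation_loss(teacher, agreeing_student)
high = distillation_loss(teacher, opposed_student)
```

The loss is smallest when the student reproduces the teacher's distribution, which is why minimizing it transfers the larger model's behavior to a smaller student.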


Re claim 4, Chen further discloses wherein the plurality of layers of the projection head neural network comprise an initial layer, an output layer, and one or more hidden layers between the initial layer and the output layer (see figure 3; note that the projection head contains three layers: the first is an input layer, the second is a hidden layer, and the third is an output layer), and wherein the some but not all of the plurality of layers of the projection head neural network comprise at least the initial layer of the projection head neural network (see page 4 section 

Re claim 5, Chen further discloses wherein the plurality of layers of the projection head neural network comprise an initial layer, an output layer, and one or more hidden layers between the initial layer and the output layer (see figure 3; note that the projection head contains three layers: the first is an input layer, the second is a hidden layer, and the third is an output layer), and wherein the some but not all of the plurality of layers of the projection head neural network further comprises at least one of the one or more hidden layers of the projection head neural network (see page 4, section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR; note that fine-tuning from the middle layer would include the input and middle layers). The motivation to combine is that using a big (deep and wide) neural network for self-supervised pretraining and fine-tuning greatly improves accuracy. Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Chen and ‘372 to reach the aforementioned advantage.



Claim 3 is provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of copending Application No. 17/018,372 (hereinafter ‘372) in view of Chen et al., “Big Self-Supervised Models are Strong Semi-Supervised Learners,” 17 June 2020, in further view of Lin et al., US 2016/0328644.

Re claim 3, ‘372 and Chen do not disclose deploying the student model to one or more computing devices after the distillation training. Lin discloses deploying the student model to one or more computing devices after the distillation training (see paragraph 26; note that the ANN is downloaded to a mobile device; see paragraph 67; note that the student network is used for mobile applications). The motivation is that the student network is preferred on mobile applications due to its size (see paragraph 67). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine ‘372, Chen, and Lin to reach the aforementioned advantage.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 5 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The examiner notes that claim 5 includes the claim language “wherein the plurality of layers of the projection head neural network comprise an initial layer, an output layer, and one or more hidden layers between the initial layer and the output layer.” This claim language is repeated from claim 4, and it is not clear whether “an initial layer, an output layer, and one or more hidden layers between the initial layer and the output layer” are a second iteration of these elements or refer back to the features already defined in claim 4.




Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 13-15 and 18-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Chen et al., “Big Self-Supervised Models are Strong Semi-Supervised Learners,” 17 June 2020. The examiner notes that while this reference is authored by the joint inventors by another.




Re claim 13, Chen discloses:

A computer-implemented method for performing semi-supervised contrastive learning (see abstract), the method comprising: 
performing contrastive learning based on a set of one or more unlabeled training data examples (see page 3, section entitled “Self-supervised pretraining with SimCLRv2”); 
generating an image classification model based on a base encoder neural network used in performing the contrastive learning and based on some but not all of a plurality of layers in a projection head neural network used in performing the contrastive learning (see page 4 section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR); 
performing fine-tuning of the image classification model based on a set of one or more labeled training data (see page 4 section entitled fine tuning); 
and after performing the fine-tuning of the image classification model, performing distillation training using the set of unlabeled training data examples, the distillation training distilling the image classification model to a student model comprising a relatively smaller number of parameters relative to the image classification model (see page 4 section entitled 

Re claim 14, Chen discloses wherein the fine-tuning of the image classification model is performed using the some but not all of the plurality of layers of the projection head neural network (see page 4, section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR).

Re claim 15, Chen discloses wherein the fine-tuning of the image classification model is performed using the some but not all of the plurality of layers of the projection head neural network and the base encoder neural network (see page 4, section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR).

Re claim 18, Chen discloses wherein the some but not all of the plurality of layers of the projection head neural network comprise one or more non-input layers of the projection head neural network (see page 4, section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR).

Re claim 19 Chen discloses wherein the some but not all of the plurality of layers of the projection head neural network comprise a non-input first layer of the projection head neural 

Re claim 20, Chen discloses wherein the projection head neural network comprises at least three layers (see figure 3; note that the projection head has three layers).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., “Big Self-Supervised Models are Strong Semi-Supervised Learners,” 17 June 2020, in view of Lin et al., US 2016/0328644.

Re claim 16, Chen discloses all the elements of claim 13. Chen does not expressly disclose deploying the student model to one or more client computing devices after performing the distillation training. Lin discloses deploying the student model to one or more client computing devices after performing the distillation training (see paragraph 26; note that the ANN is downloaded to a mobile device; see paragraph 67; note that the student network is used for mobile applications). The motivation is that the student network is preferred on mobile applications due to its size (see paragraph 67). Therefore it would have been obvious to one of 

Re claim 17, Lin further discloses wherein the one or more client computing devices comprise at least one mobile device (see paragraph 26; note that the ANN is downloaded to a mobile device; see paragraph 67; note that the student network is used for mobile applications).


Allowable Subject Matter
Claims 7-12 are allowed.
Re claim 7 Chen discloses  
A computing system for performing semi-supervised contrastive learning of visual representations (see abstract), the computing system comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store (see abstract; note that this is a computer vision application intended to be executed by a computer): an image classification model comprising a base encoder neural network, one or more projection head neural network layers (see page 4, section entitled fine tuning: Instead of throwing it all away, we propose to incorporate part of the MLP projection head into the base encoder during the fine-tuning. This is equivalent to fine-tuning from a middle layer of the projection head, instead of the input layer of the projection head as in SimCLR), wherein the base encoder neural network and the one or more projection head neural network layers have been pretrained using contrastive learning based on a set of one or more unlabeled visual data (see page 3, section entitled “Self-supervised pretraining with SimCLRv2”), and wherein the one or more projection head neural network layers comprise some but not all of a plurality of projection head neural network layers from a projection head neural network employed during said contrastive learning (see page 4 section entitled fine tuning: Instead of throwing it all away, we propose to 


Chen does not expressly disclose: an image classification model comprising a base encoder neural network, one or more projection head neural network layers, and a classification head; performing distillation training using the one or more projection head neural network layers pretrained using contrastive learning.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN T MOTSINGER whose telephone number is (571)270-1237. The examiner can normally be reached 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached on (571)272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/SEAN T MOTSINGER/Primary Examiner, Art Unit 2669