Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement filed 01/08/2020 fails to comply with the provisions of 37 CFR 1.97, 1.98 and MPEP § 609 because references 1 and 3 are missing the month and year of publication and because reference 3 is missing the place of publication. See MPEP § 609.04(a)(I). Examiner respectfully requests that Applicant resubmit the IDS with this information included.
Applicant is advised that the date of any re-submission of any item of information contained in this information disclosure statement or the submission of any missing element(s) will be the date of submission for purposes of determining compliance with the requirements based on the time of filing the statement, including all certification requirements for statements under 37 CFR 1.97(e).  See MPEP § 609.05(a).
The information disclosure statements (IDS) submitted on 08/03/2018 and 03/16/2020 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings are objected to because, in Fig. 1 at numeral 128, “D/C” should read only “C”.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 
Fig. 2, numerals 244(a)-(n)
Fig. 4, numeral 454 
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should 
Specification
The abstract of the disclosure is objected to because of a minor informality. In the first sentence, “An system” should read “A system”.  Correction is required.  See MPEP § 608.01(b).
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The disclosure is objected to because of the following informalities: It is unclear what “[1-4]” refers to in the last sentence of para. [0034]. Appropriate correction is required.
Claim Objections
Claims 7 and 21 are objected to because of the following minor informalities: 
In claim 7, line 1, “claim1” should read “claim 1”
In claim 21, third-to-last line, insert “and” after “neural network;”
Appropriate correction is required.
Claim Rejections - 35 USC § 112
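The following is a quotation of 35 U.S.C. 112(b):
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
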
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 9 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term "similar" in claim 9, line 2 is a relative term which renders the claim indefinite. The term "similar" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examining purposes, Examiner is interpreting “similar” as “similar within a predefined threshold”. Amending claim 9 accordingly would weigh in favor of overcoming this rejection.
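One way to picture the interpretation “similar within a predefined threshold” is an element-wise comparison of two weight sets. The sketch below is illustrative only: the function name `weights_similar`, the use of a maximum absolute difference as the measure, and the threshold values are assumptions that do not come from the claims or the specification.

```python
def weights_similar(first, second, threshold):
    """Return True if every pair of corresponding weights differs by at most threshold."""
    if len(first) != len(second):
        return False
    return all(abs(a - b) <= threshold for a, b in zip(first, second))


# Hypothetical weight values chosen only for illustration.
first_set = [0.10, -0.40, 0.25]
second_set = [0.12, -0.38, 0.25]

close = weights_similar(first_set, second_set, threshold=0.05)   # loose threshold
far = weights_similar(first_set, second_set, threshold=0.001)    # strict threshold
```

Under this reading, the same two weight sets are “similar” or not depending solely on the chosen threshold, which is why a stated threshold gives the term an ascertainable scope.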
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 21 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
identifying one or more first layers; (But for the recitation of a neural network, this limitation is a mental process which can reasonably be performed in the mind with the aid of pencil and paper.)
parsing the one or more first layers into a fixed portion and a programmable portion; (But for the recitation of a neural network, this limitation is a mental process which can reasonably be performed in the mind with the aid of pencil and paper. In Fig. 3, the layers are parsed into fixed layers 352 and programmable layers 358, 360.)
generating one or more first maps as a function of the input data and the fixed portion of the one or more first layers; (But for the recitation of a neural network, this limitation is a mathematical process. This limitation is also a mental process because mapping input data can reasonably be performed in the mind with the aid of pencil and paper.)
generating one or more second maps as a function of the input data and the programmable portion of the one or more first layers; (But for the recitation of a neural network, this limitation is a mathematical process. This limitation is also a mental process because mapping input data can reasonably be performed in the mind with the aid of pencil and paper.)
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim includes the following additional elements:
accessing input data; 
accessing a neural network; 
[layers of] the neural network
utilizing the one or more first maps and the one or more second maps with subsequent layers
	Accessing input data and accessing a neural network are insignificant extra-solution activities under MPEP § 2106.05(g) because they amount to mere data gathering. The neural network is not a meaningful limitation under MPEP § 2106.05(e) because it is associated with the description of the layers used in judicial exceptions (i.e., “layers of the neural network”). 
	The “utilizing” limitation is interpreted as associating or providing the maps to the layers of the neural network, which amounts to insignificant extra-solution activity under MPEP § 2106.05(g). Specification para. [0073] reads: “The first map(s), such as intermediate F maps and the second map(s), such as concatenate F maps are provided to subsequent neural network layer(s) (618).” Specification para. 101 repeats the “utilizing” limitation verbatim. To integrate the judicial exception into a practical application, the neural network must explicitly benefit from the claim limitations, such as by processing data. Amending the claim along these lines would weigh in favor of eligibility.
	Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because accessing input data, accessing a neural network, and utilizing the one or more first maps and the one or more second maps with subsequent layers amount to no more than insignificant extra-solution activity under MPEP § 2106.05(g); and because the neural network is not a meaningful limitation under MPEP § 2106.05(e). The claim is not patent eligible.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
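A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.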




Claim(s) 1-21 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by non-patent literature “How transferable are features in deep neural networks?” (Nov. 6, 2014) to Yosinski et al., hereinafter Yosinski.
Regarding Claim 1, Yosinski teaches by Fig. 1 (below) and its caption: A method comprising: 
accessing a first neural network that has one or more layers; (Neural network baseA)
training the first neural network on a first dataset; (baseA is trained on dataset A)
generating a first set of weights associated with one or more layers of the first neural network, the first set of weights based on the training of the first neural network on the first dataset; (Weights $W_{A1}$ to $W_{A8}$ of baseA are generated by the training on dataset A.)
accessing a second dataset; (dataset B)
modifying selected ones of the first set of weights to generate a second set of weights, based on the second dataset; and (The first set of weights $W_{A1}$ to $W_{A8}$ from baseA are copied to neural network A3B+ where weights $W_{A4}$ to $W_{A8}$ are randomized, and where all layers may learn. See Yosinski p. 3, second bullet: “A transfer network A3B: the first 3 layers are copied from baseA and frozen. The five higher layers (4–8) are initialized randomly and trained toward dataset B. Intuitively, here we copy the first 3 layers from a network trained on dataset A and then learn higher layer features on top of them to classify a new target dataset B.” And fourth bullet: “A transfer network A3B+: just like A3B, but where all layers learn.”
The claim limitation “based on the second dataset” is interpreted as being in preparation for training using dataset B.)
utilizing the second set of weights to train a second neural network. (Network A3B+ is trained on the second dataset)

Yosinski Fig. 1
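For orientation, the A3B and A3B+ constructions quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not Yosinski's implementation: each layer is reduced to a short list of floats, the helper names (`make_base_network`, `build_transfer_network`, `fine_tune`) are hypothetical, and the "training" step is a placeholder update that only marks which layers change.

```python
import random

NUM_LAYERS = 8  # the networks in Yosinski Fig. 1 have eight layers


def make_base_network(rng, width=4):
    """Stand-in for baseA: one weight list per layer (W_A1 .. W_A8)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(width)] for _ in range(NUM_LAYERS)]


def build_transfer_network(base, n_copied, rng):
    """Copy the first n_copied layers from base; re-initialize the rest randomly."""
    copied = [list(layer) for layer in base[:n_copied]]
    fresh = [[rng.uniform(-1.0, 1.0) for _ in layer] for layer in base[n_copied:]]
    return copied + fresh


def fine_tune(network, trainable, step=0.01):
    """Placeholder for training on dataset B: nudge every trainable layer."""
    return [
        [w - step for w in layer] if i in trainable else list(layer)
        for i, layer in enumerate(network)
    ]


rng = random.Random(0)
base_a = make_base_network(rng)
a3b_init = build_transfer_network(base_a, n_copied=3, rng=rng)

# A3B: layers 1-3 stay frozen, layers 4-8 learn.
a3b = fine_tune(a3b_init, trainable=set(range(3, NUM_LAYERS)))
# A3B+: same initialization, but all eight layers learn.
a3b_plus = fine_tune(a3b_init, trainable=set(range(NUM_LAYERS)))
```

In this sketch `a3b[:3]` still equals `base_a[:3]`, while every layer of `a3b_plus` differs from its initialization, which mirrors the frozen versus fine-tuned distinction in the quoted bullets.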
Regarding claim 2, Yosinski teaches: The method as claimed in claim 1, further comprising identifying similarities of the first set of weights and the second set of weights. (This limitation is broadly interpreted as the weights $W_{A1}$ to $W_{A3}$ being identical in networks baseA and A3B+.)

Regarding claim 3, Yosinski teaches: The method as claimed in claim 1, further comprising determining that the first dataset is the same domain as the second dataset. (Yosinski teaches that the first and second datasets are images, which is a domain of data (P. 2, second-to-last paragraph: “To create base and target datasets that are similar to each other, we randomly assign half of the 1000 ImageNet classes to A and half to B.”). Yosinski teaches that “ImageNet contains clusters of similar classes, particularly dogs and cats”, which are domains of data.)

Regarding claim 4, Yosinski teaches: The method as claimed in claim 1 further comprising identifying one or more programmable layers. (Programmable layers are interpreted as layers whose weights may be updated. Layers 4 to 8 of network A3B+ read on this limitation. On p. 2, largest middle paragraph, Yosinski calls these layers fine-tuned layers.)

Regarding claim 5, Yosinski teaches: The method as claimed in claim 1, further comprising: identifying one or more layers in the first neural network; and utilizing the identified one or more layers in the first neural network in the second neural network. (Yosinski teaches identifying layers 1-3 from network baseA and copying them to (utilizing them in) network A3B+.)

Regarding claim 6, Yosinski teaches: The method as claimed in claim 5, further comprising updating one or more weights associated with the one or more layers of the first neural network. (Yosinski teaches that weights $W_{A1}$ to $W_{A3}$ in layers 1-3 of network A3B+ are trained (updated) by dataset B.)

Regarding claim 7, Yosinski teaches: The method as claimed in claim1 [sic], further comprising: identifying selected layers of the first neural network having particular connectivity properties; and (Yosinski teaches layers 1-3 of network baseA and AnB+ are connected.)
updating the identified selected layers of the first neural network based on the second neural network. (Layers 1-3 in network A3B+ are updated during training.)

Regarding claim 8, Yosinski teaches by Fig. 1 and its caption: An apparatus comprising: 
a first neural network that has one or more layers; (Neural network baseA)
a first dataset that is used to train the first neural network; (baseA is trained on dataset A)
a first set of weights associated with one or more layers of the first neural network, the first set of weights generated based on the training of the first neural network on the first dataset; (Weights $W_{A1}$ to $W_{A8}$ of baseA are generated by the training on dataset A.)
a second dataset; (dataset B)
a second set of weights generated by modifying selected ones of the first set of weights and the second set of weights based on the second dataset; and (The first set of weights $W_{A1}$ to $W_{A8}$ from baseA are copied to neural network A3B+ where weights $W_{A4}$ to $W_{A8}$ are randomized, and where all layers may learn. See Yosinski p. 3, second bullet: “A transfer network A3B: the first 3 layers are copied from baseA and frozen. The five higher layers (4–8) are initialized randomly and trained toward dataset B. Intuitively, here we copy the first 3 layers from a network trained on dataset A and then learn higher layer features on top of them to classify a new target dataset B.” And fourth bullet: “A transfer network A3B+: just like A3B, but where all layers learn.”
The claim limitation “based on the second dataset” is interpreted as being in preparation for training using dataset B.)
a second neural network trained based on the second set of weights. (Network A3B+ is trained on the second dataset)

Regarding claim 9, Yosinski teaches: The apparatus as claimed in claim 8, where the first set of weights and the second set of weights are similar. (This limitation is broadly interpreted as the weights $W_{A1}$ to $W_{A3}$ being identical in networks baseA and A3B+.)

Regarding claim 10, Yosinski teaches: The apparatus as claimed in claim 8, where the first dataset is the same domain as the second dataset. (Yosinski teaches that the first and second datasets are images, which is a domain of data (P. 2, second-to-last paragraph: “To create base and target datasets that are similar to each other, we randomly assign half of the 1000 ImageNet classes to A and half to B.”). Yosinski teaches that “ImageNet contains clusters of similar classes, particularly dogs and cats”, which are domains of data.)

Regarding claim 11, Yosinski teaches: The apparatus as claimed in claim 8, where the one or more layers include one or more programmable layers. (Programmable layers are interpreted as layers whose weights may be updated. Layers 4 to 8 of network A3B+ read on this limitation. On p. 2, largest middle paragraph, Yosinski calls these layers fine-tuned layers.)

Regarding claim 12, Yosinski teaches: The apparatus as claimed in claim 8, where the one or more layers include one or more layers in the first neural that are used in the second neural network. (Yosinski teaches layers 1-3 of network baseA are copied to (used in) network A3B+.)

Regarding claim 13, Yosinski teaches: The apparatus as claimed in claim 12, further comprising one or more weights associated with the one or more layers of the first neural network. (Yosinski teaches layers 1-3 of network A3B+ contain the same weights $W_{A1}$ to $W_{A3}$ as layers 1-3 of network baseA.)

Regarding claim 14, Yosinski teaches: The apparatus as claimed in claim 8, where selected layers of the first neural network have particular connectivity properties. (Yosinski teaches layers 1-3 of network baseA and AnB+ are connected.)

Regarding claim 15, Yosinski teaches by Fig. 1 and its caption: A system comprising: a memory; and a processor, coupled to the memory, that executes instructions stored in the memory, the instructions comprising: (Memory and instructions are implied by experimental results on p. 4 and by the Supplementary Materials at the end of the document. Processor is taught by Supplementary Materials, page 1, section A, ¶ 2: “NVidia K20 GPU”)
accessing a first neural network that has one or more layers; (Neural network baseA)
training the first neural network on a first dataset; (baseA is trained on dataset A)
generating a first set of weights associated with one or more layers of the first neural network, the first set of weights based on the training of the first neural network on the first dataset; (Weights $W_{A1}$ to $W_{A8}$ of baseA are generated by the training on dataset A.)
accessing a second dataset; (dataset B)
modifying selected ones of the first set of weights to generate a second set of weights, based on the second dataset; and (The first set of weights $W_{A1}$ to $W_{A8}$ from baseA are copied to neural network A3B+ where weights $W_{A4}$ to $W_{A8}$ are randomized, and where all layers may learn. See Yosinski p. 3, second bullet: “A transfer network A3B: the first 3 layers are copied from baseA and frozen. The five higher layers (4–8) are initialized randomly and trained toward dataset B. Intuitively, here we copy the first 3 layers from a network trained on dataset A and then learn higher layer features on top of them to classify a new target dataset B.” And fourth bullet: “A transfer network A3B+: just like A3B, but where all layers learn.”
The claim limitation “based on the second dataset” is interpreted as being in preparation for training using dataset B.)
utilizing the second set of weights to train a second neural network. (Network A3B+ is trained on the second dataset)

Regarding claim 16, Yosinski teaches: The system as claimed in claim 15, where the instructions further comprise identifying similarities of the first set of weights and the second set of weights. (This limitation is broadly interpreted as the weights $W_{A1}$ to $W_{A3}$ being identical in networks baseA and A3B+.)

Regarding claim 17, Yosinski teaches: The system as claimed in claim 15, where the instructions further comprise determining that the first dataset is the same domain as the second dataset. (Yosinski teaches that the first and second datasets are images, which is a domain of data (P. 2, second-to-last paragraph: “To create base and target datasets that are similar to each other, we randomly assign half of the 1000 ImageNet classes to A and half to B.”). Yosinski teaches that “ImageNet contains clusters of similar classes, particularly dogs and cats”, which are domains of data.)

Regarding claim 18, Yosinski teaches: The system as claimed in claim 15, where the instructions further comprise identifying one or more programmable layers. (Programmable layers are interpreted as layers whose weights may be updated. Layers 4 to 8 of network A3B+ read on this limitation. On p. 2, largest middle paragraph, Yosinski calls these layers fine-tuned layers.)

Regarding claim 19, Yosinski teaches: The system as claimed in claim 15, where the instructions further comprise: identifying one or more layers in the first neural network; and utilizing the identified one or more layers in the first neural network in the second neural network. (Yosinski teaches identifying layers 1-3 from network baseA and copying them to (utilizing them in) network A3B+.)

Regarding claim 20, Yosinski teaches: The system as claimed in claim 19, where the instructions further comprise updating one or more weights associated with the one or more layers of the first neural network. (Yosinski teaches that weights $W_{A1}$ to $W_{A3}$ in layers 1-3 of network A3B+ are trained (updated) by dataset B.)

Regarding claim 21, Yosinski teaches in Fig. 1 and its caption:  A method comprising: 
accessing input data; (dataset B)
accessing a neural network; (Network A3B)
identifying one or more first layers of the neural network; (Layers 1-6)
parsing the one or more first layers of the neural network into a fixed portion (Layers 1-3) and a programmable portion; (Layers 4-6)
generating one or more first maps as a function of the input data and the fixed portion of the one or more first layers of the neural network; (First maps are the outputs of layers 1-3)
generating one or more second maps as a function of the input data and the programmable portion of the one or more first layers of the neural network; (Second maps are the outputs of layers 4-6)
utilizing the one or more first maps and the one or more second maps with subsequent layers of the neural network. (Results from first and second maps are processed by subsequent layers 7-8)
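The mapping of claim 21 onto network A3B can be pictured with a toy forward pass. This is a sketch under assumptions, not the claimed or the cited implementation: each layer is a hypothetical scalar function (`make_layer`), the layers are chained sequentially, and the split points (layers 1-3 fixed, layers 4-6 programmable, layers 7-8 subsequent) are taken from the mapping above.

```python
def make_layer(scale):
    """Hypothetical layer: scales its input and applies a ReLU."""
    return lambda x: max(0.0, scale * x)


# Eight toy layers standing in for layers 1-8 of network A3B.
layers = [make_layer(s) for s in (0.5, 2.0, 1.5, 1.0, 0.5, 2.0, 1.0, 3.0)]

first_layers = layers[:6]          # "one or more first layers" (layers 1-6)
fixed_portion = first_layers[:3]   # layers 1-3, copied and frozen
programmable = first_layers[3:]    # layers 4-6, trainable
subsequent = layers[6:]            # layers 7-8


def run(portion, x):
    """Apply the layers of a portion in order and collect each layer's output."""
    outputs = []
    for layer in portion:
        x = layer(x)
        outputs.append(x)
    return outputs


input_data = 4.0
first_maps = run(fixed_portion, input_data)       # outputs of layers 1-3
second_maps = run(programmable, first_maps[-1])   # outputs of layers 4-6
final_output = run(subsequent, second_maps[-1])[-1]
```

Because the toy layers are chained, the second maps here depend on the input data only through the first maps, and only the last second map is handed to the subsequent layers.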



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Neural network transfer learning is taught by "Learning multiple visual domains with residual adapters" (Nov. 2017) to Rebuffi et al. and "Efficient parametrization of multi-domain deep neural networks" (March 2018) to Rebuffi et al. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648.  The examiner can normally be reached on Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ASHER H. JABLON/Examiner, Art Unit 2122                                                                                                                                                                                                        
/ERIC NILSSON/Primary Examiner, Art Unit 2122