DETAILED ACTION
1.	This communication is in response to Application No. 16/586,675 filed on September 27, 2019 in which claims 1-20 are presented for examination. 

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
3.	The information disclosure statement submitted on 01/02/2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Specification
4.	The abstract of the disclosure is objected to because the abstract exceeds 150 words.  Correction is required.  See MPEP § 608.01(b).

Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


6.	Claims 1-2, 8-14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (hereinafter Wu) (US PG-PUB 20180330205), in view of Oquab et al. (hereinafter Oquab) (“Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks”).
Regarding Claim 1, Wu teaches a method for training a target neural network on a target machine learning task, the method comprising: 
obtaining a target dataset for training the target neural network on the target machine learning task, the target dataset comprising a plurality of target training examples (Wu, Abstract, “A target neural network is trained to perform the image classification task in the target domain. The training is based on the plurality of pairs of task-irrelevant images.”, thus, a target neural network is trained on a target machine learning task (image classification) with the training based on a target dataset with a plurality of pairs of task-irrelevant images); 
obtaining a source dataset for training a source neural network on a source machine learning task, the source dataset comprising a plurality of source training examples (Wu, Par. [0035], “The flow diagram 100 shown in FIG. 1 includes training the source CNN 106 using synthetic rendering images 108. In an embodiment, the synthetic rendering images 108 include labeled depth data generated from CAD. In an embodiment, the source CNN 106 is trained with the synthetic rendering images 108 with the objective of the training being to recognize the class (or category) and the pose of the object in the image. The class and pose of the object are examples of discriminative abstract features in the depth domain.”, thus, a source neural network (CNN) is trained on a source machine learning task (image classification) with the training based on a plurality of source training examples, including synthetic rendering images); 
training the target neural network on the target machine learning task using the target dataset to obtain trained values of the feature layer parameters and the target classification parameters (Wu, Par. [0044-0045], “By integrating the source training pipeline and the target training pipeline together as shown in FIG. 5 to create a joint neural network, the task of transferring abstract features from the source domain to the target domain, and optimization over the target task objective can be achieved simultaneously. The output of the training as shown in FIG. 5 is two analytics pipelines one with the source modality and the other with the target modality. This output can be used to solve the task objective (i.e., to recognize the class, or category, as well as the pose of an object in an image) effectively, even though no task-relevant data from the target domain was used throughout the training process.”, thus, the target neural network is trained based on transfer learning and transferring the abstract features from the source to target domain – thus, obtaining feature and classification parameters).

Wu does not teach wherein each of the target neural network and the source neural network has the same feature neural network layers having feature layer parameters, the target neural network further comprises one or more target classification layers having target classification parameters, and the source neural network further comprises one or more source classification layers having source classification parameters; 
However, Oquab teaches wherein each of the target neural network and the source neural network has the same feature neural network layers having feature layer parameters (Oquab, Pg. 1719, Figure 2, which depicts that the source and target have the same neural network layers and feature parameters are transferred, such that they are the same between source and target domains), the target neural network further comprises one or more target classification layers having target classification parameters, and the source neural network further comprises one or more source classification layers having source classification parameters (Oquab, Pgs. 1718-1719, “The CNN architecture of [24] contains more than 60 million parameters. Directly learning so many parameters from only a few thousand training images is problematic. The key idea of this work is that the internal layers of the CNN can act as a generic extractor of mid-level image representation, which can be pre-trained on one dataset (the source task, here ImageNet) and then re-used on other target tasks (here object and action classification in Pascal VOC), as illustrated in Figure 2.”, therefore, as further depicted by Figure 2, the target and source networks both have one or more layers that comprise classification parameters that are transferred from source to target. Further, the target tasks include classification parameters used for object and action classification); 

Wu does not teach generating, from the source training examples in the source dataset, a pre-training dataset using the source dataset and the target dataset so that the pre-training dataset captures features that are relevant to the target dataset; 
However, Oquab teaches generating, from the source training examples in the source dataset, a pre-training dataset using the source dataset and the target dataset so that the pre-training dataset captures features that are relevant to the target dataset (Oquab, Pgs. 1718-1719, “The key idea of this work is that the internal layers of the CNN can act as a generic extractor of mid-level image representation, which can be pre-trained on one dataset (the source task, here ImageNet) and then re-used on other target tasks (here object and action classification in Pascal VOC), as illustrated in Figure 2.”, thus, a pre-training dataset using the source dataset and target dataset is generated and used. Further, the pre-training dataset captures features relevant to the target dataset, which is further supported by the caption on Figure 2 that states pre-trained parameters of the internal layers of the CNN are transferred to target tasks);

Wu does not teach training the source neural network on the source machine learning task using the pre-training dataset to obtain first values of the feature layer parameters and the source classification parameters; 
However, Oquab teaches training the source neural network on the source machine learning task using the pre-training dataset to obtain first values of the feature layer parameters and the source classification parameters (Oquab, Pg. 1719, Figure 2, “Figure 2: Transferring parameters of a CNN. First, the network is trained on the source task (ImageNet classification, top row) with a large amount of available labelled images. Pre-trained parameters of the internal layers of the network (C1-FC7) are then transferred to the target tasks (Pascal VOC object or action classification, bottom row). To compensate for the different image statistics (type of objects, typical viewpoints, imaging conditions) of the source and target data we add an adaptation layer (fully connected layers FCa and FCb) and train them on the labelled data of the target task.”, therefore, the source network is trained using the pre-training dataset to obtain values of the feature parameters and classification parameters in the 1. Feature Learning step);

Wu does not teach initializing the feature layer parameters of the target neural network using the first values of the feature layer parameters from the training of the source neural network; and 
However, Oquab teaches initializing the feature layer parameters of the target neural network using the first values of the feature layer parameters from the training of the source neural network (Oquab, Pg. 1719, “The parameters of layers C1... C5, FC6 and FC7 are first trained on the source task, then transferred to the target task and kept fixed. Only the adaptation layer is trained on the target task training data as described next.”, thus, the parameters are first trained with the source network and are then transferred to the target network for use); and
		
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method for training a target neural network on a target machine learning task, consisting of obtaining a target and source dataset and training the target neural network on the target machine learning task using the target dataset to obtain trained values, as disclosed by Wu to include the feature layer parameters, classification layer parameters, and use of a pre-training dataset, as disclosed by Oquab. One of ordinary skill in the art would have been motivated to make this modification to produce a method for training a target neural network on a target machine learning task based on feature and classification parameters and a pre-training dataset from a source domain, such that the target neural network does not require a very large dataset to learn relevant parameters – thus, creating a more efficient and robust network without the need for a large amount of data and training time (Oquab, Pgs. 1718-1719, “The CNN architecture of [24] contains more than 60 million parameters. Directly learning so many parameters from only a few thousand training images is problematic. The key idea of this work is that the internal layers of the CNN can act as a generic extractor of mid-level image representation, which can be pre-trained on one dataset (the source task, here ImageNet) and then re-used on other target tasks (here object and action classification in Pascal VOC), as illustrated in Figure 2. However, this is difficult as the labels and the distribution of images (type of objects, typical viewpoints, imaging conditions, etc.) in the source and target datasets can be very different, as illustrated in Figure 3. 
To address these challenges we (i) design an architecture that explicitly remaps the class labels between the source and target tasks (Section 3.1), and (ii) develop training and test procedures, inspired by sliding window detectors, that explicitly deal with different distributions of object sizes, locations and scene clutter in source and target tasks (Sections 3.2 and 3.3).”).
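For illustration only, the initializing step mapped to Oquab above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Wu, Oquab, or the instant application: the function name `init_target_from_source`, the dictionary layout, the toy layer sizes, and all numeric values are hypothetical. Only the layer names C1 and FC7 are taken from the Oquab passage quoted above.

```python
import copy
import random


def init_target_from_source(source_params, num_target_classes, feature_dim, seed=0):
    """Build target parameters: feature layers copied from the source network,
    classification layer freshly initialized for the target label set."""
    rng = random.Random(seed)
    return {
        # Feature-layer parameters start from the values learned on the source task.
        "features": copy.deepcopy(source_params["features"]),
        # Target classification parameters are new; the source head is not reused.
        "classifier": [
            [rng.gauss(0.0, 0.01) for _ in range(feature_dim)]
            for _ in range(num_target_classes)
        ],
    }


# Hypothetical trained source parameters (toy sizes, made-up values).
source_params = {
    "features": {"C1": [0.5, -0.2], "FC7": [0.1, 0.3]},
    "classifier": [[0.2, 0.4], [0.1, -0.3], [0.0, 0.7]],  # 3 source classes
}

target_params = init_target_from_source(source_params, num_target_classes=2, feature_dim=2)
```

In this sketch the feature-layer values are copied rather than shared, so later training of the target network would not alter the stored source values. Whether the copied layers are then kept fixed (as in the Oquab passage) or further trained (as recited in the claim) is a separate choice that the sketch does not address.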

Regarding Claim 2, Wu in view of Oquab teaches the method of claim 1, wherein each source training example in the source dataset comprises a source training input and a respective ground-truth source output, wherein the respective ground-truth source output belongs to a set of possible source outputs (Wu, Par. [0035], “The flow diagram 100 shown in FIG. 1 includes training the source CNN 106 using synthetic rendering images 108. In an embodiment, the synthetic rendering images 108 include labeled depth data generated from CAD. In an embodiment, the source CNN 106 is trained with the synthetic rendering images 108 with the objective of the training being to recognize the class (or category) and the pose of the object in the image.”, therefore, the source dataset comprises a source training input (synthetic rendering images) and a respective ground-truth source output as indicated by relevant labels), and 
wherein each target training example in the target dataset comprises a target training input and a respective ground-truth target output (Oquab, Pg. 1720, “Sampled image patches may contain one or more objects, background, or only a part of the object. To label patches in training images, we measure the overlap between the bounding box of a patch P and ground truth bounding boxes B of annotated objects in the image.”, thus, as shown in Figure 4, the target dataset comprises target training input and respective ground truth output).
The reasons of obviousness have been noted in the rejection of Claim 1 above and are applicable herein.

Regarding Claim 8, Wu in view of Oquab teaches the method of claim 1, wherein training the source neural network on the source machine learning task using the pre-training dataset to obtain the first values of the feature layer parameters and the source classification parameters comprises: 
adjusting values of the feature layer parameters and the source classification parameters to optimize a source objective function, wherein the source objective function measures an average performance of the source neural network on the source machine learning task given the source training examples in the pre-training dataset (Wu, Par. [0035], “The source CNN 106 can be used to produce a source representation 104, implemented for example as a one dimensional feature vector. As shown in FIG. 1, class and pose labels are input to the triplet loss 102 for use in supervising the training. The triplet loss 102 is an objective function that provides feedback that is used to adjust the source CNN 106.”, therefore, as shown in Figure 1, the triplet loss (label 102) is an objective function that is used to adjust the parameters of the source CNN. Since class and pose labels are input to the triplet loss and feedback is output, the objective function is able to measure the performance of the source neural network given the training examples).

Regarding Claim 9, Wu in view of Oquab teaches the method of claim 1, wherein training the target neural network on the target machine learning task using the target dataset to obtain trained values of the feature layer parameters and the target classification parameters comprises: 
adjusting values of the feature layer parameters and the target classification parameters to optimize a target objective function, wherein the target objective function measures an average performance of the target neural network on the target machine learning task given the target training examples in the target dataset (Wu, Par. [0043], “Turning now to FIG. 5, a flow diagram 500 illustrating a joint-training pipeline integrating a target task objective function (e.g., identify class and poses) and L2 loss enforcing the extraction of abstract features shared by both source and target domains is generally shown in accordance with one or more embodiments of the present invention. In an embodiment, the method shown in FIG. 5 is implemented using a computer such as computer 906 of FIG. 9 or computer 1101 of FIG. 11.”, therefore, similar to the source objective function disclosed in the rejection of Claim 8 above, a target task objective function and loss is also able to provide feedback on the inputted classes and poses, such that values within the target network can be adjusted based on the performance of the target network given the target training examples).

Regarding Claim 10, Wu in view of Oquab teaches the method of claim 1, wherein the source learning task (Wu, Par. [0035], “In an embodiment, the synthetic rendering images 108 include labeled depth data generated from CAD. In an embodiment, the source CNN 106 is trained with the synthetic rendering images 108 with the objective of the training being to recognize the class (or category) and the pose of the object in the image. The class and pose of the object are examples of discriminative abstract features in the depth domain.”, thus, the source network may have a task of identifying the category and pose of the object within the image) and the target machine learning task are different image classification tasks (Wu, Abstract, “A target neural network is trained to perform the image classification task in the target domain. The training is based on the plurality of pairs of task-irrelevant images.”, therefore, the target network may also consider pose of an object during training (disclosed in the rejection of Claim 9 above), but overall the task of the target neural network is to perform image classification).

Regarding Claim 11, Wu in view of Oquab teaches the method of claim 1, further comprising: 
using the trained target neural network to process a new input to generate a new output (Wu, Claim 1, “ […] performing the image classification task in the target domain, the performing including applying the target neural network to an image in the target domain and outputting an identified feature.”, thus, once trained, the target neural network may be used to process a new input and generate a new output by performing an image classification task).

Regarding Claim 12, Wu in view of Oquab teaches the method of claim 1, further comprising: 
providing the trained target neural network to a system that uses the trained neural network to process a new input to generate a new output (Wu, Claim 9, “[…] performing the image classification task in the target domain, the performing including applying the target neural network to an image in the target domain and outputting an identified feature.”, thus, a system comprising a memory, one or more processors, and instructions may also utilize the trained target neural network to process a new input to generate a new output by performing an image classification task).

Regarding Claim 13, Wu in view of Oquab teaches a system comprising one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations (Wu, Claim 9, “A system comprising: a memory having computer readable instructions; and one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising: receiving a request to perform an image classification task in a target domain, the image classification task including identifying a feature in images in the target domain;”, thus, a system comprising one or more computers and storage devices is disclosed).
	The rest of the claim language recites substantially the same limitations as Claim 1, in the form of a system, therefore it is rejected under the same rationale. 
The reasons of obviousness have been noted in the rejection of Claim 1 above and are applicable herein.

Claim 14 recites substantially the same limitations as Claim 2, in the form of a system, therefore it is rejected under the same rationale.

Regarding Claim 20, Wu in view of Oquab teaches one or more non-transitory computer-readable storage media encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations (Wu, Claim 17, “A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising: receiving a request to perform an image classification task in a target domain, the image classification task including identifying a feature in images in the target domain;”, thus, one or more computer readable storage media encoded with instructions is disclosed).
	The rest of the claim language recites substantially the same limitations as Claim 1, in the form of one or more non-transitory computer-readable storage media, therefore it is rejected under the same rationale. 
The reasons of obviousness have been noted in the rejection of Claim 1 above and are applicable herein.

7.	Claims 3-7 and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (hereinafter Wu) (US PG-PUB 20180330205), in view of Oquab et al. (hereinafter Oquab) (“Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks”), further in view of Sarkar et al. (hereinafter Sarkar) (US PG-PUB 20180349788).
Regarding Claim 3, Wu in view of Oquab teaches the method of claim 2.
Wu in view of Oquab does not teach wherein generating the pre-training dataset using the source dataset and the target dataset comprising: 
generating, for each source output in the set of possible source outputs, a respective importance weight based on the source dataset and the target training inputs, the respective importance weight indicating the importance of the source output in training the target neural network; and 
generating the pre-training dataset by sampling a set of source training examples from the source dataset based on the importance weights.
However, Sarkar teaches wherein generating the pre-training dataset using the source dataset and the target dataset comprising: 
generating, for each source output in the set of possible source outputs, a respective importance weight based on the source dataset and the target training inputs, the respective importance weight indicating the importance of the source output in training the target neural network (Sarkar, Fig. 3 & Fig. 4, which depict flowcharts on how respective weight history for both the source and target networks are generated during training, and then used to accelerate the training of the target network); and 
generating the pre-training dataset by sampling a set of source training examples from the source dataset based on the importance weights (Sarkar, Par. [0017], “The source network is a neural network that provides, during its own training period, the data used to generate training examples for the introspection network. The source neural network is also referred to as just the source network for the sake of brevity. The scalar weight may also be referred to as a parameter. The weights are parameters that a neural network uses in its mapping function to provide an output value given the inputs. Once trained, the introspection network can then be used to accelerate the training of an unseen network, or target network, by predicting the value of the weights several thousand steps into the future.”, thus, training examples are generated based on the source network training/training dataset – further, the weights are considered when generating training examples, as the value of the target network weights may be predicted based on the source network weights during training).

It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of training a target neural network on a target machine learning task as per Claim 1, as disclosed by Wu in view of Oquab to include the use of importance weights, as disclosed by Sarkar. One of ordinary skill in the art would have been motivated to make this modification to allow for the use of importance weights based on the source network/dataset to train the target neural network and enable the target network to reach convergence with fewer training rounds (Sarkar, Par. [0016], “Systems and methods use the introspection network to predict weights that are used during training of another neural network. The other network trained using the introspection network is referred to as the target neural network or just target network. Thus target network and target neural network both refer to a neural network to be trained using application of the introspection network at one or more training steps. The history of a weight may include the weight value at as few as four previous training steps. The introspection network propels training of the target network by enabling the target network to reach convergence (complete the training phase) with fewer training rounds, which can represent a savings of hours or days of computer processing time. The introspection network has a low memory footprint and can be used in conjunction with other optimizations. The introspection network can be used to accelerate training of different target networks, e.g., with different inputs, different configurations, different tasks, without retraining.”).
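For illustration only, the claimed step of generating the pre-training dataset by sampling source training examples based on per-output importance weights can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Wu, Oquab, Sarkar, or the instant specification: the function name `sample_pretraining_dataset`, the (input, label) tuple layout, the example labels, and the choice of sampling with replacement are all hypothetical.

```python
import random


def sample_pretraining_dataset(source_examples, importance_weights, num_samples, seed=0):
    """Draw source examples with probability proportional to the importance
    weight of each example's ground-truth source output (label)."""
    rng = random.Random(seed)
    # Each example inherits the weight assigned to its label; unknown labels get 0.
    per_example_weights = [
        importance_weights.get(label, 0.0) for _, label in source_examples
    ]
    # random.choices samples WITH replacement (an assumption of this sketch).
    return rng.choices(source_examples, weights=per_example_weights, k=num_samples)


# Hypothetical source dataset of (input, ground-truth source output) pairs.
source_examples = [("img0", "cat"), ("img1", "car"), ("img2", "cat"), ("img3", "car")]
# Hypothetical importance weights: "car" is treated as irrelevant to the target task.
weights = {"cat": 1.0, "car": 0.0}

pretraining_dataset = sample_pretraining_dataset(source_examples, weights, num_samples=5)
```

With these hypothetical weights, every sampled example carries the label "cat", because examples whose label has zero weight are never drawn.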

Regarding Claim 4, Wu in view of Oquab further in view of Sarkar teaches the method of claim 3, wherein generating, for each source output in the set of possible source outputs, a respective importance weight based on the source dataset and the target training inputs comprising: 
training a classifier neural network on the source dataset, wherein the classifier neural network is configured to receive an input and to generate for the input a respective output that belongs to the set of possible source outputs (Wu, Par. [0050], “As shown in FIG. 7, output from the source representation 714 and the target representation 104 concatenated representation 708, which is input to an RGB-D classifier 710 is trained using the softmax loss 712 as the objective function and supervised by the class label at training time. At testing time, there will be no softmax loss 712, and the RGB-D classifier 710 directly outputs the predicted class label.”, thus, a classifier (label 710) is trained based on the inputted source representation/dataset and is configured to receive an input and generate a respective output).

Regarding Claim 5, Wu in view of Oquab further in view of Sarkar teaches the method of claim 4, wherein generating, for each source output in the set of possible source outputs, a respective importance weight based on the source dataset and the target training inputs comprising: 
for each target training input in the target dataset, processing the target training input using the trained classifier neural network to generate a respective temporary predicted output for the target training input (Wu, Par. [0051], “Turning now to FIG. 8, a flow diagram 800 illustrating a testing time pipeline of fusing using both a source modality and a target modality is generally shown in accordance with one or more embodiments of the present invention. In an embodiment, the pipeline shown in FIG. 8 is implemented using a computer such as computer 906 of FIG. 9 or computer 1101 of FIG. 11. After learning the fusion analytics pipeline, the simulated target analytics pipeline can be changed back to the real target analytics pipeline when real data from the target domain, including task-irrelevant real RGB images 308 are available for input. As shown in FIG. 8, the RGB-D classifier 710 outputs a prediction, or class label.”, therefore, the trained classifier (label 710) is able to take input from the target network using the target representation/dataset to generate a predicted output); 
determining, for each source output in the set of possible source outputs, a respective first rate of appearance of the source output in a set of the temporary predicted outputs with respective to the target machine learning task; determining, for each source output in the set of possible source outputs, a respective second rate of appearance of the source output in the source dataset with respective to the source machine learning task (Oquab, Pg. 1720, “We employ a sliding window strategy and extract around 500 square patches from each image by sampling eight different scales on a regularly-spaced grid with at least 50% overlap between neighboring patches. […] Sampled image patches may contain one or more objects, background, or only a part of the object. To label patches in training images, we measure the overlap between the bounding box of a patch P and ground truth bounding boxes B of annotated objects in the image.”, therefore, a patch would be considered a sample of an image that is used in training. Further, under section 3.3 Classification, Formula (1), which is also shown below, is used to compute the overall score for an object $C_n$ in an image, where $y(C_n \mid P_i)$ is the output of the network for the class $C_n$ on image patch $P_i$ and M is the total number of patches in the image. Therefore, similar to what is explained as the “rate of appearance” in the instant application’s specification Pgs. 6-7 Equations 1-3 & Par. [34], the overall score of patches is able to compare the number of times an output appears according to a class within an image patch (number of times source output appears) to the total number of patches in the image (total number of source training examples in the source dataset) to determine the score for an object/rate of appearance of the object. Further, this calculation may be repeated for both source and target machine learning tasks, as mentioned on Pg. 1719 of the Oquab reference.)
Formula (1): $\mathrm{score}(C_n) = \frac{1}{M} \sum_{i=1}^{M} y(C_n \mid P_i)^{k}$; and
generating, for each source output, the respective importance weight based on the respective first rate of appearance and the respective second rate of appearance (Oquab, Pg. 1720, “As discussed above, the target task has an additional background label for patches that do not contain any object. One additional difficulty is that the training data is unbalanced: most patches from training images come from background. This can be addressed by re-weighting the training cost function, which would amount to re-weighting its gradients during training. We opt for a slightly different procedure and instead re-sample the training patches to balance the training data distribution. This resampled training set is then used to form mini-batches for the stochastic gradient descent training. This is implemented by sampling a random 10% of the training background patches.”, therefore, according to the rates of appearance in both source and target for each subsample training patch that is used, the training cost function can be re-weighted accordingly to account for how often an object appears throughout prediction and training – this can eliminate or re-weight patches that come from background and provide no relevance/benefit to the dataset)
The reasons of obviousness have been noted in the rejection of Claim 3 above and are applicable herein.
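
For illustration only, the calculation in Formula (1) and a weight derived from two rates of appearance can be sketched as follows. This is a minimal sketch and is not taken from Oquab or from the instant application: the function names are hypothetical, and the use of a target-to-source ratio as the importance weight is an assumption made here to show one way a weight could be "based on" the two rates.

```python
from typing import Dict, Sequence


def class_score(patch_outputs: Sequence[float], k: float = 1.0) -> float:
    """Formula (1): average over the M patches of y(C_n | P_i) raised to k."""
    m = len(patch_outputs)
    if m == 0:
        raise ValueError("at least one patch output is required")
    return sum(y ** k for y in patch_outputs) / m


def importance_weights(
    source_rates: Dict[str, float],
    target_rates: Dict[str, float],
    eps: float = 1e-12,
) -> Dict[str, float]:
    """ASSUMPTION: weight = target rate / source rate for each source output.

    The ratio form is illustrative only; eps guards against division by zero.
    """
    return {
        label: target_rates.get(label, 0.0) / max(rate, eps)
        for label, rate in source_rates.items()
    }


if __name__ == "__main__":
    # Hypothetical per-patch network outputs for one class C_n.
    print(class_score([0.5, 0.5], k=2))
    # Hypothetical rates of appearance in the source and target datasets.
    print(importance_weights({"cat": 0.5, "dog": 0.25}, {"cat": 0.25, "dog": 0.5}))
```

Under this assumed ratio, a source output that appears more often in the target data than in the source data receives a weight above 1, and one that appears less often receives a weight below 1.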

Regarding Claim 6, Wu in view of Oquab further in view of Sarkar teaches the method of claim 3, wherein the set of source training examples is sampled from the source dataset with replacement (Oquab, Pg. 1720, “We employ a sliding window strategy and extract around 500 square patches from each image by sampling eight different scales on a regularly-spaced grid with at least 50% overlap between neighboring patches.”, thus, in sampling with replacement, as stated in the instant application’s specification Par. [51], the system samples source training examples at a rate proportional to the importance weights computed before, repeating examples as needed. Similarly, there is a 50% overlap between neighboring patches, and as stated in the rejection of Claim 5 above, resampling is done as necessary, based on re-weighting the training cost function to account for importance weight and rate of appearance)
The reasons of obviousness have been noted in the rejection of Claim 3 above and are applicable herein.
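
For illustration only, sampling with replacement at a rate proportional to importance weights, with examples repeated as needed, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the per-example weights, and the seed parameter are hypothetical and do not come from the cited references or the specification.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_with_replacement(
    examples: Sequence[T],
    weights: Sequence[float],
    num_samples: int,
    seed: Optional[int] = None,
) -> List[T]:
    """Draw num_samples examples; each draw is independent, so repeats can occur.

    The probability of picking an example is proportional to its weight.
    """
    if len(examples) != len(weights):
        raise ValueError("examples and weights must have the same length")
    rng = random.Random(seed)
    return rng.choices(examples, weights=weights, k=num_samples)


if __name__ == "__main__":
    # Hypothetical source examples and importance weights.
    print(sample_with_replacement(["a", "b", "c"], [1.0, 2.0, 3.0], 10, seed=1))
```

Because each draw is independent, the number of samples may exceed the number of distinct source examples.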

Regarding Claim 7, Wu in view of Oquab further in view of Sarkar teaches the method of claim 3, wherein the set of source training examples is sampled from the source dataset without replacement (Wu, Par. [0025], “Contemporary approaches include different types of domain adaptation approaches such as, but not limited to: unsupervised domain adaptation, where a learning sample contains a set of labeled source examples, a set of unlabeled source examples, and an unlabeled set of target examples; semi-supervised domain adaptation that includes a small set of labeled target examples; and fully supervised domain adaptation, where all the examples considered are labeled.”, thus, in the case of training in the source and target domain of the Wu reference, training samples are not weighted accordingly (weights discussed by Sarkar and Oquab references), thus, re-sampling does not occur according to weights and rate of appearance and instead, sets of labeled/unlabeled examples are processed accordingly without replacement)
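
For illustration only, sampling without replacement, in which each source training example is selected at most once, can be sketched as follows. This is a minimal sketch: the function name and seed parameter are hypothetical, and uniform (unweighted) selection is an assumption chosen to mirror the unweighted processing described above, not a statement of how any cited reference is implemented.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_without_replacement(
    examples: Sequence[T],
    num_samples: int,
    seed: Optional[int] = None,
) -> List[T]:
    """Draw num_samples distinct positions uniformly; no example is repeated.

    Raises ValueError if num_samples exceeds the number of examples.
    """
    rng = random.Random(seed)
    return rng.sample(list(examples), num_samples)


if __name__ == "__main__":
    # Hypothetical source examples.
    print(sample_without_replacement(["a", "b", "c", "d"], 3, seed=0))
```

Unlike sampling with replacement, the number of samples here cannot exceed the size of the source dataset.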

Claim 15 recites substantially the same limitations as Claim 3, in the form of a system, therefore it is rejected under the same rationale.

Claim 16 recites substantially the same limitations as Claim 4, in the form of a system, therefore it is rejected under the same rationale.

Claim 17 recites substantially the same limitations as Claim 5, in the form of a system, therefore it is rejected under the same rationale.

Claim 18 recites substantially the same limitations as Claim 6, in the form of a system, therefore it is rejected under the same rationale.

Claim 19 recites substantially the same limitations as Claim 7, in the form of a system, therefore it is rejected under the same rationale.

Conclusion
8.	The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Weiss et al. (“A Survey of Transfer Learning”) disclosed current solutions and published research as related to transfer learning.
Kulis et al. (“What You Saw is Not What You Get: Domain Adaptation Using Asymmetric Kernel Transforms”) disclosed methods for knowledge transfer from source to target domains, using existing transfer learning paradigms.
Su et al. (“Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network”) disclosed a deep convolutional neural network which utilizes transfer learning with weakly labeled data to learn important visual patterns for natural image data.
Shao et al. (“Transfer Learning for Visual Categorization: A Survey”) disclosed using transfer learning to address cross-domain learning problems by extracting useful information from data in a related domain and transferring them for use in target tasks.
Liang et al. (“Transfer Learning for High Resolution Aerial Image Classification”) disclosed using transfer learning techniques for aerial image classification, in which a CNN is pre-trained on a larger dataset beforehand.
Tan et al. (“A Survey on Deep Transfer Learning”) disclosed using transfer learning to solve the problem of insufficient training data.
Ding et al. (“Task-Driven Deep Transfer Learning for Image Classification”) disclosed a task-driven deep transfer learning framework for image classification, in which more discriminative features are generated by using the classifier performance as a guide.
French et al. (“Self-ensembling for Visual Domain Adaptation”) disclosed the use of a student-teacher model for visual domain adaptation problems.
Ciresan et al. (“Transfer Learning for Latin and Chinese Characters with Deep Neural Networks”) disclosed transfer learning with deep neural networks on various character recognition tasks.
Krupat et al. (US PG-PUB 20180196432) disclosed image analysis performed for a two-sided data hub, which contains classification and feature layers.
Sawada et al. (US PG-PUB 20160224892) disclosed a transfer learning apparatus, with a transfer target data evaluator and an output layer adjuster.
Sawada et al. (US PG-PUB 20180025271) disclosed a learning apparatus with first and second neural networks that utilize transfer learning. 
Shapiro et al. (US Patent 11137462) disclosed source and target domains trained with transfer learning for tracking magnetically-labeled substances using magnetic resonance imaging (MRI).
Lecue et al. (US Patent 10474495) disclosed receiving source, target, and external data and a target task to generate features between source and target data.

9.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Devika S Maharaj whose telephone number is 571-272-0829. The examiner can normally be reached Monday - Thursday 7:30am - 4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/D.S.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123