DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 6/30/2022.
Applicant arguments/remarks filed 6/30/2022.
Claims 1 and 14 are amended.  Claims 7 and 15-17 are cancelled.  Claims 21-24 are new.
Claims 1-6, 8-14, and 18-24 are presented for examination.
Response to Arguments
Applicant presents arguments filed 6/30/2022.  Each is addressed.
Applicant argues that “the rejection has failed to properly establish through evidence that the relied-upon subject matter from Roz actually has a publication date that predates Applicant’s filing date. Specifically, while Applicant’s filing date is October 31, 2018, the top of page 801 of Roz indicates that Roz was published about half a year later in April of 2019.  Additionally, while the bottom of the left-hand column of page 801 indicates that a prior version was published on March 7, 2018, that version has not been used in the rejection and nothing has been placed into evidence demonstrating that the relied-upon subject matter from the April 2019 publication was also disclosed in the prior version dated March 7, 2018.” (Remarks, page 11, paragraph 1, line 2.)  However,  the IEEE identifies the Date of Publication for Journals and Standards as the very first instance of public dissemination, rather than by versions which may be formatted differently. See below: 

[Image: media_image1.png]

The IEEE publication clearly shows the Date of Publication is 7 Mar 2018. 

[Image: media_image2.png]

In addition, the USPTO uses publication dates for identifying when the content of a publication is accessible to the public rather than version dates.  See MPEP sections below:

[Image: media_image3.png]


[Image: media_image4.png]

Therefore, the rejection is proper and maintained.
Applicant argues that “the rejection of Claim 3 construes output layers as “final layers” rather than any set “of parallel layers other than the final layers” which Claim 3’s rejection alleges are themselves “layers other than output layers”.  But this is inconsistent with the claim construction of an “output layer” in the rejections of Claims 1 and 11 themselves since those rejections allege in terms of the first and second layers being output layers that “each layer output[ting] data which is used as input for the subsequent layer” is apparently an  “output layer” (see, e.g., page 4 of the Office Action).” (Remarks, page 12, paragraph 3, line 3.) However, Examiner is distinguishing between the output of a layer and an “output layer”.  
Claim 3 recites:
The apparatus of claim 1, wherein the third and fourth layers are layers other than output layers.
	It is known in the art that for multilayer neural networks, each layer performs calculations and produces an output which is the input of the subsequent layer.  However, this is distinguished from the phrase “output layer” which is typically regarded as the final layer. Since every layer produces output which is at least used as input to the subsequent layer, it is unclear what the claim is intending by distinguishing “output layer” if it isn’t indicating the final layer.  
	From Claim 1, the third and fourth layers are not part of the same neural network; they are intermediate (i.e., not final) parallel layers belonging to two separate networks, respectively.  (See Claim 1 “the third layer being an intermediate layer of the second neural network; select the third layer and a fourth layer, the fourth layer being an intermediate layer of the first neural network, the third and fourth layers being parallel intermediate layers;”) Therefore, Examiner is interpreting “output layer” as the final layer of the respective neural network. This means that the third and fourth layers are not the final layers of their respective neural networks, which is what is recited by the claims. Therefore, the rejection is proper and maintained.  See detailed rejection.
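For illustration only, the distinction between the output of a layer and an “output layer” can be sketched as follows. The network below is hypothetical (its sizes, weights, and function names are assumptions and are not taken from the claims or from Rozantsev). Every layer produces an output that feeds the next layer, but only the last layer is the final, or “output”, layer.

```python
def relu(values):
    # Element-wise rectified linear activation.
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    # One fully connected layer: a weighted sum per output unit plus a bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    # Pass the input through every layer and record each layer's output.
    outputs = []
    activation = inputs
    for weights, biases in layers:
        activation = relu(dense(activation, weights, biases))
        outputs.append(activation)
    return outputs

# Hypothetical three-layer network.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # intermediate layer
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),  # intermediate layer
    ([[0.5, 0.5]], [0.0]),                    # final ("output") layer
]

layer_outputs = forward([2.0, 1.0], layers)
intermediate_outputs = layer_outputs[:-1]  # outputs of layers other than the output layer
network_output = layer_outputs[-1]         # output of the output layer
```

Under this reading, each entry of `intermediate_outputs` is an output of a layer without that layer being an output layer.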
Applicant argues that “The rejection of Claim 12 points only to Roz’s Figure 1 and also references the rejection of Claim 10.  However, on its face Roz’s Figure 1 says nothing about adding together first and second adjustments of any kind.” (Remarks, page 14, paragraph 3, line 1.)  However, as described in the mapping of claim 12 “the loss function is determine adjustment of weights, after the loss functions are determined for both the source network and the target network, the adjustments are compared by the regularization step, in Fig. 1.” Figure 1 shows the loss that is calculated through backpropagation being sent to both streams.  The loss is what is used for adjusting the weights of each layer.  The first stream makes an adjustment, and the second “parallel” stream makes an adjustment to weights based on the “loss”.  This is the first adjustment. The regularization step is the second adjustment which is added to the first adjustment and is made to prevent the corresponding weights of the parallel layers from being too different from each other.  Claim 10 is referenced because it maps the “discrepancy functions” which are used for the regularization step. (As described in Figure 1 “loss functions that prevent corresponding weight from being too different from each other.”)  See detailed rejection.
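For illustration only, the addition of the two adjustments described above can be sketched as follows, assuming plain gradient descent and a simple squared-difference regularizer between the weights of parallel layers. The function name, learning rate, and numeric values are hypothetical and are not taken from Rozantsev.

```python
def combined_update(target_weights, source_weights, task_gradient,
                    learning_rate=0.25, reg_strength=1.0):
    # First adjustment: step against the gradient of the task loss
    # (the gradient that backpropagation would supply for this layer).
    first_adjustment = [-learning_rate * g for g in task_gradient]
    # Second adjustment: step against the gradient of the assumed regularizer
    # sum((t - s)^2), which is 2 * (t - s) for each target weight t.
    second_adjustment = [
        -learning_rate * reg_strength * 2.0 * (t - s)
        for t, s in zip(target_weights, source_weights)
    ]
    # The weights are adjusted by adding both adjustments together.
    new_weights = [
        w + d1 + d2
        for w, d1, d2 in zip(target_weights, first_adjustment, second_adjustment)
    ]
    return new_weights, first_adjustment, second_adjustment

new_weights, first_adj, second_adj = combined_update(
    target_weights=[1.0, 2.0],
    source_weights=[0.0, 2.0],
    task_gradient=[0.5, -0.5],
)
```

In this sketch the second adjustment is nonzero only for the weight that differs from its counterpart in the parallel layer, which is the effect the regularization step is described as having.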
Applicant argues “the rejection of Claim 15 itself does not even establish that Jamal (Deep Domain Adaptation in Action Space) has a publication date that predates Applicant’s filing date.  To reiterate, Applicant’s filing date is October 31, 2018.” (Remarks, page 16, paragraph 2, line 1.) However, Jamal was published and presented at the BMVC 2018 conference that was held in September 2018.  See below:

[Image: media_image5.png]
 and,

[Image: media_image6.png]

Applicant argues that “Dependent Claim 15 has now been embodied in independent Claim 14. As such, independent Claim 14 now recites the first domain includes real world video data and that the second domain includes computer game video data.” (Remarks, page 15, paragraph 2, line 1.)  And, the cited prior art “fails to explicitly indicate anything about video games, making it impossible for this portion to fairly approach what is recited now in Claim 14.” (Remarks, page 16, paragraph 4, line 1.) The prior Claim 15 (now embodied in Claim 14) recited: 
The apparatus of Claim 14, wherein
	the first domain comprises real world video data and the second domain comprises computer game video data.
	Examiner notes that nowhere in the specification is computer game video data technically distinguished from real world video data.  The first instance of the phrase in the Summary recites “A pair of training data domains may be established by, for instance, real world video and computer game video, first and second speaker voices (for voice recognition), standard font text and cursive script (for handwriting recognition), etc.” (Specification, page 2, line 2.) The remaining instances are similar. There is no description that explains how the two video domains differ qualitatively in a technical manner, such as in frame rate, interlacing, metadata, format, etc.
	In the absence of any distinguishing technical characteristics, Examiner is interpreting that there is no technical difference between real world video data and computer game video data other than the subject.  Jamal teaches this (Jamal, page 3, paragraph 6, line 1 “All the DA (domain adaptation -added by Examiner) techniques found in the literature address the image/object classification problem.  In fact, we could hardly find any work on the video-to-video domain adaptation problem.  There are a few studies [6, 19, 31] on cross-view action recognition and a few on heterogeneous domain adaptation [8,9].  In that sense, to the best of our knowledge, this paper is one of the first few papers for the video-video adaptation.”).  Therefore, the rejection is proper and maintained.
Applicant argues that “the proffered motivation to combine the references from the rejection of Claim 15 appears to have also been made in error. Specifically, at page 15 of the Office Action, the rejection cites “safety of the public to be able to understand what is happening in videos” as a reason for the proposed combination, citing Jamal’s page 1, paragraph 2, in support.  But this portion of Jamal says absolutely nothing about “safety” of anything for any purpose.” (Remarks, page 18, paragraph 2, line 1.)  However, the motivation paragraph recites (Jamal, page 1, paragraph 2, line 1 “Today, surveillance cameras are everywhere be it city streets, market place, building or airports.  These cameras operate 24x7, generating a massive amount of video data that needs to be processed for autonomous understanding of events and activities occurring in the scene.”)  Examiner acknowledges that Jamal does not explicitly recite the word “safety”. Examiner summarized the typical justification by governments for operating surveillance cameras ubiquitously, that is, in order to stop crime, terrorists, etc., i.e., for public safety.  Examiner does not agree that the summarization is incorrect but deletes the word “safety”. Nevertheless, it would be obvious to combine Jamal into Rozantsev because video cameras are everywhere and generate a massive amount of video data that needs to be processed.
Claim Rejections - 35 USC § 102
A person shall be entitled to a patent unless -
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-5, 8-13, 21, and 24 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Rozantsev et al (Beyond Sharing Weights for Deep Domain Adaptation, herein Rozantsev).
Regarding claim 1,
Rozantsev teaches an apparatus, comprising: at least one processor; configured with instructions executable by the at least one processor to: (Rozantsev, page 801, column 1, paragraph 1, line 7 “To this end, we introduce a two-stream architecture, where one operates in the source domain and the other in the target domain.” In other words, two-stream architecture is a machine learning method.  It is implicit that a machine learning method requires at least one processor configured with instructions executable by the at least one processor in order to execute.)
	access a first neural network, the first neural network being associated with a first data type; access a second neural network, the second neural network being associated with a second data type different from the first data type; (Rozantsev, Fig. 1, and page 803, column 2, paragraph 2, line 5 “To implement this idea, we therefore introduce a two-stream architecture, such as the one depicted by Fig. 1.  The first stream operates on the source data, the second on the target one, and they are trained jointly.”  

[Image: media_image7.png]

In other words, the first stream is the first neural network, the source data is the first data type, the second stream is the second neural network, the target data is the second data type, and (from Fig. 1) the target data (real images) being different from the source data (synthetic images) is the second data type different from the first data type.)
provide, as input, first training data to the first neural network; provide, as input, second training data to the second neural network, the first training data being different from the second training data; (Rozantsev, page 803, column 2, paragraph 2, line 7 “The first stream operates on the source data, the second on the target one, and they are trained jointly.” In other words, source data (synthetic images) is first training data, first stream is first neural network, the target data (real images) is the second training data which is different from the first, the second stream is the second neural network, and the two streams of data are provided as input to the two neural networks, separately.)
	identify a first output from a first layer, the first layer being an output layer of the first neural network, the first output being based on the first training data; identify a second output from a second layer, the second layer being an output layer of the second neural network, the second output being based on the second training data; (Rozantsev, Fig.1,  “Our two-stream architecture.  One stream operates on the source data and the other on the target one.  Their weights are not shared.  Instead, we introduce loss functions that prevent corresponding weights from being too different from each other.” Examiner notes each layer outputs data which is used as input for the subsequent layer in the respective stream.  In other words, Fig. 1 shows a first output from a first layer of the first neural network based on the first training data, and a second output from a second layer of the second neural network based on the second training data.)
	based on the first and second outputs, determine a first adjustment to one or more weights of a third layer, the third layer being an intermediate layer of the second neural network; select the third layer and a fourth layer, the fourth layer being an intermediate layer of the first neural network, the third and fourth layers being parallel intermediate layers; (Rozantsev, Fig. 1, and, page 805, column 2, paragraph 2, line  1 “To learn the model parameters, we first pre-train the source stream using the source data only.  We then simultaneously optimize the weights of both streams according to the loss of Eqs. (2), (3), (4), (5) using both source and target data, with the target stream weights initialized from the pre-trained source weights.” In other words, pre-train the source stream using the source data only, then simultaneously optimize weights of both streams is based on the first and second outputs, optimize the weights is first adjustment to one or more weights of the third layer, and, from Fig. 1, the layers are intermediate layers, and the third and fourth layers are parallel intermediate layers.)
	compare a third output from the third layer to a fourth output from the fourth layer, the third and fourth outputs being respective outputs of the respective third and fourth layers prior to the third and fourth outputs being respectively provided to subsequent respective layers of the respective neural networks, the third and fourth outputs being respectively based on the second and first training data; (Rozantsev, Fig. 1, see prior mapping. And, page 804, column 1, paragraph 2, line 1 “While our goal is to go beyond sharing the layer weights, we still believe that corresponding weights in the two streams should be related.  This models the fact that the source and target domains are related, and prevents overfitting in the target stream, when only very few labeled samples are available.  Our weight regularizer rw(.)  therefore represents the distance between the source and target weights in a particular layer. In principle, we could take it to directly act on the difference of those weights, and thus write

[Image: media_image8.png]
This, however, would not truly attempt to model the domain shift, for instance to account for different means and ranges of values in the two types of data. To better model the shift and introduce more flexibility in our model, we therefore propose not to penalize linear transformations between the source and target weights. We then write our regularizer either by relying on the L2 norm as   

[Image: media_image9.png]
or in an exponential form as
  
[Image: media_image10.png]
In both cases, aj and bj are scalar parameters that are different for each layer j ϵ Ω and learned at training time along with all other network parameters.” In other words, the weight regularizer compares the weights of the third and fourth layers which are parallel to each other and belong to the second and first neural networks, respectively, and their outputs are based respectively on the second and first training data.)
	based on the comparison, determine a second adjustment to the one or more weights of the third layer; and adjust the one or more weights of the third layer based on consideration of both the first adjustment and the second adjustment. (Rozantsev, Fig. 1, see prior mapping. In other words, the first adjustment is done through normal backpropagation through the respective neural network to the third layer and the fourth layer.  Then, the regularization step determines a second adjustment to the one or more weights of the third layer and the fourth layer based on a comparison of the weights at the parallel levels to ensure the corresponding weights are not too different from each other. This is adjusting the one or more weights of the third layer based on consideration of both the first adjustment and the second adjustment.)
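For illustration only, the weight regularizer quoted in the mapping above can be sketched as follows. The equation images are not reproduced in this text, so the exact forms below (a squared L2 norm of a*source + b - target, and an exponential of that quantity minus one) are assumptions drawn from the quoted description. The function names and values are hypothetical, and this is not Rozantsev's implementation.

```python
import math

def l2_regularizer(source_weights, target_weights, a, b):
    # Assumed form: squared L2 distance between the linearly transformed
    # source weights (a * s + b) and the target weights of the parallel layer.
    return sum((a * s + b - t) ** 2
               for s, t in zip(source_weights, target_weights))

def exp_regularizer(source_weights, target_weights, a, b):
    # Assumed exponential form: zero when the transformed weights match exactly.
    return math.exp(l2_regularizer(source_weights, target_weights, a, b)) - 1.0

source_layer = [1.0, 2.0, 3.0]
target_layer = [2.0, 4.0, 6.0]  # exactly twice the source weights

direct_penalty = l2_regularizer(source_layer, target_layer, a=1.0, b=0.0)
scaled_penalty = l2_regularizer(source_layer, target_layer, a=2.0, b=0.0)
scaled_exp_penalty = exp_regularizer(source_layer, target_layer, a=2.0, b=0.0)
```

With a = 1 and b = 0 the penalty acts directly on the weight difference. With a = 2 the target weights are an exact linear transformation of the source weights, so no penalty is incurred, consistent with the quoted statement that linear transformations between the source and target weights are not penalized.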
Regarding claim 2,
	Rozantsev teaches the apparatus of claim 1, wherein 
	the second neural network is established by a copy of the first neural network prior to the second training data being provided to the second neural network. (Rozantsev, Fig. 1, and, page 805, column 2, paragraph 2, line  1 “To learn the model parameters, we first pre-train the source stream using the source data only.  We then simultaneously optimize the weights of both streams according to the loss of Eqs. (2), (3), (4), and (5) using both source and target data, with the target stream weights initialized from the pre-trained source weights.” The two-stream architecture starts with identical neural networks (See Fig. 1). Then the source and target stream weights are initialized to the pre-trained source weights. This is the second neural network is established by a copy of the first neural network prior to the training data being provided to the second neural network.)
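For illustration only, initializing the target stream from the pre-trained source weights can be sketched as follows. The layer names and weight values are hypothetical; the point is only that the second network starts as a copy of the first before any target data is applied, and that later changes to the copy do not alter the original.

```python
import copy

# Hypothetical pre-trained source stream weights, keyed by layer name.
source_stream = {"conv1": [0.2, -0.1], "fc": [0.7]}

# The target stream is established as an independent copy of the source stream.
target_stream = copy.deepcopy(source_stream)
initially_identical = target_stream == source_stream

# A later update to the target stream leaves the source stream unchanged.
target_stream["fc"][0] += 0.05
```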
Regarding claim 3,
	Rozantsev teaches the apparatus of Claim 1, wherein
	the third and fourth layers are layers other than output layers. (Rozantsev, Fig. 1. Examiner notes that claim 1 explicitly states that the third and fourth layers are intermediate parallel layers.  Also, Examiner is interpreting “output layer” as final layer.  (See paragraph 5. b. above.) In other words, from Fig. 1, any of the sets of parallel layers other than the final layers are layers other than output layers.)
Regarding claim 4,
	Rozantsev teaches the apparatus of Claim 3, wherein 
	the third and fourth layers are intermediate hidden layers of the respective neural networks. (Rozantsev, Fig. 1. The parallel convolutional layers are intermediate hidden layers of the respective neural networks.)
Regarding claim 5,
	Rozantsev teaches the apparatus of Claim 1, wherein 
	the first training data is related to the second training data.  (Rozantsev, Fig. 1. The source training set consists of synthetic images, and the target training set consists of real images. Both are sets of images, which is the first training data is related to the second training data.)
Regarding claim 8,
	Rozantsev teaches the apparatus of Claim 1, wherein 
	the instructions are executable by the at least one processor to: compare the third output to the fourth output to determine the similarity of the third output to the fourth output, the similarity evaluated using a first function. (Rozantsev, page 804, column 2, paragraph 4, line 1 “Maximum Mean Discrepancy. As the name suggests, given two sets of data, the MMD measures the distance between the mean of the two sets after mapping each sample to a Reproducing Kernel Hilbert Space (RKHS). In our context, let 
[Image: media_image11.png] be the feature representation at the last layer of the source stream, and [Image: media_image12.png] of the target stream.” In other words, [Image: media_image12.png] is a first function used to compare the similarity of the third output to the fourth output.)
Regarding claim 9,
	Rozantsev teaches the apparatus of Claim 8, wherein 
	the determination of the first adjustment to the one or more weights of the third layer is based on a second function different from the first function.  (Rozantsev, See mapping for claim 8.  In other words,  
[Image: media_image11.png] is a second function different from the first function.)
Regarding claim 10,
	Rozantsev teaches the apparatus of Claim 9, wherein 
	the first and second functions are discrepancy functions.  (Rozantsev, see mapping of claims 8 and 9, and page 804, paragraph 4, line 6 “The squared MMD between the source and target domains can be expressed as   
[Image: media_image13.png] where ф(.) denotes the mapping to RKHS.”  In other words, [Image: media_image12.png] and [Image: media_image11.png] are the first and second discrepancy functions for the target stream and the source stream, respectively, that are used to calculate the Maximum Mean Discrepancy (MMD).)
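For illustration only, a squared Maximum Mean Discrepancy between two sets of feature vectors can be sketched as follows. The sketch assumes the identity mapping in place of the mapping to RKHS, which reduces the quantity to the squared distance between the two feature means. The function names and sample values are hypothetical and are not taken from Rozantsev.

```python
def mean_vector(samples):
    # Component-wise mean of a list of equal-length feature vectors.
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]

def squared_mmd_identity(source_features, target_features):
    # Squared distance between the mean source and mean target features,
    # assuming the identity feature map (an assumption of this sketch).
    mean_source = mean_vector(source_features)
    mean_target = mean_vector(target_features)
    return sum((s - t) ** 2 for s, t in zip(mean_source, mean_target))

source_features = [[0.0, 0.0], [2.0, 2.0]]  # hypothetical last-layer features, source stream
target_features = [[1.0, 3.0], [3.0, 5.0]]  # hypothetical last-layer features, target stream

mmd_between_domains = squared_mmd_identity(source_features, target_features)
mmd_same_domain = squared_mmd_identity(source_features, source_features)
```

The value is zero when both sets have the same mean and grows as the two feature distributions drift apart, which is the sense in which it measures a discrepancy between the source and target data.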
Claim 11 is a method claim corresponding to apparatus claim 1.  Otherwise, they are the same.  It is implicit that a computer implemented method requires at least one processor and at least one computer storage that is not a transitory signal in order to execute.  Therefore, claim 11 is rejected for the same reasons as claim 1.
Regarding claim 12,
	Rozantsev teaches the method of Claim 11, wherein 
	the one or more weights of the third layer are adjusted by adding together the first adjustment and the second adjustment, the first and second adjustments both pertaining to weight changes.  (Rozantsev, Fig. 1, and, see mapping of claim 10.  In other words, the loss function determines the adjustment of weights; after the loss functions are determined for both the source network and the target network, the adjustments are compared by the regularization step in Fig. 1. This is comparing the third output from the third layer to the fourth output from the fourth layer; the loss function is the first adjustment, and the regularization step is the second adjustment, both pertaining to weight changes.)
Regarding claim 13,
	Rozantsev teaches the method of Claim 11, comprising: 
	determining the first adjustment to one or more weights of the third layer using a first loss function; and comparing the third output to the fourth output using a second loss function different from the first loss function to determine the second adjustment.  (Rozantsev, Fig. 1, see prior mapping.  In other words, the loss function determines the adjustment to one or more weights of the third layer using a loss function; after the loss functions are determined for both the source network and the target network, the adjustments are compared by the regularization step in Fig. 1. This is comparing the third output from the third layer to the fourth output from the fourth layer using a second loss function.)
Regarding claim 21, 
Rozantsev teaches the apparatus of Claim 1,
	wherein the one or more weights of the third layer are adjusted by adding together the first adjustment and the second adjustment, the first and second adjustment both pertaining to weight changes (Rozantsev, Fig. 1, See mapping of claim 13.  In other words, the loss function determines the adjustment to one or more weights of the third layer, the regularization step is the second adjustment, and the regularization step applied to the weights of the third layer is adding together the first adjustment and the second adjustment, both pertaining to weight changes.).
Claim 24 is a method claim corresponding to apparatus claim 3.  Otherwise, they are the same. Therefore, claim 24 is rejected for the same reasons as claim 3.
	
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6, 14, 18-20, and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Rozantsev and Jamal et al. (Deep Domain Adaptation in Action Space, herein Jamal).
Regarding claim 6,
	Rozantsev teaches the apparatus of Claim 5.
	However, Rozantsev does not explicitly teach the first and second neural networks pertain to action recognition, and wherein the first training data is related to the second training data in that the first and second training data both pertain to a same action. 
Jamal teaches the first and second neural networks pertain to action recognition, and wherein the first training data is related to the second training data in that the first and second training data both pertain to a same action. (Jamal, page 1, paragraph 1, “In this paper, we investigate the problem of Domain Shift in action videos, an area that has remained under-explored, and propose two new approaches named Action Modeling on Latent Subspace (AMLS) and Deep Adversarial Action Adaptation (DAAA). In the AMLS approach, the action videos in the target domain are modeled as a sequence of points on a latent subspace and adaptive kernels are successively learned between the source domain point and the sequence of target domain points on the manifold… The action adaptation experiments were conducted using various combinations of multi-domain action datasets, including six common classes of OLYMPIC Sports and UCF50 datasets and all classes of KTH, MSR and our own SonyCam datasets.” In other words, action adaptation is pertaining to action recognition, and the source domain and target domain under domain shift are the first and second training data both pertaining to the same action.) 
Both Rozantsev and Jamal are directed to deep domain adaptation, among other things.  Rozantsev teaches domain adaptation for images but does not specifically teach domain adaptation for detecting action in videos or video to video domain adaptation.  Jamal teaches domain adaptation for detecting action in videos and video to video domain adaptation.  In view of the teaching of Rozantsev it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Jamal into Rozantsev.  This would result in being able to perform transfer learning between two domains of videos for action detection.
	One of ordinary skill in the art would be motivated to do this because surveillance cameras are everywhere. The massive amount of data the cameras generate makes it difficult to process and interpret all of the data. (Jamal, page 1, paragraph 2, line 1 “Today, surveillance cameras are everywhere, be it city streets, market place, buildings or airports.  These cameras operate 24x7, generating a massive amount of video data that needs to be processed for autonomous understanding of events and activities occurring in the scene.”)
Regarding claim 14,
	Rozantsev teaches an apparatus, comprising: 
	at least one computer storage that is not a transitory signal and that comprises instructions executable by at least one processor to: access a first domain, the first domain being associated with a first domain genre; access a second domain, the second domain being associated with a second domain genre different from the first domain genre; using training data provided to the first and second domains, classify a target data set; and output a classification of the target data set;  (Rozantsev, Fig. 1, and page 1, paragraph 1, line 1 “The performance of a classifier trained on data coming from a specific domain typically degrades when applied to a related but different one. While annotating many samples from the new domain would address this issue, it is often too expensive or impractical.  Domain Adaptation has therefore emerged as a solution to this problem;” Examiner notes, computer storage, executable instructions, and at least one processor have been previously mapped. See mapping of claim 1. In other words, the first domain genre is synthesized images, the second domain genre is real images, synthesized images are different from real images is the second domain being associated with a second domain genre different from the first domain genre, and the two-stream architecture is a domain adaptation module (from mapping of claim 1) which is a classifier that outputs a classification of the target data set.)
	wherein the first domain comprises real world video data and the second domain comprises computer game video data (Jamal, page 3, paragraph 6, line 1 “All the DA techniques found in the literature address the image/object classification problem.  In fact, we could hardly find any work on the video-to-video domain adaptation problem.  There are a few studies [6, 19, 31] on cross-view action recognition and a few on heterogeneous domain adaptation [8,9].  In that sense, to the best of our knowledge, this paper is one of the first few papers for the video-video domain adaptation.” And, page 1, paragraph 1, line 14 “The action adaptation experiments were conducted using various combinations of multi-domain action datasets, including six common classes of Olympic Sports and UCF50 datasets and all classes of KTH, MSR and our own SonyCam datasets.” Examiner is interpreting that there is no technical difference between real world video data and computer game video data other than the subject (See paragraph 5. e.). Examiner notes that the MSR datasets include skeleton data in screen coordinates (MSRAction3DSkeleton (20joints).rar), (e.g. https://wangjiangb.github.io/my_data.html, page is also included in 892) which is virtual and therefore equivalent to a computer game video data set. In other words, video-video is the first domain comprises real world video data and the second domain comprises computer game video data.).
Regarding claim 18,
	Rozantsev teaches the apparatus of Claim 14, wherein 
	the target data set is classified at least in part based on execution of a domain adaptation module established at least in part by a loss function.  (Rozantsev, Fig.1, and page 801, column 2, paragraph 3, line 2 “To this end, we introduce the two-stream architecture depicted by Fig. 1.” And, page 801, column 2, paragraph 3, line 6 “To nonetheless encode the fact that both streams tackle the same recognition problem, albeit in different domains, we introduce a loss function that relates the corresponding weights in both layers.” In other words, from Fig. 1, the target data set is classified, the two-stream architecture is a domain adaptation module, and loss function that relates the corresponding weights is at least in part by a loss function.)
Regarding claim 19,
	Rozantsev teaches the apparatus of Claim 14, wherein 
	the target data set is classified by a domain adaptation module receiving input from multiple output points from the first and second domains of training data.  (Rozantsev, Fig. 3, and, page 805, column 2, paragraph 5, line 5 “We then demonstrate that it generalizes well to other classification problems by testing it on the Office, MNIST+USPS  and MNIST+SVHN datasets.”

    [Image: media_image14.png]

In other words, the two-stream architecture is a domain adaptation module, and testing it on the Office, MNIST+USPS and MNIST+SVHN datasets is the target data set being classified on input from multiple output points from the first and second domains of the training data. See Fig. 3 for a depiction of two domain genres (synthetic and real) of training data and the test dataset for classification of the UAV dataset.)
Regarding claim 20,
	Rozantsev teaches the apparatus of Claim 19, wherein 
	the domain adaptation module uses a discrepancy function to calculate a distance of overall data distribution between source and target data.  (Rozantsev, See mapping of claim 10. In other words, the two-stream architecture is the domain adaptation module and Maximum Mean Discrepancy (MMD) is the discrepancy function that calculates a distance of the overall data distribution between source and target data.)
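For illustration only, the following is a minimal sketch of a Maximum Mean Discrepancy computation between a source sample and a target sample. It is not taken from Rozantsev: the choice of a Gaussian (RBF) kernel, the biased estimator, the parameter `gamma`, and the example data are assumptions made to show how such a discrepancy function yields a distance between two data distributions.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Hypothetical feature vectors: one target sample close to the source, one far away.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
near_target = [[0.05, 0.0], [0.0, 0.1], [-0.1, -0.05]]
far_target = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

near_distance = mmd_squared(source, near_target)
far_distance = mmd_squared(source, far_target)
```

In this sketch the value is zero when the two samples are identical and increases as the target sample moves away from the source sample, which is the sense in which the function measures a distance between distributions.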
Regarding claim 22,
	Rozantsev teaches the apparatus of Claim 1,
	wherein the first training data comprises real world video data and the second training data comprises computer game video data (See mapping of claim 14.). 
Claim 23 is a method claim corresponding to apparatus claim 22.  Otherwise, they are the same.  Therefore, claim 23 is rejected for the same reasons as claim 22.	
Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/B.I.R./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124