DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1, 4-5, and 7-14 are presented for examination.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Response to Amendment
	Applicant’s amendment has obviated most, but not all, of the outstanding objections to the claims.  To the extent that an objection or rejection appears in the previous Office Action(s) but not this Office Action, that objection or rejection is withdrawn.  To the extent that it appears both in the previous Office Action(s) and this Office Action, the objection or rejection is maintained.

Claim Objections
Examiner objects to claims 1, 4-5, and 7-14.
Claims 1 and 10-11 are objected to because of the following informalities: “the plurality of intermediate layers included in the second model” should be “a plurality of intermediate layers included in the second model” since only intermediate layers included in the first model were previously recited.  Appropriate correction is required.
All claims dependent on a claim objected to hereunder are also objected to for being dependent on an objected-to base claim.

Claim Rejections - 35 USC § 103
Claims 1, 4, and 7-14 are rejected under 35 U.S.C. 103 as being obvious over Bousmalis et al., “Domain Separation Networks,” in 29 Advances in Neural Info. Processing Sys. 343-51 (2016) (“Bousmalis”) in view of Misra et al. (US 20190017374) (“Misra”) and further in view of Ducau, Adversarial Autoencoders (with Pytorch), https://blog.paperspace.com/adversarial-autoencoders-with-pytorch/ (2017) (“Ducau”).1
Regarding claim 1, Bousmalis discloses “[a] training apparatus comprising: 
a processor (all models were trained with TensorFlow and trained with stochastic gradient descent plus momentum [implying the existence of a processor on which to run TensorFlow] – Bousmalis, first paragraph of sec. 4.2) programmed to: 
acquire a first model (Bousmalis Fig. 1 discloses an architecture of a domain separation network [first model]) including: 
an input layer to which input information is input (Bousmalis Fig. 1 shows that source domain information $x^s$ and target domain information $x^t$ are input to the private and shared encoders); 
a plurality of intermediate layers that executes a calculation based on a feature of the input information that has been input (Bousmalis Fig. 1 shows that each of the three encoders contains at least two intermediate layers; last paragraph before sec. 3.1 discloses that the input information is mapped to hidden representations [mapping = calculation] representing features that are private to each domain or shared across domains); and 
an output layer that outputs output information that corresponds to output of the intermediate layer (Bousmalis Fig. 1 shows a shared decoder that outputs reconstructed source and target information $\hat{x}^s$ and $\hat{x}^t$, respectively, and a classifier that outputs a classification $y^s$ corresponding to the source domain); and 
train the first model such that:
when predetermined input information is input to the first model, the first model outputs predetermined output information2 that corresponds to the predetermined input information (Bousmalis Fig. 1 shows that source domain information $x^s$ and target domain information $x^t$ [input information] are input to the private and shared encoders; Bousmalis Fig. 1 shows a shared decoder that outputs reconstructed source and target information $\hat{x}^s$ and $\hat{x}^t$, respectively, and a classifier that outputs a classification $y^s$ corresponding to the source domain [all three being output information corresponding to the input information]; first paragraph of sec. 3 notes that DSNs are trained on data from the source domain and generalize to the target domain) and intermediate information output from a predetermined intermediate layer among the intermediate layers becomes closer to feature information that corresponds to a feature of correspondence information that corresponds to the predetermined input information (Bousmalis Fig. 1 and accompanying text show that one output $h_c^s$ of the shared encoder [predetermined intermediate layer, “intermediate” because it is input to the shared decoder] is kept similar [closer] to the other output $h_c^t$ [correspondence information, corresponding to input information $x^t$] with a similarity loss $\mathcal{L}_{\text{similarity}}$; last paragraph before sec. 3.1 indicates that the hidden representation $h_c$ represents features that are common or shared across domains [so that $h_c$ contains feature information]);
when input information related to a first domain is input as the predetermined input information to the first model, information indicating classification of the input information is output as the output information (Bousmalis Fig. 1 shows that input information $x^s$ for the source domain [first domain] is input to the private source encoder and the shared encoder and that the classifier, after receiving the hidden representation of the common source features, outputs a classification $\hat{y}^s$ corresponding to the source domain [first information]) and the intermediate information becomes closer to feature information that takes account of correspondence information related to a second domain different from the first domain (Bousmalis Fig. 1 shows that hidden representation $h_c^t$ [intermediate information] from the target domain [second domain, different from the first domain] is kept similar [closer] to hidden representation $h_c^s$ [correspondence information] from the source domain with a similarity loss $\mathcal{L}_{\text{similarity}}$; last paragraph before sec. 3.1 discloses that the hidden representations are composed of features [so the full vector/correspondence information takes account of the features contained therein]);
when first information and second information associated with the first information are input as the predetermined input information to the first model, a classification result of the second information is output as the output information (Bousmalis Fig. 1 shows that information $x^s$ is input to the private source encoder and the shared encoder; last paragraph before sec. 3.1 discloses that $x^s$ comprises data samples $\{(x_i^s, y_i^s)\}$ [so that the sample corresponding to i = 1 could be the first information and the sample corresponding to i = 2 could be the second information, associated with the first information insofar as both are part of the same training set]; Fig. 1 further shows that the classifier outputs a prediction $\hat{y}^s$ associated with the source domain [including the data sample corresponding to the second information]) and the intermediate information becomes closer to feature information that corresponds to a feature of the second information and that takes account of a feature of the third information associated with the first information (Bousmalis Fig. 1 shows that the hidden representation $h_c^t$ [intermediate information] is kept similar to a hidden representation $h_c^s$ [feature information] with a similarity loss $\mathcal{L}_{\text{similarity}}$ [$h_c^s$, per the last paragraph before sec. 3.1, contains shared features from the source domain and therefore corresponds to “feature information” such that each sample, including the sample corresponding to the “second information,” contains features]; last paragraph before sec. 3.1 discloses that the dataset $x^t$ [third information] includes an unlabeled dataset of samples from the target domain and is mapped to, inter alia, the hidden representation $h_c^s$ [so, per the discussion in the last paragraph before sec. 3.1, the system takes account of the features of the target dataset/third information as well]), 
the intermediate information is a vector output from a hidden layer of a classifier of the first model at a stage prior to an output layer of the classifier into a term for training the first model (Bousmalis p. 4, paragraph before equation (5) indicates that $h_c^s$ and $h_c^t$ [intermediate information] are rows of matrices of hidden shared representations from samples of source and target data respectively [i.e., they are vectors]; Fig. 1 shows that the layers of the shared encoder that output $h_c^s$ and $h_c^t$ are layers prior to the classifier output $\hat{y}^s$; p. 5, equation (7) and Fig. 1 show that $h_c^s$ and $h_c^t$ are used as inputs to determine a similarity loss $\mathcal{L}_{\text{similarity}}$ [term]) …,
wherein closeness of information is determined by vector distance (first paragraph of p. 5 of Bousmalis and the equation contained therein disclose that the maximum mean discrepancy similarity loss [measure of closeness] is a kernel-based distance function between pairs of source samples and target samples h [vectors]).”  
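By way of illustration only, and not as a characterization of the exact formulation in Bousmalis, a kernel-based distance of the kind referred to above can be sketched as a biased estimate of the squared maximum mean discrepancy between two sets of hidden vectors. The function names, the use of a single RBF kernel, and the bandwidth parameter `sigma` are assumptions made for this sketch.

```python
import math


def rbf_kernel(u, v, sigma=1.0):
    """RBF kernel between two equal-length vectors (sigma is an assumed bandwidth)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two lists of vectors.

    source plays the role of the shared source representations and
    target the role of the shared target representations.
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            - 2.0 * mean_kernel(source, target)
            + mean_kernel(target, target))


# Hypothetical hidden vectors, chosen only to exercise the function.
h_source = [[0.0, 0.0], [0.0, 1.0]]
h_target = [[5.0, 5.0], [5.0, 6.0]]
print(mmd_squared(h_source, h_source))  # identical sets give zero
print(mmd_squared(h_source, h_target))  # well-separated sets give a larger value
```

A smaller value indicates that the two sets of vectors are closer in the kernel-induced sense, which is how a loss of this type can serve as a measure of closeness.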
Bousmalis appears not to disclose explicitly the further limitations of the claim.  However, Misra discloses that “[the] processor [is] programmed to: …
acquire a second model that has learnt a feature of  third information (Misra Fig. 3 and paragraph 93 show that deep neural network model 310 [second model] encodes a latent space based on the dominant features [feature of third information] of a historical spectral response and produces a reproduced spectral response therefrom; paragraph 99 shows that the comparer 350 compares the reproduced spectral response from the decoder to the historical spectral response at the encoder to determine a similarity between the two, and when the similarity is at or above an initial threshold, the comparer instructs the decoder to freeze [third information = historical spectral responses during training; second information = historical spectral responses during testing]); … [and]
train the first model such that … the intermediate information … becomes closer to feature information generated from the second information by the second model (comparer 390 compares trained spectral response from the decoder 380 to the historical spectral response at the encoder 320 to determine a similarity between the trained spectral response [based on the dominant features/intermediate information] and the historical spectral response [second information]; weights and biases of the neural network 370 of neural network system 360 [first model] are reassigned until the similarity is at or above a final threshold [i.e., the network is trained such that the intermediate information becomes closer to the second information] – paragraph 103; encoder receives a historical spectral response from the database, determines dominant features of the response, encodes a latent space based on the dominant features, and provides the latent space to a latent space database – id. at paragraph 93; the decoder 340 receives the latent space from the latent space database and decodes a reproduced spectral response [feature information, to which the trained spectral response is ultimately compared] – id. at paragraph 98; see also Fig. 3)….”  
Misra and the instant application both relate to autoencoders and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Bousmalis to acquire a second model that has learned a feature of information and use the second model to generate other information that is approximated by the first model, as disclosed by Misra, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to determine the most important features of the information provided to it, thereby aiding it in providing a more efficient classification.  See Misra, paragraph 93 (latent variables encoded by the deep neural network model 310 correspond to the dominant features of the historical spectral response).
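For illustration only, the proposed combination (a first model trained both to classify and to bring its intermediate vector closer to a feature vector produced by a second model) can be sketched as a loss with two terms. This is a hypothetical sketch under stated assumptions: the function names, the squared Euclidean distance, and the weighting factor `weight` are not taken from Bousmalis, Misra, or the instant application.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def cross_entropy(class_probs, label):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(class_probs[label])


def combined_loss(class_probs, label, intermediate, second_model_feature, weight=0.1):
    """Classification term plus a vector-distance term (weight is assumed).

    intermediate: vector from a hidden layer of the first model.
    second_model_feature: vector produced by the (frozen) second model.
    """
    return (cross_entropy(class_probs, label)
            + weight * squared_distance(intermediate, second_model_feature))


# Hypothetical values: the distance term vanishes when the vectors coincide.
print(combined_loss([0.5, 0.5], 0, [1.0, 2.0], [1.0, 2.0]))
print(combined_loss([0.5, 0.5], 0, [0.0, 0.0], [3.0, 4.0]))
```

Minimizing such a loss would drive the intermediate vector toward the second model's feature vector while preserving the classification objective.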
Neither Bousmalis nor Misra appears to disclose explicitly the further limitations of the claim.  However, Ducau discloses that “the feature information [is] a vector output from a predetermined hidden intermediate layer among the plurality of intermediate layers included in the second model and input into [a] term (Figure 1 of Ducau and the paragraph following it show that in an adversarial autoencoder, a sample z [feature information] drawn according to a generator network is taken from an intermediate layer of the generator and used as input to a discriminator that receives the sample z and z’ sampled from the true prior p(z) and assigns a probability to each of coming from p(z) [i.e., z is input as a term of a function that outputs that probability]), the second model being an autoencoder and the predetermined intermediate layer being an intermediate layer prior to decoding (Ducau Fig. 1 and paragraph immediately preceding it show that the sample z is drawn according to the encoder of an autoencoder [second model] and that z is taken from a layer prior to the decoder that generates the reconstructed input x’)….”
Ducau and the instant application both relate to autoencoders that feed their hidden layer output into another network and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Bousmalis and Misra to use an autoencoder to output the result of a layer prior to decoding to another network, as disclosed by Ducau, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a more compact representation of the input data for passing to the other network, thereby saving processing power and computational time.  See Ducau, first paragraph under section entitled “Denoising Autoencoders (dAE).”
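For illustration only, the arrangement described above (a vector taken from an autoencoder layer prior to decoding and passed to another network) can be sketched as follows. The class `TinyAutoencoder`, the function `discriminator`, the layer sizes, and the random untrained weights are all hypothetical and are not drawn from Ducau's code.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class TinyAutoencoder:
    """Untrained single-layer encoder and decoder with random weights."""

    def __init__(self, in_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.enc_w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                      for _ in range(latent_dim)]
        self.enc_b = [0.0] * latent_dim
        self.dec_w = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)]
                      for _ in range(in_dim)]
        self.dec_b = [0.0] * in_dim

    def encode(self, x):
        # Latent vector z: the output of the layer prior to decoding.
        return [math.tanh(v) for v in linear(self.enc_w, self.enc_b, x)]

    def decode(self, z):
        # Reconstruction x' generated from z.
        return linear(self.dec_w, self.dec_b, z)


def discriminator(z, weights, bias=0.0):
    """Hypothetical second network: maps z to a probability in (0, 1)."""
    logit = sum(w * zi for w, zi in zip(weights, z)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


ae = TinyAutoencoder(in_dim=4, latent_dim=2)
x = [0.2, -0.1, 0.7, 0.4]
z = ae.encode(x)          # compact vector taken before the decoder
x_rec = ae.decode(z)      # reconstruction with the input's dimensionality
p = discriminator(z, weights=[0.3, -0.8])
print(len(x), len(z), len(x_rec))
```

The latent vector here has two components against four for the input, which is the sense in which the representation passed to the other network is more compact.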

	Claim 10 is a method claim corresponding to apparatus claim 1 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 11 is a non-transitory computer-readable storage medium claim corresponding to apparatus claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 4, Bousmalis, as modified by Misra and Ducau, discloses that “the processor is programmed to train the first model by using the first information and the second information belonging to the first domain and the third information belonging to the second domain different from the first domain (Bousmalis sec. 3, first paragraph, indicates that, given a labeled dataset in a source domain and an unlabeled dataset in a target domain [third information, associated with the dataset in the source domain], a DSN may be trained on data from the source domain that generalizes to the target domain).”

Regarding claim 7, Bousmalis, as modified by Misra and Ducau, discloses that “the processor is programmed to acquire, as the second model, a model that [has] previously learn[ed] a feature of the third information that is of the same type as the second information (Misra paragraphs 93 and 99 and Fig. 3 disclose that deep neural network model 310 [second model] encodes a latent space based on the dominant features [feature of third information] of a historical spectral response and produces a reproduced spectral response therefrom; the comparer 350 of deep neural network model 310 compares the reproduced spectral response from the decoder to the historical spectral response at the encoder to determine a similarity between the two, and when the similarity is at or above an initial threshold, the comparer instructs the decoder to freeze [so the model previously learned the dominant features of the historical spectral responses of the training set; note that the historical spectral response during training/third information is of the same type as the historical spectral response during testing/second information]).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Bousmalis such that the second model has learned a feature of information that is of the same type as another set of information, as disclosed by Misra, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to determine the most important features of the information provided to it, thereby aiding it in providing a more efficient classification.  See Misra, paragraph 93 (latent variables encoded by the deep neural network model 310 correspond to the dominant features of the historical spectral response).

Regarding claim 8, Bousmalis, as modified by Misra and Ducau, discloses that “the processor is programmed to train the first model such that, when the first information and the second information are input to the first model, information indicating classification of the second information is output as the output information (Misra Fig. 3 shows that historical measurements [first information] and the reproduced spectral response [derived from the historical spectral response/second information] are input to neural network system 360 [first model] and that the decoder of neural network system 360 outputs a trained spectral response [classification/output information]; paragraph 103 states that the comparer 390 compares the trained spectral response to the historical spectral response [so that the trained spectral response is a classification corresponding to the historical spectral response/second information]) and the intermediate information becomes closer to feature information generated by the second model when the second information is input to the second model (Misra paragraph 103 discloses that the comparer 390 compares the trained spectral response [based on the dominant features/intermediate information, see Fig. 3] to the historical spectral response [from which the latent variables/feature information are generated by the second model when the historical spectral response/second information is input thereto, see Fig. 3], and if the similarity between the two is at or above a final threshold, the comparer 390 instructs the neural network 370 to freeze).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Bousmalis to classify the second information with the first model such that the intermediate information thereof approximates the features generated by the second model, as disclosed by Misra, and an ordinary artisan could reasonably expect to have done so successfully.  
Doing so would allow the system to determine the most important features of the information provided to it, thereby aiding it in providing a more efficient classification.  See Misra, paragraph 93 (latent variables encoded by the deep neural network model 310 correspond to the dominant features of the historical spectral response).

Regarding claim 9, Bousmalis, as modified by Misra and Ducau, discloses that “the processor is programmed to: 
acquire, as the first model, a model including: 
a first encoder that outputs first encoded information by encoding first information when the first information is input (Bousmalis Fig. 1 shows a private target encoder $E_p^t(x^t)$ [first encoder] that outputs $h_p^t$ [encoded first information] when information from target domain $x^t$ [first information] is input); 
a second encoder that outputs second encoded information by encoding second information when the second information is input (Bousmalis Fig. 1 shows a private source encoder $E_p^s(x^s)$ [second encoder] that outputs $h_p^s$ [encoded second information] when information from source domain $x^s$ [second information] is input); 
Application No. 16/250,460
a third encoder that outputs, when the first information and the second information are input, third encoded information by encoding the first information and fourth encoded information by encoding the second information (Bousmalis Fig. 1 shows a shared encoder $E_c(x)$ [third encoder] that outputs $h_c^s$ and $h_c^t$ [encoded fourth and third information, respectively] when information from both the source domain $x^s$ and the target domain $x^t$ [second and first information] are input); 
a decoder that generates first decoded information from the first encoded information and the third encoded information and generates second decoded information from the second encoded information and the fourth encoded information (Bousmalis Fig. 1 shows a shared decoder that outputs $\hat{x}^t$ and $\hat{x}^s$ [decoded first and second information, respectively] from the combination of $h_c^t$ and $h_p^t$ [third and first encoded information] and from the combination of $h_c^s$ and $h_p^s$ [fourth and second encoded information], respectively); and 
a classifier that includes a plurality of intermediate layers and generates, from the fourth encoded information, classification information indicating a classification result of the second information (Bousmalis Fig. 1 shows a classifier with at least two intermediate layers that outputs class $\hat{y}^s$ of source information [classification result of second information] upon inputting $h_c^s$ [fourth information]), and 
train the first model such that the first information and the first decoded information become more similar to one another, the second information and the second decoded information become more similar to one another (Bousmalis, last paragraph before sec. 3.1 states that each input datum x [information] is mapped to a hidden representation h, which is then decoded through a decoding function that maps h to a reconstruction $\hat{x}$ [decoded information, which is similar to the original information]; see also Fig. 1 [showing that $x^t$ [first information] is mapped to $\hat{x}^t$ [first decoded information] and $x^s$ [second information] is mapped to $\hat{x}^s$ [second decoded information]]; see also abstract (disclosing that the model is trained to reconstruct the images from both the source and the target domains), sec. 3.1, first paragraph (showing that the reconstruction loss between the input information and the reconstructed input information is minimized – i.e., the two are made more similar to one another)), and information output from a predetermined intermediate layer among the intermediate layers included in the classifier becomes closer to the feature information (Bousmalis sec. 3.1, first paragraph, notes that the classification loss Ltask trains the model to predict the output labels of interest; the negative log-likelihood of the ground truth class is minimized for each source domain sample [i.e., the likelihood of $\hat{y}_i^s$, the output of the classifier [information output from an intermediate layer] being the same as $y_i^s$, the one-hot encoding of the class label for the corresponding source input [feature information], is maximized, or $\hat{y}_i^s$ “becomes closer” to $y_i^s$; note that for purposes of examination, Examiner considers all information relating to features, regardless of from which portion of the model they come, to be “feature information”]), similarity of information being determined by vector distance (Bousmalis sec. 3.1, first paragraphs and equations 3-4 show that the reconstruction loss is defined in terms of an L2 norm [distance measure] of the difference between the original and reconstructed inputs).”
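
The two training objectives relied on in the mapping above (a reconstruction loss measured by vector distance and a classification loss measured by negative log-likelihood) can be sketched as follows. This is a minimal illustration only, assuming plain Python lists as vectors. The function names `l2_reconstruction_loss` and `nll_task_loss` are hypothetical and do not appear in Bousmalis, and the sketch uses a plain squared L2 distance rather than the exact form of Bousmalis equations 3-4.

```python
import math


def l2_reconstruction_loss(x, x_hat):
    """Squared L2 distance between an input vector and its reconstruction.

    Assumption: a plain squared L2 norm stands in for the reconstruction
    loss; it is not the exact loss of the cited reference.
    """
    if len(x) != len(x_hat):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def nll_task_loss(y_onehot, y_pred):
    """Negative log-likelihood of the ground-truth class.

    y_onehot: one-hot encoding of the class label.
    y_pred:   predicted class probabilities (assumed to sum to 1).
    """
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_onehot, y_pred))


# Hypothetical values, chosen only to show the direction of each loss.
recon_far = l2_reconstruction_loss([0.0, 0.0], [3.0, 4.0])
recon_near = l2_reconstruction_loss([0.0, 0.0], [0.3, 0.4])
task_far = nll_task_loss([0, 1], [0.5, 0.5])
task_near = nll_task_loss([0, 1], [0.1, 0.9])
```

In this sketch a smaller loss value corresponds to the reconstruction, or the predicted class distribution, being closer to its target, which is the sense of "become more similar" and "becomes closer" used in the mapping.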

Regarding claim 12, Bousmalis, as modified by Misra and Ducau, discloses that “the predetermined hidden intermediate layer of the second model is a hidden layer that outputs information with a smallest number of dimensions among the intermediate layers included in the second model (Ducau, section entitled “Denoising autoencoders (dAE)”, states that the intermediate layer of the network has dimensionality that is much lower than the dimensionality of the input; Figure 1 shows that z is the layer with the lowest dimensionality among all the layers of the autoencoder).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Bousmalis and Misra to extract information from the hidden layer of an autoencoder that has the lowest dimensionality, as disclosed by Ducau, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a more compact representation of the input data for passing to the other network, thereby saving processing power and computational time.  See Ducau, first paragraph under section entitled “Denoising Autoencoders (dAE).”
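
The selection of the hidden layer with the smallest number of dimensions can be illustrated with a short sketch. The helper name `bottleneck_index` and the layer sizes below are hypothetical and are not taken from Ducau; they only show how the lowest-dimensional hidden layer of an autoencoder would be identified from a list of hidden-layer output sizes.

```python
def bottleneck_index(hidden_dims):
    """Return the index of the hidden layer with the fewest output dimensions.

    hidden_dims: output sizes of the hidden (intermediate) layers only,
    excluding the input and output layers. Ties resolve to the first
    such layer.
    """
    if not hidden_dims:
        raise ValueError("at least one hidden layer is required")
    return min(range(len(hidden_dims)), key=lambda i: hidden_dims[i])


# Hypothetical autoencoder: the encoder narrows to a code layer z,
# and the decoder widens back out.
example_dims = [512, 64, 8, 64, 512]
z_layer = bottleneck_index(example_dims)
```

Here `z_layer` is the position of the code layer, whose output is the most compact representation available to pass to another network.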

Claim 13 is a method claim corresponding to apparatus claim 1 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 14 is a non-transitory computer-readable storage medium claim corresponding to apparatus claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Bousmalis in view of Misra and Ducau and further in view of Tao et al. (US 20200104288) (“Tao”).
Regarding claim 5, Bousmalis/Misra/Ducau appear not to disclose explicitly the further limitations of the claim.  However, Tao discloses that “the processor is programmed to train the first model by using the first information indicating a feature of a user, the second information indicating a selection target3 selected by a user having a feature indicated by the first information, and the third information indicating a selection target different from the selection target indicated by the second information (machine learning algorithm for recommendation system can be trained based on a history of the user’s prior activities, including prior selection of information contents and recommendation information related to the information item – Tao, paragraph 15; system may assign a larger weight to the features representing the user’s prior use of a portal to reflect that the user is more likely to select a particular information content – id. at paragraph 31 [particular information content previously selected by user = selection target/second information; user history of selecting content = first information/user feature]; set of features can be in the form of a multi-dimensional vector, and each feature can be associated with a particular prior online activity – id. at paragraph 29 [second feature/information content representing prior online activity = third information]).”  
Tao and the instant application both relate to recommendation systems and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Bousmalis/Misra/Ducau to base the training of the network on multiple user-selected features associated with the user himself, as disclosed by Tao, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to provide more accurate recommendations to the user by taking into consideration the user’s own expressed selections.  See Tao, paragraph 31.

Response to Arguments
Applicant's arguments filed September 12, 2022 (“Remarks”) have been fully considered but they are not persuasive.
Applicant’s sole substantive argument is that the Bousmalis/Misra/Ducau combination allegedly does not disclose where or how the feature information is input into the first model.  Remarks at 7-8.  However, the only portion of the independent claims as currently written that indicates where or how the input information is input into the first model is the recitation that the feature information is “input into [a] term” that is used for “training the first model”.  As an initial observation, looking at Figure 1 and equation 8 on page 38 of the specification as originally filed, it would appear that more properly the claim should read that the intermediate information (hi) is input into a function that outputs a term (Litem).  So construed, Ducau discloses the limitation.  Ducau discloses that a sample z drawn from the most intermediate layer of an autoencoder is input into a discriminator that outputs a probability that z comes from a true prior p(z).  The use of a term to train the first model is taught by Bousmalis, which shows that the loss Lsimilarity between source data samples $h_c^s$ and target data samples $h_c^t$ is used to train the model such that the source data samples are similar to the target data samples.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849.  The examiner can normally be reached on M-R 7:50a-5:50p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/R.C.V./Examiner, Art Unit 2125
/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125

        1 The claims, while not legally indefinite, are not the model of clarity.  Examiner is construing the claims as best understood and will endeavor to provide claim interpretations whenever possible.  In particular, any data or information will be deemed to read on the “first information,” “second information,” etc., so long as they are separate from each other.  
        2 It is unclear how information output by the model can be “predetermined” in advance of the model running.  For purposes of examination, this limitation will be construed as meaning that the output information is determined by the model.
        3 The specification does not define “selection target,” and Examiner can find no evidence that it was an accepted term of art before the effective filing date.  For purposes of examination, any feature input to a model that has been selected by a user will be deemed a “selection target.”