DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 2/14/2018. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-12, 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems 27 (2014) [hereinafter Sutskever] in view of Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014) [hereinafter Cho] further in view of Ye, Borui, et al. "Learning question similarity with recurrent neural networks." 2017 IEEE International Conference on Big Knowledge (ICBK). IEEE, (2017) [hereinafter Ye] further in view of Das, Resul, Ibrahim Turkoglu, and Abdulkadir Sengur. "Effective diagnosis of heart disease through neural networks ensembles." Expert systems with applications 36.4 (2009): 7675-7680 [hereinafter Das].
Regarding claim 1, Sutskever teaches:
A neural network translation model constructing method performed in a computing device generating a neural network translation model, the neural network translation model constructing method comprising: generating a first neural network model and learns a feature of source domain data used in an unspecific field, generating a second neural network model and learns a feature of target domain data used in a specific field (Sutskever; 1 Introduction, page 2, paragraph 2 starting with “Sequences pose”; Sutskever teaches the use of two LSTM architectures. The first to read and process an input sequence, which can be mapped to the source domain data. The second to extract the output sequence. In combining Sutskever and Cho, we can keep Sutskever’s two network system and replace the LSTMs with the encoder-decoder model of Cho.),
Sutskever does not explicitly teach neural network translation models which includes a neural network having an encoder-decoder structure, generating a third neural network translation model which includes a neural network having the encoder-decoder structure and learns a common feature of the source domain data and the target domain data; and generating a combiner which combines translation results of the first to third neural network translation models.
Cho teaches:
neural network translation model which includes a neural network having an encoder-decoder structure (Cho; 2.2 RNN Encoder-Decoder, page 2, left column, paragraph 4 starting with “In this paper” to right column, paragraph 2 starting with “The decoder”, page 3, left column paragraph 2 starting with “Once the RNN”; The neural network model consists of an encoder-decoder structure, wherein the encoder is an RNN that reads an input sequence, and the decoder is an RNN that generates an output sequence. Cho’s model can then be used to generate a target sequence given an input sequence.);
Examiner notes that under the broadest reasonable interpretation and in light of the specification, the encoder and decoder components of the neural network translation model may be configured with a recurrent neural network. Furthermore, the input sequence in Cho reads on source domain data. The source domain can be the type of language of the input sequence.
neural network translation model which includes a neural network having the encoder-decoder structure (Cho; 2.2 RNN Encoder-Decoder, page 2, left column, paragraph 4 starting with “In this paper” to right column, paragraph 2 starting with “The decoder”, page 3, left column paragraph 2 starting with “Once the RNN”; The neural network model consists of an encoder-decoder structure, wherein the encoder is an RNN that reads an input sequence, and the decoder is an RNN that generates an output sequence. Cho’s model can also be used to score given input and output sequences.);
Examiner notes that under the broadest reasonable interpretation and in light of the specification, the encoder and decoder components of the neural network translation model may be configured with a recurrent neural network. Furthermore, Cho’s model can take both input and output sequences. The output sequence maps to the target domain data. The target domain, under the broadest reasonable interpretation, can be the output sequence language type.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the dual-network system taught by Sutskever with the encoder-decoder neural network model taught by Cho. The encoder and decoder of Cho’s model are jointly trained to maximize the conditional probability of a target sequence given a source sequence (Cho; Abstract).
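For illustration only, the encoder-decoder structure described above can be sketched as follows. This is a minimal sketch and is not drawn from Cho or Sutskever: the function names `encode` and `decode`, the weights, and the simplified tanh recurrence are all assumptions chosen to show how an encoder reads an input sequence symbol by symbol into a summary hidden state that a decoder then conditions on.

```python
import math

def encode(sequence, w_in, w_rec):
    """Read the input symbols one at a time and return the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        hidden = [
            math.tanh(
                w_in[i] * x
                + sum(w_rec[i][j] * hidden[j] for j in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden

def decode(summary, w_out, steps):
    """Emit one value per step, conditioned on the summary and the previous output."""
    outputs, prev = [], 0.0
    for _ in range(steps):
        prev = math.tanh(sum(w * s for w, s in zip(w_out, summary)) + 0.5 * prev)
        outputs.append(prev)
    return outputs

# Hypothetical weights and input values, chosen only for illustration.
W_IN = [0.5, -0.3]
W_REC = [[0.1, 0.2], [0.0, 0.1]]
W_OUT = [1.0, -1.0]

summary = encode([1.0, 2.0, 3.0], W_IN, W_REC)
translation = decode(summary, W_OUT, steps=2)
```

Because the hidden state is updated at every symbol, the summary depends on the order of the input sequence, not only on its contents.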

Cho does not teach generating a third neural network translation model which includes a neural network having the encoder-decoder structure and learns a common feature of the source domain data and the target domain data; and generating a combiner which combines translation results of the first to third neural network translation models.
Ye teaches generating a third neural network translation model which includes a neural network having the encoder-decoder structure and learns a common feature of the source domain data and the target domain data (Ye; IV RNN Architecture, A. The RNN Encoder-Decoder Structure, page 11, right column paragraph 2 starting with “our model”, Table II; The RNN encoder-decoder that learns common meanings and/or word variations.);
Examiner notes that under the broadest reasonable interpretation, the common feature can be the common meanings or word variations between source and target domain data.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the dual neural network system taught by Sutskever in view of Cho [hereinafter Sutskever-Cho] with the RNN encoder-decoder that learns common meanings and word variations between two sequences, as taught by Ye. Ye’s method improves the accuracy of the classification and outperforms other traditional models (Ye; Abstract).
Ye does not teach generating a combiner which combines translation results of the first to third neural network translation models.
Das teaches:
and generating a combiner which combines translation results of the first to third neural network translation models (Das; 2.4 Ensemble based methods, page 7677, left column paragraph 3 starting with “The creation of”, 4 Experimental results and discussion, page 7678, left column paragraph 3, starting with “In this study”, Figure 3; Das’s approach is combining outputs of individual networks within the ensemble. Specifically, the ensemble consists of three independent neural network models.).
Examiner notes that the network taught by Das combines the results of three neural network models working in ensemble.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the neural network ensemble system taught by Sutskever-Cho in view of Ye [hereinafter Sutskever-Cho-Ye] with a combiner that combines the results of three network models working in ensemble, as taught by Das. Das’s method of creating new models working in ensemble produces more effective models with greater classification accuracy (Das; Abstract).
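For illustration only, a combiner of the kind discussed above can be sketched as a weighted average of the outputs of three models. The averaging rule, the function name `combine`, and the example probabilities are assumptions made for this sketch and are not taken from Das or from the claims.

```python
def combine(distributions, weights=None):
    """Weighted average of per-model probability distributions over the same candidates."""
    if weights is None:
        # Assumption: equal weight for every model in the ensemble.
        weights = [1.0 / len(distributions)] * len(distributions)
    size = len(distributions[0])
    return [
        sum(w * dist[i] for w, dist in zip(weights, distributions))
        for i in range(size)
    ]

# Hypothetical outputs of three models over the same three candidate words.
source_model = [0.7, 0.2, 0.1]
target_model = [0.2, 0.6, 0.2]
common_model = [0.6, 0.3, 0.1]

combined = combine([source_model, target_model, common_model])
best = max(range(len(combined)), key=combined.__getitem__)
```

With equal weights, the combined distribution still sums to one, and the candidate favored by two of the three models is selected.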
Regarding claim 2, Das teaches:
The neural network translation model constructing method of claim 1, wherein based on a combination operation of the combiner, the neural network translation model functions as an ensemble model obtained by a combination of the translation results of the first to third neural network translation models (Das; 2.4 Ensemble based methods, page 7677, left column paragraph 3 starting with “The creation of”, 4 Experimental results and discussion, page 7678, left column paragraph 3, starting with “In this study”, Figure 3; Das’s approach is combining outputs of three individual neural network models.).
Examiner notes that Das uses the ensemble to process medical data. If the individual networks within the ensemble were replaced by the networks taught by Sutskever-Cho-Ye, then the ensemble system of Das would still work, but instead of processing medical data, it would be used for neural machine translation.
The rationale for combining the teachings of Sutskever-Cho-Ye and the teachings of Das is the same rationale previously stated under the rejection for claim 1. 
Regarding claim 3, Ye teaches:
The neural network translation model constructing method of claim 1, wherein in the generating of the third neural network translation model, the common feature is a feature where a word distribution and meaning of a sentence expressed by the source domain data are similar to a word distribution and meaning of a sentence expressed by the target domain data (Ye; III The Annotated Corpus, page 112 right column, paragraph 5 starting with “After crawling”, Table 2; The neural network model taught by Ye assigns “similar” label to pairs where the first statement and second statement have similar meaning and word distribution.).
Examiner notes that under the broadest reasonable interpretation, the source would be the first, or query sequence, and the target would be the second, or candidate sequence. The similarity (either in the form of meaning or words) is found and labeled.
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
Regarding claim 4, Ye teaches:
The neural network translation model constructing method of claim 1, wherein the generating of the third neural network translation model comprises: generating an encoder outputting a common feature vector value obtained by encoding the common feature (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraph 2 starting with “Our model”; The encoder turns the query statement into a sentence representation vector which is output by the final time step’s hidden layer.);
Examiner notes that the entirety of the query statement is processed by the encoder. This means that any similarities, or common features, are encoded, since all contents are encoded.
generating a domain classifier classifying which of the source domain and the target domain the common feature vector value is included in (Ye; IV The RNN Similarity Training Architecture, page 114 right column paragraph 2 starting with “Our objective”; The RNN encoder-decoder can be used in both classification and ranking. The sentence pair could be labeled with the class of similarity that has the highest probability.);
Examiner notes that Ye’s model can output a classification for which type of similarity the two sequences (source and target) share.
and generating a decoder which decodes the common feature vector value classified by the domain classifier to output an output vector value corresponding to a translation result of the common feature (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraphs 2-3 starting with “Our model”; The decoder takes the sentence representation as one of the inputs for each time step so that the output of the decoder can be considered a similarity representation of the sentences. The decoder outputs a result for the two sentences.).
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
Regarding claim 5, Cho teaches: 
The neural network translation model constructing method of claim 4, wherein the generating of the encoder comprises encoding the common feature so that the domain classifier does not accurately classify which of the source domain and the target domain the common feature vector value is included in (Cho; 2.2 RNN Encoder-Decoder, right column paragraph 2 starting with “The encoder”; The encoder reads each symbol of an input sequence sequentially and the hidden state of the RNN changes as it is reading. The hidden state is the summary of the whole input sequence.).
Examiner notes that similar to a previous explanation, since the entire input sequence is processed, then any common features are also processed. Furthermore, since the summary of the input sequences is in a hidden state, then the domain classifier will not be able to accurately classify it.
Regarding claim 7, Ye teaches:
The neural network translation model constructing method of claim 1, wherein the generating of the third neural network translation model comprises: learning the common feature so that a common feature vector value corresponding to the common feature encoded by the encoder of the third neural network translation model and a source feature vector value corresponding to a feature of the source domain data encoded by the encoder of the first neural network translation model are vertical to each other in a vector space (Ye; A The RNN Encoder-Decoder Structure, page 113, right column paragraph 1 starting with “Our model”, page 114, left column paragraph 1 starting with “where o is”, Figure 2; RNN takes in a sentence representation. The representation consists of a list of individual word vectors. The similarity and sentence representation from the Encoder are turned into a sentence representation.);
Examiner notes that a list of vectors is essentially a longer, comprehensive vector (see Sentence Representation in Figure 2). The query sentence from the encoder is processed first before being moved onto the decoder processing. Vectors in this configuration are inherently vertical to each other, otherwise they would not be able to be processed;
and learning the common feature so that the common feature vector value corresponding to the common feature encoded by the encoder of the third neural network translation model and a target feature vector value corresponding to a feature of the target domain data encoded by the encoder of the second neural network translation model are vertical to each other in the vector space (Ye; A The RNN Encoder-Decoder Structure, page 113, right column paragraph 2 starting with “A concrete example”, Figure 2; RNN takes in a sentence representation. The representation consists of a list of individual word vectors. The similarity and sentence representation from the decoder is also used to compare the encoder representation.).
Examiner notes that as previously stated, the list of vectors is still a vector. The representation after processing is then compared and processed by the decoder where similarities are found.
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
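For illustration only, if "vertical to each other in a vector space" is read as perpendicular (orthogonal), which is an assumption of this sketch, the relationship between two feature vectors can be checked with an inner product. The function names and vector values below are hypothetical and are not taken from Ye or from the specification.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal(u, v, tol=1e-9):
    """True when the inner product is zero within a small tolerance."""
    return abs(dot(u, v)) <= tol

def orthogonality_penalty(u, v):
    """Squared inner product: zero exactly when u and v are orthogonal."""
    return dot(u, v) ** 2

# Hypothetical feature vectors, chosen only for illustration.
common_feature = [1.0, 0.0, 2.0]
source_feature = [0.0, 3.0, 0.0]
overlapping_feature = [1.0, 1.0, 0.0]
```

Under this reading, a penalty such as the squared inner product could be minimized during learning so that two encoders produce non-overlapping representations; two vectors merely existing in the same network would not by itself make the penalty zero.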
Regarding claim 8, Sutskever-Cho-Ye in view of Das [hereinafter Sutskever-Cho-Ye-Das] teaches all the limitations and motivations of claim 1 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 1 applies equally as well to those elements of claim 8. Claim 8 additionally recites an apparatus comprising: a processor generating and a storage unit storing. Sutskever teaches that the model runs on both single GPU processor machines and an 8-GPU machine. Machines that run single or multiple GPUs inherently have CPUs and storage units.
Regarding claim 9, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 3 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 3 applies equally as well to those elements of claim 9.
Regarding claim 10, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 4 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 4 applies equally as well to those elements of claim 10.
Regarding claim 11, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 5 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 5 applies equally as well to those elements of claim 11.
Regarding claim 12, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 6 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 6 applies equally as well to those elements of claim 12.
Regarding claim 15, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 7 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 7 applies equally as well to those elements of claim 15.
Regarding claim 16, Sutskever-Cho-Ye-Das teaches all the limitations and motivations of claim 2 in apparatus form rather than method form. Therefore, the supporting rationale of the rejection to claim 2 applies equally as well to those elements of claim 16.
Regarding claim 17, Ye teaches:
The neural network translation model constructing apparatus of claim 8, wherein the first neural network translation model comprises: based on a process of the processor, an encoder outputting a source feature vector value obtained by encoding a feature of the source domain data (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraph 2 starting with “Our model”; The encoder turns the query statement into a sentence representation vector which is output by the final time step’s hidden layer.);
Examiner notes that the entirety of the query statement is processed by the encoder. This means that any similarities, or common features, are encoded, since all contents are encoded.
and a decoder decoding the source feature vector value to output a source output vector value corresponding to a translation result of the source domain data (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraphs 2-3 starting with “Our model”; The decoder takes the sentence representation as one of the inputs for each time step so that the output of the decoder can be considered a similarity representation of the sentences. The decoder outputs a result for the two sentences.).
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
Regarding claim 18, Ye teaches: 
The neural network translation model constructing apparatus of claim 17, wherein the encoder of the first neural network translation model encodes the feature of the source domain data to be vertical to a common feature vector value obtained by the encoder of the third neural network translation model encoding the common feature, in a vector space (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraph 2 starting with “Our model”; The encoder turns the query sentence into a sentence representation, which is a large vector consisting of word vectors.).
Examiner notes that the encoder turns the query sentence into a sentence representation vector, which as previously noted, is a vector. Furthermore, as previously noted, since the encoder encodes the entire query sentence, then it also encodes the feature of the source domain. If the vector exists within the neural network environment, then it must be vertical for compatibility.
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
Regarding claim 19, Ye teaches:
The neural network translation model constructing apparatus of claim 8, wherein the second neural network translation model comprises: based on a process of the processor, an encoder outputting a target feature vector value obtained by encoding a feature of the target domain data (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraph 2-3 starting with “Our model”; The encoder turns the query statement into a sentence representation vector which is output by the final time step’s hidden layer. The candidate word sequence is fed into the RNN at each step);
Examiner notes that the entirety of the query statement is processed by the encoder. This means that any similarities, or common features, are encoded, since all contents are encoded. Examiner further notes that as previously stated, the candidate word sequence is the target domain, whereas the query word sequence is the source domain. In this case, the candidate (target) is fed into the RNN at each step.
and a decoder decoding the target feature vector value to output a target output vector value corresponding to a translation result of the target domain data (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraphs 2-3 starting with “Our model”; The decoder takes the sentence representation as one of the inputs for each time step so that the output of the decoder can be considered a similarity representation of the sentences. The decoder outputs a result for the two sentences. The candidate word sequence is fed into the RNN at each step.).
Examiner notes that as previously stated, the candidate word sequence is the target domain, whereas the query word sequence is the source domain. In this case, the candidate (target) is fed into the RNN at each step.
The rationale for combining the teachings of Sutskever-Cho-Ye has already been established under the rejection for claim 1.
Regarding claim 20, Ye teaches:
The neural network translation model constructing apparatus of claim 19, wherein the encoder of the second neural network translation model encodes the feature of the target domain data to be vertical to a common feature vector value obtained by the encoder of the third neural network translation model encoding the common feature, in a vector space (Ye; IV The RNN Similarity Training Architecture, page 113 right column paragraph 2 starting with “Our model”; The encoder turns the query sentence into a sentence representation, which is a large vector consisting of word vectors.).
Examiner notes that the encoder turns the query sentence into a sentence representation vector, which as previously noted, is a vector. Furthermore, as previously noted, since the encoder encodes the entire query sentence, then it also encodes the feature of the source domain. If the vector exists within the neural network environment, then it must be vertical for compatibility.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Sutskever-Cho-Ye-Das further in view of Ding, Shifei, et al. "Evolutionary artificial neural networks: a review." Artificial Intelligence Review 39.3 (2013) [hereinafter Ding].
Regarding claim 6, Ding teaches:
The neural network translation model constructing method of claim 5, wherein the encoding of the common feature comprises adjusting a connection weight connecting nodes of a neural network configuring the encoder so that the domain classifier does not accurately classify which of the source domain and the target domain the common feature vector value is included in (Ding; 2.1.1 The learning process in ANNs, page 252 paragraph 6 starting with “The learning process”; Optimization of neural networks includes adjusting connection weights.).
Examiner notes that optimization means working towards a desired result. Under the broadest reasonable interpretation, if the desired result is to make sure the domain classifier does not accurately classify, then adjusting connection weights can optimize the neural network towards that intended result.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the teachings of Sutskever-Cho-Ye-Das with the optimization of neural networks by adjusting connection weights, as taught by Ding. By adjusting weights, neural networks can be optimized (Ding; 2.1.1 The learning process in ANNs, page 252 paragraph 6 starting with “The learning process”).
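For illustration only, adjusting an encoder's connection weight so that a domain classifier classifies less accurately can be sketched as an update that moves the weight up, rather than down, the classifier's loss gradient. This is a minimal sketch under stated assumptions: a one-weight encoder, a fixed logistic classifier, and hypothetical sample values. It is not drawn from Ding, which is cited above only for weight adjustment in general.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def domain_loss(w, v, samples):
    """Mean cross-entropy of a logistic domain classifier on encoded features w * x."""
    total = 0.0
    for x, d in samples:
        p = sigmoid(v * w * x)
        total -= d * math.log(p) + (1 - d) * math.log(1.0 - p)
    return total / len(samples)

def adversarial_encoder_step(w, v, samples, lr):
    """Move the encoder weight up the classifier's loss gradient (reversed sign)."""
    grad = sum((sigmoid(v * w * x) - d) * v * x for x, d in samples) / len(samples)
    return w + lr * grad

# Hypothetical data: (input value, domain label), 1 = source and 0 = target.
samples = [(2.0, 1), (-2.0, 0)]
w, v = 1.0, 1.0  # encoder weight and fixed classifier weight (assumed values)

before = domain_loss(w, v, samples)
for _ in range(5):
    w = adversarial_encoder_step(w, v, samples, lr=0.5)
after = domain_loss(w, v, samples)
```

In this sketch the classifier's loss rises after the encoder updates, meaning the encoded features have become harder to assign to the source or target domain.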

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Sutskever-Cho-Ye-Das further in view of Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks." The journal of machine learning research 17.1 (2016) [hereinafter Ganin].
Regarding claim 13, Ganin teaches: 
The neural network translation model constructing apparatus of claim 10, wherein the processor executes the domain classifier in a learning process of the neural network translation model, and does not execute the domain classifier in an actual translation process performed by the neural network translation model (Ganin; 1 Introduction; Domain classifier discriminates between source and target domains during training.).
Examiner notes that the domain classifier is used to find similarities only during training. Under the broadest reasonable interpretation, the learning process of a neural network is the training process.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the teachings of Sutskever-Cho-Ye-Das with the domain adaptation taught by Ganin. Ganin’s domain adaptation approach achieved state-of-the-art performance on standard benchmarks (Ganin; Abstract).
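For illustration only, executing a domain classifier during learning but not during actual translation can be sketched as two separate code paths. The class name `CommonFeatureModel`, its methods, and the stand-in encoder, decoder, and classifier functions are hypothetical and are not taken from Ganin or from the claims.

```python
class CommonFeatureModel:
    """Toy model whose domain classifier runs only in the training path."""

    def __init__(self, encoder, decoder, domain_classifier):
        self.encoder = encoder
        self.decoder = decoder
        self.domain_classifier = domain_classifier
        self.classifier_calls = 0  # counts how often the classifier is executed

    def train_step(self, x):
        feature = self.encoder(x)
        self.classifier_calls += 1
        domain_score = self.domain_classifier(feature)
        return self.decoder(feature), domain_score

    def translate(self, x):
        # Inference path: the domain classifier is never invoked here.
        return self.decoder(self.encoder(x))

# Stand-in components (assumptions): simple arithmetic in place of real networks.
model = CommonFeatureModel(
    encoder=lambda x: 2 * x,
    decoder=lambda f: f + 1,
    domain_classifier=lambda f: 1 if f > 0 else 0,
)

train_output, domain_score = model.train_step(3)
calls_after_training = model.classifier_calls
translated = model.translate(3)
```

The call counter makes the distinction observable: it advances in `train_step` and is unchanged by `translate`.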

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Sutskever-Cho-Ye-Das further in view of Shi, Yangyang, et al. "Contextual spoken language understanding using recurrent neural networks." 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015 [hereinafter Shi].
Regarding claim 14, Shi teaches the neural network translation model constructing apparatus of claim 10, wherein the domain classifier is implemented with a plurality of hidden layers in a hierarchical structure of a neural network, based on a process of the processor (Shi; 3 Recurrent Neural Network Based Joint Training, page 5273, left column paragraph 3 starting with “In the proposed”; There are two hidden layers. Domain classification is used in the top hidden layer and the bottom hidden layer.).
Examiner notes that the use of top and bottom implies a hierarchical structure. Under the broadest reasonable interpretation, plurality of layers can mean two.
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the teachings of Sutskever-Cho-Ye-Das with the dual hidden-layer structure for domain classification taught by Shi. Shi’s method obtains state-of-the-art results and improved performance over baseline techniques (Shi; Abstract).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC WU whose telephone number is (571)272-3380. The examiner can normally be reached Monday-Friday between 9AM and 6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS, can be reached on (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ERIC C WU/
Examiner, Art Unit 2128

/LUIS A SITIRICHE/
Primary Examiner, Art Unit 2126