DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/05/2018 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 3, 4, 7, 8, 10, 11, 14, 15, 17, 18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Umut Guclu and Marcel A. J. van Gerven, "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream," 2015, hereinafter referred to as Guclu, in view of S. E. Ahmed Raza, L. Cheung, D. Epstein, S. Pelengaris, M. Khan and N. M. Rajpoot, "MIMO-Net: A multi-input multi-output convolutional neural network for cell segmentation in fluorescence microscopy images," 2017, hereinafter referred to as Raza.
In regards to claim 1 Guclu teaches An artificial neural network, comprising: up-stream layers; down-stream layers, wherein an output of the up-stream layers is provided as input to the down-stream layers; a first input to the up-stream layers configured to receive input data; (Selectivity of voxels to individual feature maps reveals distributed representations 1st paragraph - For features of either low or high complexity this relationship tended to be spatially confined to either upstream or downstream visual areas, respectively. Materials and Methods 6th paragraph - In contrast, each artificial neuron in the fully connected layers took all features at all locations in the previous layer as its input.) and wherein the artificial neural network is configured to identify a classification of the information in the input data at an output of the down-stream layers using the context data. (Materials and Methods 11th paragraph - However, they have been trained on the same dataset (i.e., ImageNet) for the same task (i.e., object categorization).  Two of these DNNs have more than five convolutional layers (i.e., vgg-verydeep-16 and vgg-verydeep-19). To enable layer-wise comparison, we grouped the convolutional layers of these DNNs to have five groups and used the outputs of the last layer in a group as the outputs of the entire group.  Image decoding is driven by discriminative and categorical information - To examine to what extent decoding performance is driven by discrimination (identifying an image based on its unique characteristics) versus categorization (identifying an image based on categorical information), the following analysis was performed. We manually assigned each image in the test set to one of two categories (animate vs inanimate), as this appears to be the strongest categorical division in inferior temporal cortex (Khaligh-Razavi and Kriegeskorte, 2014))
but fails to teach a second input to the down-stream layers configured to receive context data, wherein the context data identifies a characteristic of information in the input data; 
However Raza teaches a second input to the down-stream layers configured to receive context data, wherein the context data identifies a characteristic of information in the input data; (2.2. The Proposed Network 3rd paragraph - The second input is added from the downsampling path for better localization and to capture the context information as in [9]. It also passes the convolution only features to the upsampling path, which helps to learn from the features which do not have maximum response in downsampling path.) 
Guclu and Raza are analogous art because they are in the same field of endeavor of neural networks.
It would have been obvious to one of ordinary skill prior to the effective filing date to combine the artificial neural network of Guclu and the second input of Raza, in order to capture context information. (Raza 2.2. The Proposed Network)
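For illustration only, and not as a characterization of either reference or of the applicant's disclosure, the arrangement recited in claim 1 can be sketched as below. The sketch assumes the second input is simply concatenated with the up-stream output before the down-stream layers; every weight, layer size, and function name is hypothetical.

```python
# Illustrative sketch only: layer sizes, weights, and names are hypothetical.

def relu(values):
    """Rectification applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def classify(input_data, context_data):
    # Up-stream layers: receive only the first input (the input data).
    up_weights = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
    up_biases = [0.0, 0.0]
    upstream_out = relu(dense(input_data, up_weights, up_biases))

    # Down-stream layers: receive the up-stream output together with the
    # second input (the context data), here by simple concatenation.
    downstream_in = upstream_out + context_data
    down_weights = [[1.0, -1.0, 2.0], [-1.0, 1.0, -2.0]]
    down_biases = [0.0, 0.0]
    scores = dense(downstream_in, down_weights, down_biases)

    # The classification is the index of the highest-scoring output node.
    return scores.index(max(scores))

sample = [1.0, 2.0, 0.5]
label_without_context = classify(sample, [0.0])
label_with_context = classify(sample, [1.5])
```

In this toy configuration the same input data yields a different class once the context value changes, which is the behavior the claim language "using the context data" describes.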
In regards to claim 3 modified Guclu teaches The artificial neural network of claim 1, wherein the artificial neural network is a convolutional neural network wherein the up-stream layers comprise convolutional layers and the down-stream layers comprise dense layers.  (Figure 1. DNN-based encoding framework. A, Schematic of the encoding model that transforms a visual stimulus to a voxel response in two stages. First, a deep (convolutional) neural network transforms the visual stimulus(x) to multiple layers of feature representations. Then, a linear mapping transforms a layer of feature representations to a voxel response(y). B, Schematic of the deep neural network where each layer of artificial neurons uses one or more of the following (non)linear transformations: convolution, rectification, local response normalization, max pooling, inner product, and softmax. C, Reconstruction of an example image from the activities in the first five layers.)
In regards to claim 4 modified Guclu teaches The artificial neural network of claim 1, wherein: the input data comprises image data; the information in the input data comprises an image of an object; and   the artificial neural network is configured to identify the classification of the object at the output of the down-stream layers using the context data.  (Materials and Methods 7th paragraph - The DNN was trained on 1.2 million augmented (by random crops, horizontal mirroring, and color jittering) natural images that are each labeled as 1 of 1000 object categories. The natural images were taken from the ImageNet (2012) dataset (Deng et al., 2009). Each input image was represented as a 224x224 matrix for each of three RGB color channels.  Materials and Methods 12th paragraph - Third, to test whether results are explained by optimizing the DNN for categorization, we compared its encoding performance with that of nine random DNNs that share the same architecture, but whose weights are drawn from a zero mean and unit variance multivariate Gaussian.)
In regards to claim 7 modified Guclu teaches The artificial neural network of claim 1, wherein the context data identifies a selected one of a temporal characteristic of the information in the input data, a spatial characteristic of the information in the input data, or a category of the information in the input data.  (Image decoding is driven by discriminative and categorical information - It was found that the correlation between the observed and predicted responses to an image was significantly higher than the mean correlation between the observed responses to the same image and the predicted responses to different images, regardless of their category (p 5e-13 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This points toward identification based on each image’s unique characteristics. For high-level voxels only, it was additionally found that the mean pairwise correlation between the observed and predicted responses to a pair of same category images was significantly higher than that of different category images (p 7e-25 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This indicates that for downstream areas, not only unique characteristics of an image, but also its semantic content is involved in response prediction.)
In regards to claim 8 modified Guclu teaches A method of identifying a classification of information, comprising: providing input data to a first input to an artificial neural network to up-stream layers of the artificial neural network, wherein the artificial neural network comprises down-stream layers, and wherein an output of the up-stream layers is provided as input to the down-stream layers; (Selectivity of voxels to individual feature maps reveals distributed representations 1st paragraph - For features of either low or high complexity this relationship tended to be spatially confined to either upstream or downstream visual areas, respectively. Materials and Methods 6th paragraph - In contrast, each artificial neuron in the fully connected layers took all features at all locations in the previous layer as its input.) and identifying a classification of the information in the input data at an output of the down-stream layers by the artificial neural network using the context data. (Materials and Methods 11th paragraph - However, they have been trained on the same dataset (i.e., ImageNet) for the same task (i.e., object categorization). Two of these DNNs have more than five convolutional layers (i.e., vgg-verydeep-16 and vgg-verydeep-19). To enable layer-wise comparison, we grouped the convolutional layers of these DNNs to have five groups and used the outputs of the last layer in a group as the outputs of the entire group. Image decoding is driven by discriminative and categorical information - To examine to what extent decoding performance is driven by discrimination (identifying an image based on its unique characteristics) versus categorization (identifying an image based on categorical information), the following analysis was performed. We manually assigned each image in the test set to one of two categories (animate vs inanimate), as this appears to be the strongest categorical division in inferior temporal cortex (Khaligh-Razavi and Kriegeskorte, 2014))
Raza teaches: providing context data to a second input to the artificial neural network to the down-stream layers, wherein the context data identifies a characteristic of information in the input data; (2.2. The Proposed Network 3rd paragraph - The second input is added from the downsampling path for better localization and to capture the context information as in [9]. It also passes the convolution only features to the upsampling path, which helps to learn from the features which do not have maximum response in downsampling path.)
In regards to claim 10 modified Guclu teaches The method of claim 8, wherein the artificial neural network is a convolutional neural network wherein the up-stream layers comprise convolutional layers and the down-stream layers comprise dense layers. (Figure 1. DNN-based encoding framework. A, Schematic of the encoding model that transforms a visual stimulus to a voxel response in two stages. First, a deep (convolutional) neural network transforms the visual stimulus(x) to multiple layers of feature representations. Then, a linear mapping transforms a layer of feature representations to a voxel response(y). B, Schematic of the deep neural network where each layer of artificial neurons uses one or more of the following (non)linear transformations: convolution, rectification, local response normalization, max pooling, inner product, and softmax. C, Reconstruction of an example image from the activities in the first five layers.)
In regards to claim 11 modified Guclu teaches The method of claim 8, wherein: the input data comprises image data; the information in the input data comprises an image of an object; and identifying the classification of the information in the input data comprises identifying the classification of the object at the output of the down-stream layers using the context data. (Materials and Methods 7th paragraph - The DNN was trained on 1.2 million augmented (by random crops, horizontal mirroring, and color jittering) natural images that are each labeled as 1 of 1000 object categories. The natural images were taken from the ImageNet (2012) dataset (Deng et al., 2009). Each input image was represented as a 224x224 matrix for each of three RGB color channels.  Materials and Methods 12th paragraph - Third, to test whether results are explained by optimizing the DNN for categorization, we compared its encoding performance with that of nine random DNNs that share the same architecture, but whose weights are drawn from a zero mean and unit variance multivariate Gaussian.)
In regards to claim 14 modified Guclu teaches The method of claim 8, wherein the context data identifies a selected one of a temporal characteristic of the information in the input data, a spatial characteristic of the information in the input data, or a category of the information in the input data. (Image decoding is driven by discriminative and categorical information - It was found that the correlation between the observed and predicted responses to an image was significantly higher than the mean correlation between the observed responses to the same image and the predicted responses to different images, regardless of their category (p 5e-13 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This points toward identification based on each image’s unique characteristics. For high-level voxels only, it was additionally found that the mean pairwise correlation between the observed and predicted responses to a pair of same category images was significantly higher than that of different category images (p 7e-25 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This indicates that for downstream areas, not only unique characteristics of an image, but also its semantic content is involved in response prediction.)
In regards to claim 15 modified Guclu teaches A method of identifying a classification of information, comprising: providing input data from an input data source to a first input to an artificial neural network to up-stream layers of the artificial neural network, wherein the artificial neural network comprises down-stream layers, and wherein an output of the up-stream layers is provided as input to the down-stream layers; (Selectivity of voxels to individual feature maps reveals distributed representations 1st paragraph - For features of either low or high complexity this relationship tended to be spatially confined to either upstream or downstream visual areas, respectively. Materials and Methods 6th paragraph - In contrast, each artificial neuron in the fully connected layers took all features at all locations in the previous layer as its input.) and identifying a classification of the information in the input data at an output of the down-stream layers by the artificial neural network using the context data. (Materials and Methods 11th paragraph - However, they have been trained on the same dataset (i.e., ImageNet) for the same task (i.e., object categorization). Two of these DNNs have more than five convolutional layers (i.e., vgg-verydeep-16 and vgg-verydeep-19). To enable layer-wise comparison, we grouped the convolutional layers of these DNNs to have five groups and used the outputs of the last layer in a group as the outputs of the entire group. Image decoding is driven by discriminative and categorical information - To examine to what extent decoding performance is driven by discrimination (identifying an image based on its unique characteristics) versus categorization (identifying an image based on categorical information), the following analysis was performed. We manually assigned each image in the test set to one of two categories (animate vs inanimate), as this appears to be the strongest categorical division in inferior temporal cortex (Khaligh-Razavi and Kriegeskorte, 2014))
Raza teaches: providing context data from a context data source to a second input to the artificial neural network to the down-stream layers, wherein the context data identifies a characteristic of information in the input data, and wherein the context data source is an independent data source that is different from the input data source; (2.2. The Proposed Network 3rd paragraph - The second input is added from the downsampling path for better localization and to capture the context information as in [9]. It also passes the convolution only features to the upsampling path, which helps to learn from the features which do not have maximum response in downsampling path.)
In regards to claim 17 modified Guclu teaches The method of claim 15, wherein the artificial neural network is a convolutional neural network wherein the up-stream layers comprise convolutional layers and the down-stream layers comprise dense layers. (Figure 1. DNN-based encoding framework. A, Schematic of the encoding model that transforms a visual stimulus to a voxel response in two stages. First, a deep (convolutional) neural network transforms the visual stimulus(x) to multiple layers of feature representations. Then, a linear mapping transforms a layer of feature representations to a voxel response(y). B, Schematic of the deep neural network where each layer of artificial neurons uses one or more of the following (non)linear transformations: convolution, rectification, local response normalization, max pooling, inner product, and softmax. C, Reconstruction of an example image from the activities in the first five layers.)
In regards to claim 18 modified Guclu teaches The method of claim 15, wherein: the input data comprises image data; the information in the input data comprises an image of an object; and identifying the classification of the information in the input data comprises identifying the classification of the object at the output of the down-stream layers using the context data.  (Materials and Methods 7th paragraph - The DNN was trained on 1.2 million augmented (by random crops, horizontal mirroring, and color jittering) natural images that are each labeled as 1 of 1000 object categories. The natural images were taken from the ImageNet (2012) dataset (Deng et al., 2009). Each input image was represented as a 224x224 matrix for each of three RGB color channels.  Materials and Methods 12th paragraph - Third, to test whether results are explained by optimizing the DNN for categorization, we compared its encoding performance with that of nine random DNNs that share the same architecture, but whose weights are drawn from a zero mean and unit variance multivariate Gaussian.)
In regards to claim 20 modified Guclu teaches The method of claim 15, wherein the context data identifies a selected one of a temporal characteristic of the information in the input data, a spatial characteristic of the information in the input data, or a category of the information in the input data.  (Image decoding is driven by discriminative and categorical information - It was found that the correlation between the observed and predicted responses to an image was significantly higher than the mean correlation between the observed responses to the same image and the predicted responses to different images, regardless of their category (p 5e-13 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This points toward identification based on each image’s unique characteristics. For high-level voxels only, it was additionally found that the mean pairwise correlation between the observed and predicted responses to a pair of same category images was significantly higher than that of different category images (p 7e-25 for both subjects, Bonferroni corrected for number of conditions, Student’s t test across test images within subjects). This indicates that for downstream areas, not only unique characteristics of an image, but also its semantic content is involved in response prediction.)

Claim(s) 2, 9, and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Umut Guclu and Marcel A. J. van Gerven, "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream," 2015, hereinafter referred to as Guclu, in view of S. E. Ahmed Raza, L. Cheung, D. Epstein, S. Pelengaris, M. Khan and N. M. Rajpoot, "MIMO-Net: A multi-input multi-output convolutional neural network for cell segmentation in fluorescence microscopy images," 2017, hereinafter referred to as Raza, and in further view of L. Hu, L. Qin, K. Mao, W. Chen and X. Fu, "Optimization of Neural Network by Genetic Algorithm for Flowrate Determination in Multipath Ultrasonic Gas Flowmeter," hereinafter referred to as Hu.
In regards to claim 2 modified Guclu teaches The artificial neural network of claim 1 but fails to teach wherein a bias of nodes in the down-stream layers changes in response to the context data.
However Hu teaches wherein a bias of nodes in the down-stream layers changes in response to the context data (A. GA Optimized ANN Methodology - Therefore, weights, biases and other parameters of ANN are not fixed and they will be adjusted to approximate the real accurate output adaptively according to flow conditions and piping configurations.)
Guclu and Hu are analogous art because they are in the same field of endeavor of artificial intelligence.  
It would have been obvious to one of ordinary skill prior to the effective filing date to combine the artificial neural network of Guclu, the second input of Raza, and bias of nodes in Hu, in order to optimize the artificial neural network. (Hu, Abstract)
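As a hedged illustration of one possible reading of the claim 2 limitation, and not a representation of Hu's genetic-algorithm method or of the applicant's disclosure, the sketch below shifts each down-stream node's bias by a weighted sum of the context values. All names, weights, and sizes are hypothetical.

```python
# Illustrative sketch only: all weights, sizes, and names are hypothetical.

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def context_adjusted_biases(base_biases, context_weights, context_data):
    """Shift each node's bias by a weighted sum of the context values."""
    return [b + sum(w * c for w, c in zip(row, context_data))
            for b, row in zip(base_biases, context_weights)]

upstream_out = [0.5, 2.5]                  # stand-in for the up-stream output
node_weights = [[1.0, -1.0], [-1.0, 1.0]]  # down-stream weights (fixed)
base_biases = [0.0, 0.0]
context_weights = [[2.0], [-2.0]]          # one row per down-stream node

no_context = dense(upstream_out, node_weights,
                   context_adjusted_biases(base_biases, context_weights, [0.0]))
with_context = dense(upstream_out, node_weights,
                     context_adjusted_biases(base_biases, context_weights, [1.5]))
```

The node weights stay fixed in this sketch; only the biases move with the context value. Arithmetically, this is the same as feeding the context into a dense layer alongside the up-stream output.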
In regards to claim 9 modified Guclu teaches The method of claim 8 but fails to teach wherein identifying the classification of the information in the input data comprises changing a bias of nodes in the down-stream layers in response to the context data.
However Hu teaches wherein identifying the classification of the information in the input data comprises changing a bias of nodes in the down-stream layers in response to the context data. (A. GA Optimized ANN Methodology - Therefore, weights, biases and other parameters of ANN are not fixed and they will be adjusted to approximate the real accurate output adaptively according to flow conditions and piping configurations.)
Guclu and Hu are analogous art because they are in the same field of endeavor of artificial intelligence.  
The same motivation and reason to combine apply as in claim 2.
In regards to claim 16 modified Guclu teaches The method of claim 15 but fails to teach wherein identifying the classification of the information in the input data comprises changing a bias of nodes in the down-stream layers in response to the context data.
However Hu teaches wherein identifying the classification of the information in the input data comprises changing a bias of nodes in the down-stream layers in response to the context data. (A. GA Optimized ANN Methodology - Therefore, weights, biases and other parameters of ANN are not fixed and they will be adjusted to approximate the real accurate output adaptively according to flow conditions and piping configurations.)
Guclu and Hu are analogous art because they are in the same field of endeavor of artificial intelligence.  
The same motivation and reason to combine apply as in claim 2.

Claim(s) 5, 12, and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Umut Guclu and Marcel A. J. van Gerven, "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream," 2015, hereinafter referred to as Guclu, in view of S. E. Ahmed Raza, L. Cheung, D. Epstein, S. Pelengaris, M. Khan and N. M. Rajpoot, "MIMO-Net: A multi-input multi-output convolutional neural network for cell segmentation in fluorescence microscopy images," 2017, hereinafter referred to as Raza, and in further view of K. J. Piczak, "Environmental sound classification with convolutional neural networks," 2015, hereinafter referred to as Piczak.

In regards to claim 5 modified Guclu teaches The artificial neural network of claim 1, but fails to teach wherein the input data comprises audio data and the information in the input data represents a sound.
However Piczak teaches wherein the input data comprises audio data and the information in the input data represents a sound. (SECTION 3.Sound Classification - The ESC-50 dataset is a collection of 2000 short (5 seconds) environmental recordings comprising 50 equally balanced classes of sound events in 5 major groups (animals, natural soundscapes and water sounds, human non-speech sounds, interior/domestic sounds, and exterior/urban noises) prearranged into 5 folds for comparable cross-validation.)
Guclu and Piczak are analogous art because they are in the same field of endeavor of artificial intelligence.  
It would have been obvious to one of ordinary skill prior to the effective filing date to combine the artificial neural network of Guclu, the second input of Raza, and the audio input data representing a sound of Piczak, in order to effectively implement convolutional neural networks in environmental sound classification tasks. (Piczak, Summary)
In regards to claim 12 modified Guclu teaches The method of claim 8 but fails to teach wherein the input data comprises audio data and the information in the input data represents a sound.
However Piczak teaches wherein the input data comprises audio data and the information in the input data represents a sound. (SECTION 3.Sound Classification - The ESC-50 dataset is a collection of 2000 short (5 seconds) environmental recordings comprising 50 equally balanced classes of sound events in 5 major groups (animals, natural soundscapes and water sounds, human non-speech sounds, interior/domestic sounds, and exterior/urban noises) prearranged into 5 folds for comparable cross-validation.)
Guclu and Piczak are analogous art because they are in the same field of endeavor of artificial intelligence.  
The same motivation and reason to combine apply as in claim 5.
In regards to claim 19 modified Guclu teaches The method of claim 15 but fails to teach wherein the input data comprises audio data and the information in the input data represents a sound.
However Piczak teaches wherein the input data comprises audio data and the information in the input data represents a sound. (SECTION 3.Sound Classification - The ESC-50 dataset is a collection of 2000 short (5 seconds) environmental recordings comprising 50 equally balanced classes of sound events in 5 major groups (animals, natural soundscapes and water sounds, human non-speech sounds, interior/domestic sounds, and exterior/urban noises) prearranged into 5 folds for comparable cross-validation.)
Guclu and Piczak are analogous art because they are in the same field of endeavor of artificial intelligence.  
The same motivation and reason to combine apply as in claim 5.

Claim(s) 6 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Umut Guclu and Marcel A. J. van Gerven, "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream," 2015, hereinafter referred to as Guclu, in view of S. E. Ahmed Raza, L. Cheung, D. Epstein, S. Pelengaris, M. Khan and N. M. Rajpoot, "MIMO-Net: A multi-input multi-output convolutional neural network for cell segmentation in fluorescence microscopy images," 2017, hereinafter referred to as Raza, and in further view of Lifeng Shang, Zhengdong Lu, and Hang Li, "Neural Responding Machine for Short-Text Conversation," 2015, hereinafter referred to as Shang.

In regards to claim 6 modified Guclu teaches The artificial neural network of claim 1, but fails to teach further comprising a context generator configured to generate the context data from the input data.
However Shang teaches further comprising a context generator configured to generate the context data from the input data.  (Abstract - Empirical study shows that NRM can generate grammatically correct and content-wise appropriate responses to over 75% of the input text, outperforming state-of-the-arts in the same setting, including retrieval-based and SMT-based models. Figure 2: The general framework and dataflow of the encoder-decoder-based NRM.  Figure 4 shows the graphical model of the RNN-encoder and related context generator for a global encoding scheme.)
Guclu and Shang are analogous art because they are in the same field of endeavor of artificial intelligence.  
It would have been obvious to one of ordinary skill prior to the effective filing date to combine the artificial neural network of Guclu, the second input of Raza, and the context generator of Shang, in order to adaptively focus on some important words of the input text according to the generated words of response. (Shang, Local Scheme)
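For illustration only, the claim 6 limitation of a context generator that derives the context data from the input data can be sketched as below. The generator shown (a coarse "bright or dark" flag computed from the mean input value) is a hypothetical stand-in; it is not Shang's RNN-based encoder, and the function name and threshold are assumptions.

```python
# Illustrative sketch only: the generator, its threshold, and all names
# are hypothetical and are not taken from any cited reference.

def generate_context(input_data, threshold=0.5):
    """Derive a one-element context vector from the input data itself.

    Returns [1.0] when the mean input value exceeds the threshold and
    [0.0] otherwise, i.e. a coarse characteristic of the input.
    """
    mean_value = sum(input_data) / len(input_data)
    return [1.0 if mean_value > threshold else 0.0]

bright_input = [0.9, 0.8, 0.7]
dark_input = [0.1, 0.2, 0.0]

# The generated context would then be supplied to the second input of the
# down-stream layers, alongside the up-stream output.
bright_context = generate_context(bright_input)
dark_context = generate_context(dark_input)
```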
In regards to claim 13 modified Guclu teaches The method of claim 8, but fails to teach further comprising generating the context data from the input data.
However Shang teaches further comprising generating the context data from the input data.  (Abstract - Empirical study shows that NRM can generate grammatically correct and content-wise appropriate responses to over 75% of the input text, outperforming state-of-the-arts in the same setting, including retrieval-based and SMT-based models. Figure 2: The general framework and dataflow of the encoder-decoder-based NRM.  Figure 4 shows the graphical model of the RNN-encoder and related context generator for a global encoding scheme.)
Guclu and Shang are analogous art because they are in the same field of endeavor of artificial intelligence.  
The same motivation and reason to combine apply as in claim 6.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LUIS ANGEL PEREZ whose telephone number is (571)272-2361. The examiner can normally be reached Monday-Friday 7:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/L.A.P./ Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122