Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This office action is in response to communication filed 4/23/2020. Claims 1-20 are currently pending and claims 1, 16, and 20 are the independent claims. 

Specification
The abstract of the disclosure does not commence on a separate sheet in accordance with 37 CFR 1.52(b)(4) and 1.72(b). A new abstract of the disclosure is required and must be presented on a separate sheet, apart from any other text.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-6, 9-14, and 16-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bluche (CA 2,963,808 A1).

As per claim 1, Bluche anticipates: a neural network system implemented on one or more computers and for generating a data item, the neural network system comprising: 
a causal convolutional neural network configured to generate a data item by, at each of a plurality of iterations, generating a value of the data item conditioned upon values of the data item previously generated at previous iterations (pars. [0030], [0040], [0043], [0058], neural network/convolutional neural network (causal convolutional neural network) have multiple layers, and iteratively produce output/feature maps/attention maps/etc. (generate data item) in which output/data item from the network is used as input for a next iteration/layer/etc. such that output/data item from the network is based on output from a previous iteration/timestep/etc. (output/data item is generated at a plurality of iterations and is conditioned upon values of data item/output previously generated at previous iteration).); 
a support memory configured to store data representing a set of support data patches for generating the data item (pars. [0035], [0049], data including attention weight vectors/map/etc. (data representing support data patches for generating data item) is stored in and retrieved from storage unit (support memory).); and 
a soft attention subsystem configured to, at each of the plurality of iterations, determine a soft attention query vector dependent upon the previously generated values of the data item (pars. [0049]-[0052], attention mechanism computes attention weights for attention weight vectors/map of attention weights/etc. (soft attention query vector) and stores them in storage unit, and scores/weights are determined by incorporating attention map/state vector/etc. (values of data item) from previous iteration/timestep (soft attention query vector is dependent upon previously generated values of the data item).), 
wherein the soft attention query vector defines a set of scores for the support data patches for generating the value of the data item at the iteration (pars. [0046], [0050]-[0052], [0076], attention weight vector/map/etc. (support data patch) have attention weight determined based on scalar score (defines scores for support data patches/attention weight vector/maps/etc.), incorporate attention map/vectors from previous iteration and provide content and context for image based on determined feature maps (for generating value of data item at the iteration).); and 
wherein one or more layers of the causal convolutional neural network are conditioned upon a combination of the support data patches weighted by the scores (fig. 3A, pars. [0030], [0040], [0043], [0049]-[0056], convolutional neural network has several layers iteratively producing output/feature maps/attention weight/vectors/maps/etc., output/attention weight vectors/etc. (output, support data patches weighted by scores, etc.) from a previous iteration/layer is input to a next iteration/layer. As output/feature maps/attention weight vectors/maps/etc. are iteratively produced by layers of the neural network such that output/support data patches/feature maps/attention weight vectors/etc. output from previous iteration/layer is input to next iteration/layer, the layers of the neural network/causal convolutional neural network are conditioned upon/utilize/etc. a combination of the support data patches weighted by the scores/use the attention weight vectors/maps from previous iteration/layer which is based on weight vector/map from earlier iteration/layer/etc., and as such is conditioned upon a combination of the support data patches weighted by the scores/utilizes an attention weight vector/map resulting from/based on/etc. weight vectors/maps from previous iteration/layers/etc.).
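
For illustration only, the final two limitations of claim 1 (a query vector that defines scores for the support data patches, and a combination of the patches weighted by those scores) can be sketched as below. This is a minimal sketch under stated assumptions and is not a characterization of Bluche or of the applicant's disclosure: the dot-product scoring, the softmax normalization, and the names `softmax`, `soft_attention`, `query`, `keys`, and `values` are all hypothetical choices made for the example.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(query, keys, values):
    """Score each support patch against the query and return
    (weights, context), where context is the weighted sum of patch values.

    Assumption: scores are dot products of the query with each patch key.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical toy data: three support patches with 2-dimensional keys and values.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
values = [[2.0, 0.0], [0.0, 4.0], [6.0, 0.0]]
weights, context = soft_attention(query, keys, values)
```

In this sketch the `context` vector is what a layer of the generating network would be conditioned upon at the current iteration; patches whose keys match the query more closely contribute more to it.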

As per claim 2, Bluche further anticipates: wherein the support data patches each have a respective support data patch key, and wherein the soft attention subsystem is configured to, at each of the plurality of iterations, combine an encoding of the previously generated values of the data item and the support data patch key for each of the support data patches to determine the soft attention query vector (pars. [0043]-[0044], attention mechanism/soft attention subsystem is implemented using neural network and determines attention weight vectors/scores/etc., and feature maps and attention weight maps/vectors (support data patch having support data patch key) from previous timestep/iteration/layer/etc. are inputs to next timestep/iteration/layer of neural network which determines feature maps/attention weight map/vectors/etc. for that layer/iteration/timestep/etc. As attention weight and feature maps/vectors/etc. (data patches having support data patch key) from each layer/iteration/timestep/etc. is used as input to a next layer/iteration/timestep in a neural network implementing attention mechanism/soft attention subsystem to determine attention weight and feature vectors/maps/etc., each of the plurality of iterations/layers/timesteps combine an encoding of previously generated values of the data items and support data patch key for each of the support data patches to determine soft attention query vector/attention weight vector for each iteration/layer/timestep is determined using feature and attention maps/vectors previously generated in previous iterations/layers/timesteps.).

As per claim 3, Bluche further anticipates: wherein the encoding of the previously generated values of the data item comprises a set of features from a layer of the causal convolutional neural network (pars. [0043], attention weights/feature vectors/etc. are generated/computed/encoded/etc. by neural network layer/layer of causal convolutional neural network based on inputs of feature maps, the map of attention weights at previous timestep, state vectors at previous timestep, etc. from previous iteration/layer/timestep (set of features from a layer/previous iteration/timestep/etc. of the causal convolutional neural network).).
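
For illustration only, the term "causal convolutional" as used in claims 1 and 3 can be sketched as a one-dimensional convolution whose output at a position depends only on the current and earlier inputs. This is a minimal sketch under stated assumptions, not a characterization of Bluche or of the applicant's disclosure: the left zero-padding scheme and the names `causal_conv1d`, `sequence`, and `kernel` are hypothetical.

```python
def causal_conv1d(sequence, kernel):
    """Return features where output[t] depends only on sequence[t-k+1 .. t].

    Assumption: the input is left-padded with zeros so that no output
    position can see values generated after it.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(sequence)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(sequence))
    ]

# Changing only the last input leaves all earlier outputs unchanged.
features_a = causal_conv1d([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])
features_b = causal_conv1d([1.0, 2.0, 3.0, 100.0], [0.5, 0.5])
```

The features produced this way are one possible form of the "encoding of the previously generated values" recited in claim 3.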

As per claim 4, Bluche further anticipates: wherein the support data patches each has a respective support data patch value encoding content of the support data patch, wherein the soft attention mechanism is configured to, at each of the plurality of the iterations, determine an attention-controlled context function from a combination of the support data patch values weighted by the scores, and wherein one or more layers of the causal convolutional neural network are conditioned upon the attention-controlled context function (pars. [0043]-[0044], attention weight vector/map/etc. (support data patches) have weight/score/etc. (value) determined by neural network implementing attention mechanism and based on an attention function, feature maps, attention weights at previous timestep/layer/iteration, state vector at previous timestep/layer/iteration, etc. (encoding content of support data patch), and soft attention mechanism determines an attention-controlled context function/attention function/etc. from a combination of the support data patch values weighted by the scores, and wherein one or more layers of the causal convolutional neural network are conditioned upon the attention-controlled context function (from attention function/attention weights/feature maps/state vectors/etc. from previous/one or more/etc. iterations/layers/timestep of neural network/layers are conditioned by attention controlled context function).).

As per claim 5, Bluche further anticipates: wherein the support data patches comprise a plurality of different encodings of each of one or more support data items (par. [0039], encoder unit encodes received image/extracts features/etc. and determines a number of feature vectors/feature maps/etc. (plurality of different encodings/feature vectors/maps/etc. of support data items).).

As per claim 6, Bluche further anticipates: wherein the iteratively generated values of the data item define respective positions associated with the values of the data item, and wherein the support data patches span a range of said positions (pars. [0030], [0038]-[0039], [0043]-[0044], image has height and width dimensions and output/score/attention weight/feature vector/value of data item/support data patch is determined/computed/etc. for each position of image (values of data item define positions associated with the values of the data item/positions in image/etc., and support data patches span a range of said positions/are computed for each position of image).).

As per claim 9, Bluche further anticipates: wherein the support data patches comprise encodings of a plurality of support data items, and wherein the support data patches each include a channel identifying a respective support data item or set of support data items (pars. [0015], [0038]-[0039], image/document has width and height dimensions determined in terms of pixels/pixel values/etc. and includes plurality of characters, plurality of feature maps/feature vectors/etc. (support data patches) each corresponding to a character are determined and encoded including extracting features and determining feature vectors/maps having dimensions (support data patch include channel identifying respective/corresponding support data item).).

As per claim 10, Bluche further anticipates: wherein the one or more layers of the causal convolutional neural network are further conditioned upon global feature data, wherein the global feature data defines global features for the data item, and wherein the global feature data is derived from one or more of the support data patches (pars. [0050]-[0052], location based attention mechanism is incorporated by determining previous position from attention map from previous timestep/iteration/layer/etc. and state vector from previous timestep/layer/iteration/etc. and incorporates content based attention mechanism based on all feature maps and state vector from previous timestep/layer/iteration, and attention weights are determined based on context of whole image (global feature data defines global features for the data item and is derived from one or more of the support data patches).).

As per claim 11, Bluche further anticipates: wherein the support data patches comprise encodings of one or more support data items, the system further comprising an input to receive the one or more support data items and an encoder to encode the one or more support data items into the global feature data, wherein the global feature data represents one or more features of the one or more support data items (pars. [0010], [0034], [0050]-[0052], images are transmitted to input unit, image is encoded, features are extracted from the image, feature maps based on extracted features are determined, the location based attention mechanism is incorporated by determining previous position in image from attention map from previous timestep/iteration/layer/etc. and state vector from previous timestep/layer/iteration/etc. and incorporates content based attention mechanism based on all feature maps and state vector of image from previous timestep/layer/iteration, and attention weights are determined based on context of whole image.).

As per claim 12, Bluche further anticipates: wherein the causal convolutional neural network comprises one or more causal convolutional network layers coupled to an output layer, wherein the output layer is configured to generate an output defining a distribution of predicted values for the data item at an iteration, the neural network system further comprising a selection module to select a value of the data item for a current iteration dependent upon the distribution of predicted values, and to provide the selected value for the causal convolutional neural network to use in a subsequent iteration (pars. [0058]-[0062], probability of next character in image is determined and stored in probability vector having dimensions indicating likelihood/probability that the next character in image is the character represented by that dimension and the dimension with the highest value is determined to be next character (output layer generates output defining a distribution of predicted values/probabilities/likelihood for the data item at an iteration, and value of the data item is selected/predicted/etc. for a current iteration dependent upon the distribution of predicted values/probability/likelihood), and output of neural network is based on output from previous timestep/iteration/layer/probability vector at timestep/iteration is determined based on vectors from previous timestep/iteration/etc. (network layers/previous layer/iteration/timestep/etc. coupled to output layer and output/data item/predicted value/selected value is provided for use in subsequent iteration/timestep/layer).).
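
For illustration only, the selection module of claim 12 as mapped above (the dimension of the probability vector with the highest value is selected and provided for use in the next iteration) can be sketched as below. This is a minimal sketch under stated assumptions, not a characterization of Bluche or of the applicant's disclosure: `select_value`, `generate`, and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for the network's output layer, not a trained model.

```python
def select_value(distribution):
    # Pick the index with the highest predicted probability (argmax).
    return max(range(len(distribution)), key=lambda i: distribution[i])

def generate(predict, length):
    """Iteratively build a data item; each selected value is appended to the
    history that the next call to `predict` is conditioned upon."""
    values = []
    for _ in range(length):
        distribution = predict(values)
        values.append(select_value(distribution))
    return values

def toy_predict(previous):
    # Hypothetical stand-in for the output layer: favours the value that
    # follows the last generated one, modulo 3 (or 0 on the first iteration).
    nxt = (previous[-1] + 1) % 3 if previous else 0
    return [0.8 if i == nxt else 0.1 for i in range(3)]

generated = generate(toy_predict, 5)
```

Selecting the maximum is only one way to choose a value "dependent upon the distribution"; sampling from the distribution would fit the same loop.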

As per claim 13, Bluche further anticipates: wherein the causal convolutional neural network is configured to generate successive groups of values for the data item, wherein the successive groups of values are at successively higher resolution, and wherein the causal convolutional neural network is configured to iteratively generate values of the data item for one group conditioned upon previously generated values of the data item for one or more previously generated lower resolution groups (pars. [0030], [0040], [0043], [0061], output is arranged in feature maps and output from a first iteration/layer/timestep of neural network is input to next iteration/layer/timestep of neural network, and with each iteration more and more complex representations of the image are extracted (successive groups of values are generated for data item at successively higher resolution by neural network iteratively generating values of the data item/output/feature maps/etc. for one group conditioned upon previously generated values of the data item for one or more previously generated lower resolution groups/previous iterations/layers/timestep/etc.).).

As per claim 14, Bluche further anticipates: wherein the values of the data item comprises one or more of: pixel values of a still or moving image, audio signal values, and values representing a text string (pars. [0038], image (still or moving image) is determined in terms of pixels (pixel values), and image has dimensions/values/etc. representing the width and height of the image.).

As per claim 16, Bluche anticipates: a method of training a neural network system to encode a probability density estimate for a data item, the method comprising: 
training a convolutional neural network to iteratively generate a succession of values of a data item conditioned upon previously generated values of the data item (pars. [0030], [0040], [0043], [0058], neural networks/convolutional neural networks have several layers and are trained (train convolutional neural network) thereby defining receptors of the neural networks to iteratively output feature maps/vectors/data items where output from a previous iteration/layer/timestep/etc. is input to the next iteration/layer/timestep/etc. (iteratively generate a succession of values of a data item conditioned upon previously generated values of the data item).), 
wherein the training encodes a probability density estimate for the data item in weights of the convolutional neural network (pars. [0058]-[0063], neural network/convolutional neural network iteratively produces/outputs/determines/etc. probability of each character in image and stores probability value in probability vector having dimensions, each dimension comprising a probability value indicating the likelihood that the next character in image is the character represented by that dimension, and probability is determined using vectors from previous iteration/timestep/layer/etc.);
wherein the training further comprises: 
encoding support data from input data, the input data defining one or more examples of a target data item for the neural network system, to generate encoded support data (pars. [0038]-[0039], [0041], [0071], Appendix A pg. 5 section 5.2, image having dimensions is received/input/etc., encoder encodes input/received image using neural network and includes extracting features and determining feature vectors and maps having dimensions, and neural network/model/etc. is pre-trained/trained using input images/data defining examples of target data item.); 
encoding a combination of local context data derived from the previously generated values of the data item, and the encoded support data, to determine an attention-controlled context function, and conditioning one or more layers of the convolutional neural network upon the attention-controlled context function (pars. [0043]-[0044], neural network determines attention weight vector/map/etc. having weight/score/etc. (value) for each feature vector at every position in feature maps at timestep/iteration/layer/etc. by implementing attention mechanism and based on an attention function, feature maps, attention weights at previous timestep/layer/iteration, state vector at previous timestep/layer/iteration, etc. (determine attention-controlled context function by encoding combination of local context data from previously generated values at previous timestep/layer/iteration and encoded support data, and conditioning layers/iterations/timesteps/etc. on the attention-controlled context function/utilizing the attention-controlled context function in layer/iteration/timestep/etc.).).

As per claim 17, Bluche further anticipates: storing the encoded support data in memory coupled to the convolutional neural network; and querying the stored encoded support data using the attention-controlled context function (pars. [0030], [0033], [0035], [0043]-[0044], [0049], data including attention weight vectors/map/etc. (encoded support data for generating data item) having weight score based on attention function/feature maps/attention weights at previous iterations/etc. (using attention controlled context function) is stored in and retrieved from storage unit (storing encoded support data in and querying stored encoded support data from memory).).

As per claim 18, Bluche further anticipates: wherein determining the attention-controlled context function comprises learning a scoring function matching the local context data with the support data (pars. [0044], [0046], [0051], content based attention mechanism provides content/context of image based on all feature maps, attention weights of attention weight vector at coordinate corresponding to feature maps is determined based on score at coordinate in feature map, and score is determined based on attention function, feature maps altogether each having dimensions, etc. (attention-controlled context function comprises scoring function matching the local context data with the support data).).

As per claim 19, Bluche further anticipates: using the encoded probability density estimate to generate values for a further data item sampled from or predicted by the encoded probability density estimate (pars. [0058]-[0062], probability of next character in image (further data item predicted) is determined (generate values for further data item predicted) and stored in probability vector having dimensions indicating likelihood/probability that the next character in image is the character represented by that dimension and the dimension with the highest value is determined to be next character, and output of neural network is based on output from previous timestep/iteration/layer/probability vector at timestep/iteration is determined based on vectors from previous timestep/iteration/etc. (probability/value/output/etc. of next character/further data item predicted is generated/determined using encoded probability density estimate).).

As per claim 20, it recites a non-transitory computer-readable storage media having similar limitations to the neural network system of claim 1, and is therefore rejected for the same reasoning as claim 1, above. 

Allowable Subject Matter
Claims 7-8 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
The prior art of record teaches a trained convolutional neural network that generates a data item/output/etc. by iteratively generating a value of the data item using/conditioned on/etc. values of the data item/output generated/output/etc. in previous iterations, that stored/encoded/etc. support data patches are used to generate the data item/output/etc., that attention vectors having a weight/score/etc. for the support data patches are used/determined/etc. in each iteration of the neural network and are dependent on values of the data item/output from previous iterations, and that layers of the neural network are conditioned using a weighted combination of support data. However, the prior art of record fails to render obvious the support data comprising features of a convolutional neural network encoding the support data when the support data patches comprise a plurality of different encodings of each of one or more support data items, as required by dependent claims 7 and 8; or the values of the data item comprising one or more of: pixel values of a still or moving image, audio signal values, and values representing a text string, wherein the support data patches comprise an encoding of data of the same type as a data item, and wherein the convolutional neural network is further conditioned on an encoding of data of a different type to that of the data item, as required by dependent claim 15.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS M SLACHTA whose telephone number is (571)270-0653. The examiner can normally be reached Monday-Friday 6:30am-4pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached at 571-272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DOUGLAS M SLACHTA/Examiner, Art Unit 2193