Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/14/2020 has been entered.

Status of Claims
This action is in reply to the amendments and remarks filed on 12/14/2020.
Claims 1-5, 7-13, and 15-20 are pending.
Claims 1, 8, and 9 have been amended.
Claim 21 has been canceled.

Response to Arguments
Applicant’s arguments, with respect to the rejection(s) of claim(s) 1, 8, and 9 under 35 U.S.C. 103, stating that “Sunkavalli does not disclose” the amended limitation that now states “the discretization layer is configured to generate a discretized network input comprising a respective discretized vector for each of the input one or more intensity values for each of the pixels in the image before the image is processed by additional neural network layers”, have been considered but are moot because the arguments do not apply to the current combination of references being used in the current rejection.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to 

Claims 1-5, 8-13, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Lu et al (“Effective Data Mining Using Neural Networks”, 1996) hereinafter Lu, in view of Sunkavalli et al (US Pub 20180260975) hereinafter Sunkavalli, in view of Amer et al (US Pub 20190094124) hereinafter Amer, and further in view of Zhao et al (US Pub 20190035118) hereinafter Zhao.
Regarding claims 1, 8, and 9, Lu teaches a method performed by one or more computers for generating a network output for a network input that includes an image, a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations, and one or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising (sections 1 and 2.1-2.2 teach utilizing databases and classification rules of neural networks, well known to be executed/stored on computers including one or more processors and coupled memories (CRM), for determining a network’s output based on inputs (generating a network output for a network input)):
receiving a network input for a neural network comprising a discretization layer followed by a plurality of additional neural network layers, the network input comprising a plurality of numeric values from a space of possible numeric values (sections 2.1-2.2 and Table 1 teach “datasets having nine attributes: salary, commission, age, elevel, car, zipcode, house-years, and loan (received network input comprising a plurality of numeric values from a space of possible numeric values)…a neural network was first constructed” and then once the network was formed, “[e]ach attribute was coded as a binary string for use as input to the network” (neural network comprising a discretization layer), where sections 1-2 and Fig. 1 teach the network further including multiple “layer[s]” (followed by a plurality of additional neural network layers)), wherein the numeric values are floating point values and include one or more respective intensity values for each pixel in the image; 
processing the network input using the discretization layer (section 2.2 and Table 1 teach “[e]ach attribute was coded (processed) as a binary string (generated discretized network input) for use as input to the network” (processing the network input using the discretization layer), and “[e]ach bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k (each of the numeric values in the network input) would be coded as {1, 1, 1, 1, 1, 1} (respective discretized vector) and a value of 100k (each of the numeric values in the network input) as {0, 1, 1, 1, 1, 1} (respective discretized vector)), wherein the discretization layer is configured to generate a discretized network input comprising a respective discretized vector for each of the input one or more intensity values for each of the pixels in the image before the image is processed by additional neural network layers, wherein each discretized vector has a respective entry for each of a plurality of partitions of the space of possible numeric values (section 2.2 and Table 1 teach “[e]ach attribute was , and wherein generating the discretized network input comprises, for each of the numeric values: 
identifying the partition to which the numeric value belongs (section 2.2 and Table 1 teach “[f]or example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} (identifying the partition to which the numeric value belongs) and a value of 100k as {0, 1, 1, 1, 1, 1}” (identifying the partition to which the numeric value belongs)); 
setting, in the discretized vector for the numeric value, each entry that is before the entry for the identified partition to a first value (section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network”, and “[e]ach bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} and a value of 100k as {0, 1, 1, 1, 1, 1}” (setting, in the discretized vector for the numeric value, each entry that is before the entry for the identified partition to a first value)), 
setting, in the discretized vector for the numeric value, the entry for the identified partition to a second value (section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network”, and “[e]ach bit of a string was either 0 or 1 depending on which subinterval the , and 
setting, in the discretized vector for the numeric value, each entry that is after the entry for the identified partition to the second value (section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network”, and “[e]ach bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} and a value of 100k as {0, 1, 1, 1, 1, 1}” (setting, in the discretized vector for the numeric value, each entry that is after the entry for the identified partition to the second value)); and 
processing the discretized network input using the plurality of additional neural network layers to generate a network output for the network input (section 2.2 teaches “total of 37 binary inputs (processing the discretized network input)…the tuples were classified into two classes…[t]he training data set consisted of 2000 tuples…[t]he initial network had 38 input units, six hidden units, one output unit, and therefore 234 links” (using the plurality of additional neural network layers to generate a network output for the network input)).
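For illustration only, the encoding steps recited in the limitations above (identifying the partition, setting earlier entries to a first value, and setting the identified and later entries to a second value) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `thermometer_encode`, the cut points, and the sample inputs are hypothetical and are not taken from Lu or from the application, and the first and second values default to 0 and 1.

```python
from bisect import bisect_right


def thermometer_encode(value, cut_points, first_value=0, second_value=1):
    """Encode one numeric value as a thermometer-style vector.

    cut_points is a sorted list of interior boundaries (an assumption for
    this sketch); k cut points define k + 1 partitions ordered low to high.
    """
    # Identify the partition to which the value belongs.
    idx = bisect_right(cut_points, value)
    num_partitions = len(cut_points) + 1
    # Entries before the identified partition get first_value; the entry
    # for the identified partition and every later entry get second_value.
    return [first_value] * idx + [second_value] * (num_partitions - idx)


if __name__ == "__main__":
    cuts = [0.25, 0.5, 0.75]  # hypothetical partition boundaries on [0, 1]
    for v in (0.1, 0.6, 0.9):
        print(v, thermometer_encode(v, cuts))
```

With the hypothetical cut points shown, a value of 0.6 falls in the third of four partitions, so the first two entries take the first value and the remaining two take the second value.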
However Lu does not explicitly teach a method performed by one or more computers for generating a network output for a network input that includes an image, wherein the numeric values are floating point values and include one or more respective intensity values for each pixel in the image, and wherein the discretization layer is configured to generate a discretized network input comprising a respective discretized vector for each of the input one or more intensity values for each of the pixels in the image before the image is processed by additional neural network layers. 
Sunkavalli teaches a method performed by one or more computers for generating a network output for a network input that includes an image (paragraphs 0025-0027, 0058, Fig. 1, and claim 1 teach using “computer[s]” for “generat[ing] designated output” from a trained “neural network” (generating a network output) on “input images” (for a network input that includes an image)),
wherein the numeric values are floating point values and include one or more respective intensity values for each pixel in the image (paragraph 0086 teaches “floating point values are formulaic representations representing real numbers (numeric values are floating point values) indicating color and/or brightness information (include one or more intensity values) for each pixel within input image 410”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Sunkavalli’s teachings of “floating point values…indicating color and/or brightness information for each pixel within input image” being normalized before processing of the next neural network layer into Lu’s teaching of attribute data encoding for neural network inputs in order to improve network calculations thereby optimizing quality of the image output (Sunkavalli, paragraphs 0025-0027, 0058, 0086-0093, Figs. 1 and 4, Table 1, and claim 1).
However, while Sunkavalli teaches that the image’s “floating point values” undergo “normalization…after each layer except for the output layers” for a neural network, Amer teaches processing the network input using the discretization layer, wherein the discretization layer is configured to generate a discretized network input comprising a respective discretized vector for each of the input one or more intensity values for each of the pixels in the image before the image is processed by additional neural network layers (paragraphs 0010, 0028, and 0030-0038 teach a “hierarchical layer[ed]” neural network with a “preprocessing phase” for turning “particular resolution” thermal image “localized section[s]” into “discrete value” variables and “vectorized” for “inputs to the neural network algorithms” (processing the network input using the discretization layer, wherein the discretization layer is configured to generate a discretized network input comprising a respective discretized vector for each of the input one or more intensity values for each of the pixels in the image before the image is processed by additional neural network layers). Further it is taught that the “image processing and analysis stage can employ…support vector machines”.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify attribute data encoding for neural network inputs, as taught by Lu as modified by “floating point values…indicating color and/or brightness information for each pixel within input image” being normalized before processing of the next neural network layer as taught by Sunkavalli, to include turning thermal images into “discrete value” variables and “vectorized” for “inputs to the neural network algorithms” as taught by Amer in order to improve neural network functionality and accuracy through image data preprocessing (Amer, paragraphs 0010, 0028, and 0030-0038).
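As an illustrative sketch only, the combination discussed above (floating point per-pixel intensity values discretized into vectors before any further network layers) might look like the following. The function name `discretize_image`, the assumption that intensities lie in [0, 1], the equal-width partitions, and the nested-list image layout are all hypothetical choices made for this sketch and are not drawn from Lu, Sunkavalli, or Amer.

```python
def discretize_image(image, num_partitions=4):
    """Replace every intensity value in an image with a discretized vector.

    image is assumed to be a list of rows, each row a list of pixels, each
    pixel a tuple of floating point intensities in [0, 1] (one per channel).
    """

    def encode(intensity):
        # Identify the equal-width partition; clamp so 1.0 maps to the top
        # partition and out-of-range inputs do not produce a bad index.
        idx = int(intensity * num_partitions)
        idx = max(0, min(idx, num_partitions - 1))
        # 0 before the identified partition, 1 at and after it.
        return [0] * idx + [1] * (num_partitions - idx)

    return [[[encode(c) for c in pixel] for pixel in row] for row in image]


if __name__ == "__main__":
    # A hypothetical 2x1 image with two channels per pixel.
    tiny = [[(0.0, 0.6)], [(1.0, 0.3)]]
    for row in discretize_image(tiny):
        print(row)
```

The output keeps the image's row and pixel layout, with one vector per intensity value, so it could be passed to the remaining layers in place of the raw floating point values.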
 a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations, and one or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations (see mapping above) and receiving a network input for a neural network comprising a discretization layer followed by a plurality of additional neural network layers (see mapping above), however Zhao teaches a method performed by one or more computers for generating a network output for a network input that includes an image, a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations, and one or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations (paragraphs 0028, 0031, 0051, 0156-0160, and Fig. 6 teach a CRM/memory including “instructions [that] executable by at least one processor” for performing the embodiments of the disclosure concerning obtaining a neural network output from “a first input image” (generating a network output for a network input that includes an image)), and
receiving a network input for a neural network comprising a discretization layer followed by a plurality of additional neural network layers (paragraph 0134 and Fig. 11 teach the neural network hidden layers can include “one or more batch normalization layers” (neural network comprising a discretization layer) and further 
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify attribute data encoding for neural network inputs, as taught by Lu as modified by “floating point values…indicating color and/or brightness information for each pixel within input image” being normalized before processing of the next neural network layer as taught by Sunkavalli, as further modified by turning thermal images into “discrete value” variables and “vectorized” for “inputs to the neural network algorithms” as taught by Amer, to include an image processing neural network including normalization layers executed on a computing system as taught by Zhao in order to maximize processing by structuring the neural network to include normalization layers (Zhao, paragraphs 0028, 0031, 0051, 0134, 0156-0160, and Fig. 11).

Regarding claims 2, 10, and 16, the combination of Lu, Sunkavalli, Amer, and Zhao teaches all the claim limitations of claims 1, 8, and 9 above, and further teaches the first value is zero (Lu, section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network…The thermometer coding scheme was used for the binary representation of the continuous attributes. Each bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} and a value of 100k as {0, 1, 1, 1, 1, 1}” (first value is zero)).

Regarding claims 3, 11, and 17, the combination of Lu, Sunkavalli, Amer, and Zhao teaches all the claim limitations of claims 1, 8, and 9 above, and further teaches the second value is a positive value (Lu, section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network…The thermometer coding scheme was used for the binary representation of the continuous attributes. Each bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} and a value of 100k as {0, 1, 1, 1, 1, 1}” (second value is a positive value)).

Regarding claims 4, 12, and 18, the combination of Lu, Sunkavalli, Amer, and Zhao teaches all the claim limitations of claims 3, 11, and 17 above, and further teaches the second value is one (Lu, section 2.2 and Table 1 teach “[e]ach attribute was coded as a binary string for use as input to the network…The thermometer coding scheme was used for the binary representation of the continuous attributes. Each bit of a string was either 0 or 1 depending on which subinterval the original value was located. For example, a salary of 140k would be coded as {1, 1, 1, 1, 1, 1} and a value of 100k as {0, 1, 1, 1, 1, 1}” (second value is one)).

Regarding claims 5, 13, and 19, the combination of Lu, Sunkavalli, Amer, and Zhao teach all the claim limitations of claims 1, 8, and 9 above, and further teach the entries in the discretized vector are ordered from an entry for a lowest partition of the space to an entry for a highest partition of the space (Lu, section 2.2 and Tables 1-2 teach “[e]ach attribute was coded as a binary string for use as input to the .

Claims 7, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lu et al (“Effective Data Mining Using Neural Networks”, 1996) hereinafter Lu, in view of Sunkavalli et al (US Pub 20180260975) hereinafter Sunkavalli, in view of Amer et al (US Pub 20190094124) hereinafter Amer, in view of Zhao et al (US Pub 20190035118) hereinafter Zhao, and further in view of Kurakin et al ("Adversarial machine learning at scale", February 2017) hereinafter Kurakin.
Regarding claims 7, 15, and 20, the combination of Lu, Sunkavalli, Amer, and Zhao teaches all the claim limitations of claims 1, 8, and 9 above.
While Zhao teaches training a “generative adversarial network[‘s]” parameters through training sets and “gradient descent”/cost/loss functions “between a testing value…of the neural network and a desired value”, Kurakin teaches training the neural network using adversarial training (sections 3 and 4 teach adversarial training for a neural network), comprising: 
obtaining a target network output for the network input (sections 3 and 4.1 teach the example datasets having “labeled examples” (target network output for the network input)); 
generating an adversarial input from the network input (sections 3, 4.1, and Algorithm 1 teach “generat[ing]…adversarial examples…from corresponding clean examples” to be included in the training dataset (generating an adversarial input from the network input)); 
processing the adversarial input using the neural network to generate a network output for the adversarial input (sections 3, 4 intro, and 4.1 teach using (processing) the “adversarial examples (adversarial input)” in the neural network (using the neural network) for predicting results to measure accuracy (generate a network output for the adversarial input)); 
determining a gradient with respect to the parameters of the neural network of an objective function that depends on (i) an error between the target network output and the network output for the network input and (ii) an error between the target network output and the network output for the adversarial input (sections 2-4, and “Appendices” A teach training the neural network, known to include network parameters, on “clean examples”, training the neural network on “adversarial examples” and then measuring the “error” and/or “accuracy” of each (depends on (i) an error between the target network output and the network output for the network input and (ii) an error between the target network output and the network output for the adversarial input) through loss functions and “gradient” values); and
adjusting current values of the parameters using the gradient (sections 2-4, and “Appendices” A teach training the neural network, known to include network parameters, through loss functions and “gradient” values based on the training results of .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify attribute data encoding for neural network inputs, as taught by Lu as modified by “floating point values…indicating color and/or brightness information for each pixel within input image” being normalized before processing of the next neural network layer as taught by Sunkavalli, as modified by turning thermal images into “discrete value” variables and “vectorized” for “inputs to the neural network algorithms” as taught by Amer, as further modified by an image processing neural network including normalization layers executed on a computing system as taught by Zhao, to include adversarial training for a neural network as taught by Kurakin in order to improve prediction accuracy of a neural network by implementing adversarial examples in the training dataset (Kurakin, sections 2-4, and “Appendices” A).
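For illustration only, the adversarial training steps mapped above (generating an adversarial input, computing an objective that depends on both the clean and the adversarial error, and adjusting parameters using the gradient) can be sketched with a single logistic unit. This is a minimal sketch under stated assumptions: the one-step sign-of-gradient perturbation, the weighting factor `lam`, the learning rate, and all function names are hypothetical choices for this example, and the sketch is not presented as the procedure of Kurakin or of the application.

```python
import math


def predict(w, b, x):
    """Output of a single logistic unit for input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """Generate an adversarial input by stepping along the sign of dLoss/dx.

    For cross-entropy loss on a logistic unit, dLoss/dx_i = (p - y) * w_i.
    """
    err = predict(w, b, x) - y
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]


def adversarial_step(w, b, x, y, eps=0.1, lam=0.5, lr=0.1):
    """One gradient step on loss(clean) + lam * loss(adversarial)."""
    x_adv = fgsm(w, b, x, y, eps)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    # The adversarial input is treated as a constant when differentiating.
    for weight, inp in ((1.0, x), (lam, x_adv)):
        err = predict(w, b, inp) - y  # dLoss/dz for cross-entropy
        grad_w = [g + weight * err * xi for g, xi in zip(grad_w, inp)]
        grad_b += weight * err
    # Adjust current parameter values using the combined gradient.
    new_w = [wi - lr * g for wi, g in zip(w, grad_w)]
    return new_w, b - lr * grad_b


if __name__ == "__main__":
    w, b = [0.5, -0.5], 0.0          # hypothetical starting parameters
    x, y = [1.0, 2.0], 1.0           # hypothetical labeled example
    for _ in range(5):
        w, b = adversarial_step(w, b, x, y)
    print(w, b, predict(w, b, x))
```

The combined gradient is the sum of the clean-example gradient and the weighted adversarial-example gradient, so a single update reflects both error terms.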

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Wierstra et al (US Patent 10432953) teaches “system discretizes the latent variable” and “processes the discrete values of the latent variables using the generative neural network to generate intermediate outputs”.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/C.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123