DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 05/28/2020, 10/28/2020 and 12/23/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings filed on are accepted by the Examiner.
Specification
The disclosure is objected to because of the following informalities:
The disclosure is objected to because it contains an embedded hyperlink and/or other form of browser-executable code (see page 18 line 22 and page 38 line 19). Applicant is required to delete the embedded hyperlink and/or other form of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.
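The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.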
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Regarding claims 1-13, claims 1-13 are rejected as not enabled, on the ground of undue breadth. Each claim recites a single structure, “a processor”, and thereby attempts to cover every conceivable structure for achieving the stated purpose. When claims depend on a recited property, a fact situation comparable to Hyatt is possible, where the claim covers every conceivable structure for achieving the stated property while the specification discloses at most only those known to the inventor.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Regarding claims 1-13, claims 1-13 are rejected as failing to define the invention in the manner required by 35 U.S.C. 112, second paragraph. The claims are narrative in form and replete with indefinite and functional or operational language. The structure that makes up the device must be clearly and positively specified. The structure must be organized and correlated in such a manner as to present a complete operative device. Each claim must be in one-sentence form only.
Regarding claims 1-13, claims 1-13 are apparatus claims. MPEP § 2114 states that “[A]pparatus claims cover what a device is, not what a device does.” Hewlett-Packard Co. v. Bausch & Lomb Inc., 909 F.2d 1464, 1469, 15 USPQ2d 1525, 1528 (Fed. Cir. 1990) (emphasis in original). It also states that “[w]hile features of an apparatus may be recited either structurally or functionally, claims directed to an apparatus must be distinguished from the prior art in terms of structure rather than function.” In re Schreiber, 128 F.3d 1473, 1477-78, 44 USPQ2d 1429, 1431-32 (Fed. Cir. 1997).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8 and 10-15 are rejected under 35 U.S.C. 103 as being unpatentable over DeVries (“Improved regularization of convolutional neural networks with cutout.” arXiv preprint arXiv: 1708.04552, 2017, see Applicant admitted Prior Art (AAPA) page 1 lines 17-24) in view of Vasudevan (US 20190354895 A1).
Regarding claims 1, 14 and 15, DeVries discloses input, to a machine learning model configured to perform recognition, input data (abstract, section 1 specifically discloses “To evaluate this technique we conduct tests on several popular image recognition datasets, achieving state-of-the-art results on CIFAR-10, CIFAR-100, and SVHN. We also achieve competitive results on STL-10, demonstrating the usefulness of cutout for low data and higher resolution problems.”); identify a feature portion of the input data to serve as a basis for recognition by the machine learning model in which the input data is used as input (abstract, section 3 specifically discloses “One of the major differences between cutout and other dropout variants is that units are dropped at the input stage of the network rather than in the intermediate layers. This approach has the effect that visual features, including objects that are removed from the input image, are correspondingly removed from all subsequent feature maps. Other dropout variants generally consider each feature map individually, and as a result, features that are randomly removed from one feature map may still be present in others. These inconsistencies produce a noisy representation of the input image, thereby forcing the network to become more robust to noisy inputs.” [...] “[...] the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”); and perform data augmentation based on the processed data (abstract, section 3 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance”). DeVries does not disclose data augmentation based on the processed data. Vasudevan discloses data augmentation based on the processed data (abstract, figure 1 block 108, figure 7 blocks 704-408; Vasudevan specifically discloses “While the training system 100 determines the final data augmentation policy 108 with respect to a particular set of training data 106, the final data augmentation policy 108 may (in some cases) be transferrable to other sets of training data. That is, the final data augmentation policy 108 determined with respect to the training data 106 may be used to effectively train other machine learning model on different sets of training data”).


DeVries and Vasudevan are analogous art because they are from the same field of neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate in the technique disclosed by DeVries the data augmentation based on processed data disclosed by Vasudevan. The suggestion/motivation for doing so would have been to learn a data augmentation policy for training a machine learning model (Vasudevan abstract). See also KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398 (2007). In the KSR case, the Court stated that in certain circumstances what is obvious to try is also obvious, such as where "there is a design need or market pressure to solve a problem and there are a finite number of identified, predictable solutions, a person of ordinary skill has good reason to pursue the known options within his or her technical grasp. If this leads to the anticipated success, it is likely the product not of innovation but of ordinary skill and common sense." Regarding hindsight, the Court found that "[r]igid preventive rules that deny fact finders recourse to common sense . . . are neither necessary under our case law nor consistent with it." The Court stated that "familiar items may have obvious uses beyond their primary purposes," analogizing an obvious invention to the fitting together of pieces of a puzzle, and noted that the person of ordinary skill is also a person of ordinary creativity, and not "an automaton."
Regarding claim 2, DeVries and Vasudevan disclose claim 1, DeVries also discloses select a part of the feature portion as a portion to be processed and acquire the processed data by processing the selected portion to be processed (abstract section 3 specifically discloses “One of the major differences between cutout and other dropout variants is that units are dropped at the input stage of the network rather than in the intermediate layers. This approach has the effect that visual features, including objects that are removed from the input image, are correspondingly removed from all subsequent feature maps. Other dropout variants generally consider each feature map individually, and as a result, features that are randomly removed from one feature map may still be present in others. These inconsistencies produce a noisy representation of the input image, thereby forcing the network to become more robust to noisy inputs.”)
Regarding claim 3, DeVries and Vasudevan disclose claim 2, DeVries also discloses to select the portion to be processed based on a score calculated for each area in the feature portion (abstract section 2.2 specifically discloses “Wu and Gu propose probabilistic weighted pooling [20], wherein activations in each pooling region are dropped with some probability. This approach is similar to applying dropout before each pooling layer, except that instead of scaling the output with respect to the dropout probability at test time, the output of each pooling function is selected to be the sum of the activations weighted by the dropout probability.”)
Regarding claim 4, DeVries and Vasudevan disclose claim 2, DeVries also discloses select a plurality of portions to be processed that are different from each other [...] (abstract specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”)
Regarding claim 5, DeVries and Vasudevan disclose claim 4, DeVries also discloses randomly select from the feature portion the plurality of portions to be processed (abstract section 2.2 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”)
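For illustration only, the random masking quoted above can be sketched as follows. This is a minimal sketch and not the implementation of DeVries; the function name `cutout`, the `size` and `fill` parameters, the list-of-rows image representation, and the clipping of the square at the image border are all assumptions made for the example.

```python
import random

def cutout(image, size, rng, fill=0):
    """Return a copy of `image` (a list of rows) with one square region masked.

    Assumption of this sketch: the centre of the square is drawn uniformly
    over the image, so the square may be partly clipped at the border.
    """
    height, width = len(image), len(image[0])
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, left = max(0, cy - size // 2), max(0, cx - size // 2)
    bottom, right = min(height, cy + size // 2), min(width, cx + size // 2)
    masked = [row[:] for row in image]  # copy so the input stays unchanged
    for y in range(top, bottom):
        for x in range(left, right):
            masked[y][x] = fill
    return masked

rng = random.Random(0)                # seeded only to make the example repeatable
image = [[1] * 8 for _ in range(8)]   # hypothetical 8 x 8 single-channel image
masked = cutout(image, size=4, rng=rng)
```

The position of the masked square depends only on the random draw and not on the image content, which is the sense in which the quoted passage describes the selection as random.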
Regarding claim 6, DeVries and Vasudevan disclose claim 1, DeVries also discloses identify a plurality of feature portions, acquire a plurality of pieces of processed data based on the plurality of feature portions, and perform the data augmentation (abstract section 2.2 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”). Vasudevan also discloses perform the data augmentation based on the plurality of pieces of processed data (abstract figure 1 block 108 figure 7 blocks 704-408 Vasudevan specifically discloses “While the training system 100 determines the final data augmentation policy 108 with respect to a particular set of training data 106, the final data augmentation policy 108 may (in some cases) be transferrable to other sets of training data. That is, the final data augmentation policy 108 determined with respect to the training data 106 may be used to effectively train other machine learning model on different sets of training data”)
Regarding claim 7, DeVries and Vasudevan disclose claim 1, DeVries also discloses that the input data is an input image to be input to the machine learning model, and identify a feature portion of the input image, acquire a processed image by processing at least a part of the feature portion (abstract section 2.2 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”). Vasudevan also discloses perform the data augmentation based on the processed image (abstract figure 1 block 108 figure 7 blocks 704-408 Vasudevan specifically discloses “While the training system 100 determines the final data augmentation policy 108 with respect to a particular set of training data 106, the final data augmentation policy 108 may (in some cases) be transferrable to other sets of training data. That is, the final data augmentation policy 108 determined with respect to the training data 106 may be used to effectively train other machine learning model on different sets of training data”)
Regarding claim 8, DeVries and Vasudevan disclose claim 7, DeVries also discloses acquire the processed image by performing mask processing on at least a part of the feature portion and perform the data augmentation based on the image on which the mask has been performed (abstract section 2.2 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”) Vasudevan also discloses perform the data augmentation based on the processed image (abstract figure 1 block 108 figure 7 blocks 704-408 Vasudevan specifically discloses “While the training system 100 determines the final data augmentation policy 108 with respect to a particular set of training data 106, the final data augmentation policy 108 may (in some cases) be transferrable to other sets of training data. That is, the final data augmentation policy 108 determined with respect to the training data 106 may be used to effectively train other machine learning model on different sets of training data”)
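Regarding claim 10, [...] (section 2.2 specifically discloses “Wu and Gu propose probabilistic weighted pooling [20], wherein activations in each pooling region are dropped with some probability. This approach is similar to applying dropout before each pooling layer, except that instead of scaling the output with respect to the dropout probability at test time, the output of each pooling function is selected to be the sum of the activations weighted by the dropout probability.”)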
Regarding claim 11, DeVries and Vasudevan disclose claim 10, DeVries also discloses at least one or more convolutional layers and identify the feature portion further based on a feature map output from the at least one or more convolutional layers (abstract section 3 specifically discloses “Other dropout variants generally consider each feature map individually, and as a result, features that are randomly removed from one feature map may still be present in others.”)
Regarding claim 12, DeVries and Vasudevan disclose claim 1, DeVries also discloses to output a result of the recognition and an activation map for the result of the recognition, and to identify the feature portion based on the activation map (section 4.4 figure 4 and 5 specifically discloses “In Figure 4, we sort the activations within each layer by ascending magnitude, averaged over all samples in the test set. We observe that the shallow layers of the network experience a general increase in activation strength, while in deeper layers, we see more activations in the tail end of the distribution. The latter observation illustrates that cutout is indeed encouraging the network to take into account a wider variety of features when making predictions, rather 
Regarding claim 13, DeVries and Vasudevan disclose claim 1, DeVries also discloses a teacher data set including a plurality of pieces of teacher data learned therein, the input data is included in the teacher data set, and perform the data augmentation by adding teacher data including the processed data to the teacher data set (abstract specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance” see also Applicant admitted Prior Art (AAPA) specifically discloses “There have been known machine learning models using supervised machine learning. For example, in the literature "T. Devries and G. W. Taylor, “Improved regularization of convolutional neural networks with cutout” arXiv preprint arXiv: 1708.04552, 2017.5", there is described a technology in which a new teacher image is acquired by performing mask processing on a portion randomly selected from a teacher image, to thereby implement data augmentation”)
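For illustration only, the augmentation described in the AAPA passage quoted above (acquiring a new teacher image by masking a randomly selected portion of a teacher image and adding it to the teacher data set) can be sketched as follows. This is a sketch under stated assumptions and not the method of the application or of DeVries; the names `mask_square` and `augment_teacher_set`, the (image, label) pair representation, and the choice of exactly one masked copy per teacher image are hypothetical.

```python
import random

def mask_square(image, top, left, size, fill=0):
    """Return a copy of `image` (a list of rows) with a size x size square masked."""
    masked = [row[:] for row in image]
    for y in range(top, top + size):
        for x in range(left, left + size):
            masked[y][x] = fill
    return masked

def augment_teacher_set(teacher_set, size, rng):
    """Return the teacher set plus one masked copy of each (image, label) pair.

    Assumptions of this sketch: `size` does not exceed either image dimension,
    and the masked copy keeps the label of the image it was derived from.
    """
    augmented = list(teacher_set)
    for image, label in teacher_set:
        height, width = len(image), len(image[0])
        top = rng.randrange(height - size + 1)   # square lies fully inside the image
        left = rng.randrange(width - size + 1)
        augmented.append((mask_square(image, top, left, size), label))
    return augmented

teacher_set = [([[1] * 6 for _ in range(6)], "cat"),
               ([[1] * 6 for _ in range(6)], "dog")]
augmented = augment_teacher_set(teacher_set, size=2, rng=random.Random(0))
```

In this sketch the original teacher data is retained and the masked copies are appended, so the teacher data set grows from two entries to four.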
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over DeVries and Vasudevan as applied to claim 7 above, and further in view of Pathak (“Context 
Regarding claim 9, DeVries and Vasudevan disclose claim 7, DeVries also discloses acquire the processed image, processing on at least a part of the feature portion and perform the data augmentation (abstract section 2.2 specifically discloses “In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.”). Vasudevan also discloses perform the data augmentation based on the processed image (abstract figure 1 block 108 figure 7 blocks 704-408 Vasudevan specifically discloses “While the training system 100 determines the final data augmentation policy 108 with respect to a particular set of training data 106, the final data augmentation policy 108 may (in some cases) be transferrable to other sets of training data. That is, the final data augmentation policy 108 determined with respect to the training data 106 may be used to effectively train other machine learning model on different sets of training data”). DeVries and Vasudevan do not specifically disclose inpainting. Pathak discloses inpainting (title, abstract section 1 Pathak specifically discloses “Indeed, to the best of our knowledge, ours is the first parametric inpainting algorithm that is able to give reasonable results for semantic hole-filling (i.e. large missing regions). The context encoder can also be useful as a better visual feature for computing nearest neighbors in nonparametric inpainting 
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Selvaraju, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization”, arXiv 2017.
Zhou, “Learning Deep Features for Discriminative Localization”, arXiv 2015.
Latapie (US 20160358074 A1) discloses methods and systems for counting people.
Zhang (US 20180107928 A1) discloses diagnostic systems and methods for deep learning models configured for semiconductor applications.
Sung (US 20200065992 A1) discloses method and apparatus for recognizing image and method and apparatus for training recognition model based on data augmentation.
Choo (US 20200134469 A1) discloses method and apparatus for determining a base model for transfer learning.
Ma (US 20200160040 A1) discloses three-dimensional living-body face detection method, face authentication recognition method, and apparatuses.
Lin (US 20200167644 A1) discloses model building device and loading disaggregation system.
Zhao (US 20200174840 A1) discloses dynamic composition of data pipeline in accelerator-as-a-service computing environment.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUAN A TORRES whose telephone number is (571) 272-3119. The examiner can normally be reached M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth N Vanderpuye can be reached on (571) 272-3078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and 





/JUAN A TORRES/Primary Examiner, Art Unit 2636