DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/13/2021 has been entered.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1- 2, 4- 11, and 13- 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation "…the one or more bounding objects" in line 14.  There is insufficient antecedent basis for this limitation in the claim.
Claims 8 and 15 have similar issues and are rejected using the same rationale.
Claim Rejections - 35 USC § 102

A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 8 and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Hasan et al (US Patent 10755228), “Hasan”.
As per claims 1, 8 and 15, as best understood and in light of the rejections, Hasan teaches a computer readable medium with instructions see for example fig. 1;
receiving, by one or more processors, a first image under visual inspection; classifying, by one or more processors, the first image being a not-good image using a pre-trained classifier, the not-good image having at least a defect object (i.e., product data 210 of database 114 may comprise one or more data structures for identifying, classifying, and storing data associated with products, including, for example, a product identifier (such as a Stock Keeping Unit (SKU), Universal Product Code (UPC) or the like), product attributes and attribute values, sourcing information, and the like.  Product data 210 may comprise data about one or more products organized and sortable by, for example, product attributes, attribute values, product identification, sales quantity, demand forecast, or any stored category or dimension.  Attributes of one or more products may be, for example, any categorical characteristic or quality of a product, and an attribute value may be a specific value or identity for the one or more products according to the categorical characteristic or quality, including, for example, physical parameters (such as, for example, size, weight, dimensions, color, and the inconsistencies (corresponding to defects) between product images including: the images are different sizes, the products are not in the same relative positions within the images, the background colors are different, and the like.  By preprocessing product images, image processing module 200 may attenuate these problems and align the data so that the input is cleaned before being processed by the color-coding model.  Image preprocessing may comprise one or more actions, including, bounding box annotation 502a-502c, image size unification 504a-504c, and circle masking 506a-506c) see for example fig. 5 and column 13 lines 30- 50. 
Figure 4 (reproduced below) and column 16 lines 55- 60 teach inputting, by one or more processors, the annotated image to train the detector (i.e., at action 414, image processing system 110 trains color-coding model with one or more training product images).

    [Figure 4 of Hasan: media_image1.png, 468 x 554, greyscale]


Claims 1- 2, 4, 6, 8- 11, and 15- 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by He et al (US PAP 2019/ 0073568), “He” (IDS).
As per claims 1, 8 and 15, as best understood and in light of the rejections, He teaches a computer readable medium with instructions see for example claim 25;
receiving, by one or more processors, a first image under visual inspection see for example fig. 1 and [0057]; 
classifying, by one or more processors, the first image being a not-good image using a pre-trained classifier, the not-good image having at least a defect object (i.e., classification of images (in plural) and defects thereof) see for example [0061- 62, 80, 81- 82, 86- 87, 89, 92, 98- 100, 102, 104, 108, 110 and 112] and fig. 2; 
in response to classifying the first image being the not-good image, detecting, by one or more processors, the one or more defect objects in the not-good image (i.e., the image(s) are training defect images rather than the images in which defects are being detected after the neural network has been trained, and the image level labels may be assigned to each training defect image and may include labels such as defect ID 1, defect ID 2, ... defect ID n, pattern defect, bridging defect, etc., and using a training set of defect images that includes defect images and non-defect images may produce a neural network that is better capable of differentiating between defect images and non-defect images when the neural network is used for defect detection) see for example [0080 and 82] and fig. 4; [0086- 87] discloses “pre-trained weights are obtained by training an image classification network”, and [0098] discloses “neural network may be configured as a detection and classification CNN.  In the embodiment shown in FIG. 6, the second portion includes a proposal network, which is configured for detecting defects on the specimen based on the features determined for the images and generating bounding boxes for each of the detected defects”;
masking, by one or more processors, the one or more defect objects in the not-good image with one or more bounding boxes indicating areas of the one or more defect objects in the image (i.e., "bounding box" is generally defined herein as a box (square or rectangular) drawn around one or more contiguous portions of pixels identified as defective.  For example, a bounding box may include one or two regions of pixels identified as defective) see for example [0061];
annotating, by one or more processors, the one or more defect objects (i.e., bounding boxes of the detected defects, detection scores, information about the defect classifications such as class labels or IDs, etc., or any such suitable information known in the art) see for example [0062];
inputting, by one or more processors, the masked image to train a detector, wherein the masked image includes the one or more bounding boxes generated for the one or more bounding objects and saved for the masked image (i.e., image level labels 200 and/or bounding box level labels 202 may only be input to the neural network during a training phase) see for example [0078]; [0095] discloses “to train the neural network by inputting class labels assigned by a user to bounding boxes in training defect images and the training defect images to the neural network.  For example, a user may assign class per bounding box.  Bounding box level labels and the defect images may then be used as input to train the neural network”; [100 and 102] also disclose similar limitation; and 
inputting, by one or more processors, the annotated image to train the detector (i.e., image level labels 200 and/or bounding box level labels 202 may only be input to the neural network during a training phase) see for example [0078]; claim 15 and [0095, 100 and 102] also disclose similar limitation.
Note: paragraph [0012] of the specification discloses “…Object detection methods may be based on fully-supervised learning methods, which usually train detection networks on large scale benchmarks with instance-level annotations (e.g., bounding boxes)”. Accordingly, annotation is interpreted as synonymous with bounding box, and the two terms can be used interchangeably.
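For illustration only, and not as part of either reference, the bounding box definition quoted from He [0061] (a rectangle drawn around pixels identified as defective) can be sketched as follows. The function names `bounding_box` and `box_to_mask`, the inclusive (top, left, bottom, right) coordinate convention, and the 0/1 pixel encoding are assumptions made for this sketch and do not appear in the cited references or in the claims.

```python
from typing import List, Optional, Tuple

# Hypothetical convention: (top, left, bottom, right), all inclusive.
Box = Tuple[int, int, int, int]


def bounding_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the smallest rectangle enclosing every pixel flagged 1.

    Returns None when no pixel is flagged, i.e. a "good" image.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def box_to_mask(box: Box, height: int, width: int) -> List[List[int]]:
    """Build a mask image in which pixels inside the box are 1, others 0."""
    top, left, bottom, right = box
    return [
        [1 if top <= r <= bottom and left <= c <= right else 0
         for c in range(width)]
        for r in range(height)
    ]


# Usage: an L-shaped defect region in a 4 x 4 image.
defect_pixels = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
box = bounding_box(defect_pixels)
masked = box_to_mask(box, 4, 4)
```

Under these assumptions the box is an annotation derived from the defect pixels, and the rectangular mask covers slightly more area than the defect itself.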
As per claims 2, 11 and 18, He teaches a defect object that has no bounding box indicating an area of the defect object (i.e., image(s) 204, which include defect image(s) 
As per claim 4, He teaches receiving, by one or more processors, a second image under visual inspection; classifying, by one or more processors, the second image being a good image using a pre-trained classifier, the good image meaning no defect object being in the second image; and in response to classifying the second image being the good image, inputting, by one or more processors, the good image to train the detector (i.e., training set of defect images may also preferably include one or more of the training defect images described above possibly in combination with one or more non-defect/ good images, wherein the training set of defect images may include images of the specimen in which no defect is or was detected.  Using a training set of defect images that includes defect images and non-defect images may produce a neural network that is better capable of differentiating between defect images and non-defect images when the neural network is used for defect detection) see for example [0082].
As per claims 6 and 19, He teaches generating bounding boxes indicating areas of the one or more defect objects in the image (i.e., bounding boxes of the detected defects) see for example [0061- 62, 91- 92, 95, 98- 100 and 102].
As per claims 9 and 16, He teaches a defect object that is pre-marked with a bounding box indicating an area of the defect object (i.e., train the neural network by inputting class labels assigned by a user to bounding boxes in training defect images and the training defect images to the neural network.  For example, a user may assign class per bounding box.  Bounding box level labels and the defect images may then be used as input to train the neural network) see for example [0095].
As per claims 10 and 17, He teaches to restore the pre-marked bounding box for the defect object (i.e., generate bounding boxes having different dimensions or sizes (which depend on the size of the defects detected in the images).  The ROI pooling layer(s) are able to accept input images of different sizes.  Therefore, the ROI pooling layer(s) can accept the bounding box images generated by the proposal network and can adjust the sizes of the bounding box images to create fixed length inputs for fully connected layer(s), wherein a convolution layer can accept input of different sizes while a fully connected layer cannot) see for example [0099].
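For illustration only, the fixed length behavior attributed to the ROI pooling layer(s) in He [0099] can be sketched as a max pooling over a fixed grid of bins. This is a generic sketch of region-of-interest max pooling and is not asserted to be He's implementation; the function name `roi_max_pool`, the 2 x 2 output grid, and the inclusive box convention are assumptions.

```python
from typing import List, Tuple

# Hypothetical convention: (top, left, bottom, right), all inclusive.
Box = Tuple[int, int, int, int]


def roi_max_pool(feature: List[List[float]], box: Box,
                 out_h: int = 2, out_w: int = 2) -> List[float]:
    """Pool the region inside `box` down to out_h * out_w values.

    The region is split into an out_h x out_w grid of bins and the
    maximum of each bin is kept, so boxes of any size yield a vector
    of the same length.
    """
    top, left, bottom, right = box
    h = bottom - top + 1
    w = right - left + 1
    pooled = []
    for i in range(out_h):
        # Bin edges: floor for the start, ceiling for the end, so that
        # every bin holds at least one row even for small regions.
        r0 = top + (i * h) // out_h
        r1 = top + ((i + 1) * h + out_h - 1) // out_h
        for j in range(out_w):
            c0 = left + (j * w) // out_w
            c1 = left + ((j + 1) * w + out_w - 1) // out_w
            pooled.append(max(feature[r][c]
                              for r in range(r0, r1)
                              for c in range(c0, c1)))
    return pooled


# Usage: two boxes of different sizes over the same 4 x 4 feature map.
feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
large = roi_max_pool(feature_map, (0, 0, 3, 3))
small = roi_max_pool(feature_map, (1, 1, 3, 2))
```

Both calls return four values, which is the property that lets a fully connected layer consume boxes of differing dimensions.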

Claims 5, 7, 13- 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over He et al (US PAP 2019/ 0073568), “He” (IDS) in view of Karlinsky et al (US PAP 2017/ 0177977), “Karlinsky”.
As per claims 5 and 13, He does not explicitly teach using one or more patches to cover the one or more defect objects in the image.
However, Karlinsky teaches using one or more patches to cover the one or more defect objects in the image (i.e., defect mask) see for example [0064, 70, 86, 98, 113 and 117].
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art at the time the invention was made to incorporate the teachings of Karlinsky into He to detect and/or classify defects in a specimen during its fabrication, wherein examination is provided by using non-destructive examination tools during or after manufacture of the specimen to be examined, wherein the examination process can include runtime scanning (in a single or in multiple scans), sampling, reviewing, measuring, classifying and/or other operations provided with regard to the specimen or parts thereof using the same or different inspection tools, wherein examination can be provided prior to manufacture of the specimen to be examined and can include, for example, generating an examination recipe(s) and/or other setup operations, and therefore increase the effectiveness of examination by automatization of process(es), thus enabling reconstructing the at least one FP (fabrication process) image in correspondence with different examination modality, etc. as a state-of-the-art technology; see for example [0005, 7 and 9].
As per claims 7, 14, 20 and in light of the rejections, Karlinsky teaches receiving, by one or more processors, a plurality of images under the visual inspection; classifying, by one or more processors, each of the plurality of images being not-good image using a pre-trained classifier, the not-good image meaning one or more defect objects being in the image; in response to classifying one or more of the plurality of images being the not-good images, detecting, by one or more processors, the one or more defect objects in each of the not- good images; masking, by one or more processors, the one or more defect objects in each of the not- good images; and inputting, by one or more processors, each masked image to train the detector (i.e., for defect detection: defect image (e.g. optical or SEM), reference image or images (optional), CAD (optional); defect bounding box coordinate, defect mask image (all defect pixels are "1", others "0")) see for example table 1; [0064] discloses “during the fabrication process, derivatives of the capture images obtained by various pre-processing stages (e.g. images of a part of a defect to be classified by ADC, SEM images of larger regions in which the defect is to be localized by ADR, registered images of different examination modalities corresponding to the same mask location, segmented images, height map images, etc.) and computer-generated design data-based images”; [0070] discloses “defect detection and/or classification for a "virtual layer" constituted by one or more metal layers can use attributes generated by DNN specially trained for this virtual layer.  Likewise, another specially trained DNN can be used for defect detection and/or classification in a "virtual layer" constituted by one or more mask layers”; [0086- 92] further disclose “training process can be cyclic, and can be repeated several times until the DNN is sufficiently trained.
The process can start from an initially generated training set, while a user provides a feedback for the results reached by the DNN based on the initial training set.  The provided feedback can include, for example: manual re-classification of one or more pixels, regions and/or defects; prioritization of classes; changes of sensitivity, updates of ground-truth segmentation and/or manually defining regions of interest (ROIs) for segmentation applications; re-defining mask/bounding box for defect detection applications; re-selecting failed cases and/or manually registering failures for registration applications; re-selecting features of interest for regression applications, etc.”.
Response to Arguments
Applicant's arguments filed 12/01/21 have been fully considered but they are not persuasive.

1) Applicant argues “Thus, HE teaches "to train the neural network by inputting class labels assigned by a user to training defect images and the training defect images to the neural network". However, subsequent to this initial summarization, HE fails to teach or suggest "annotating ... the one or more defect objects", (emphasis added), as is required by claim 1”. Remarks at 8.
In response, applicant should submit an argument pointing out disagreements with the examiner’s contentions.  Applicant must also discuss the reference(s) applied against the claims, explaining how the claims avoid the references or distinguish from them. Applicant should further address the secondary prior art of record and offer an explicit rationale as to why the combinations and corresponding motivations to combine are not sufficient to render the claims prima facie obvious. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
The He reference clearly teaches annotating, by one or more processors, the one or more defect objects (i.e., bounding boxes of the detected defects, detection scores, information about the defect classifications such as class labels or IDs, etc., or any such suitable information known in the art) see for example [0062]. The He reference is replete with the alleged missing limitation as follows: 
[0092] Like the neural network described above, another embodiment configured for defect detection also includes the convolution layer(s) described above.  In this manner, this embodiment may be configured as a detection CNN.  One embodiment of such a neural network [...] bounding boxes for each of the detected defects.  In this manner, the detection network shown in FIG. 5 includes two parts: convolution layer(s) 206, which are included in the first portion described herein, and proposal network 214, which is included in the second portion described herein.  The convolution layer(s) are configured to determine features and generate feature map 208 for image(s) 204, which may include any of the image(s) and inputs described herein.  The proposal network is configured for defect detection.  In other words, the proposal network uses feature(s) from feature map 208 to detect the defects in the image(s) based on the determined features.  The proposal network is configured to generate bounding box detection results (e.g., bounding box 216).  In this manner, the detection CNN (including the convolution layers(s) and the proposal network) outputs bounding box 216, which may include a bounding box associated with each detected defect or more than one detected defect.  The network may output bounding box locations with detection scores on each bounding box.  The detection score for a bounding box may be a number (such as 0.5).  The results of the defect detection can also be stored and used as described further herein.

[0095] In some embodiments, the one or more computer subsystems are configured to train the neural network by inputting class labels assigned by a user to bounding boxes in training defect images and the training defect images to the neural network.  For example, a user may assign class per bounding box.  Bounding box level labels and the defect images may then be used as input to train the neural network.  The training may be performed as described further herein.  The class labels may include any of the class labels described herein. The training defect images may include any of the training defect images described herein.  In addition, the class labels and the training defect images may be acquired as described further herein.

[0098] Another embodiment shown in FIG. 6 is configured for both detection and classification.  Like the embodiments described further herein, this embodiment includes convolution layer(s) 206, which may be configured as described further herein.  Therefore, this neural network may be configured as a detection and classification CNN.  In the embodiment shown in FIG. 6, the second portion includes a proposal network, which is configured for detecting defects on the specimen based on the features determined for the images and generating bounding boxes for each of the detected defects, and region of interest (ROI) pooling layer(s) and fully connected layer(s) configured for classifying the detected defects based on the features determined for the defects.  In this manner, the detection and classification network includes three parts, convolution layer(s) 206, which are included in the first portion described herein, proposal network 214, which is included in the second portion described herein, and ROI pooling layer(s) 218 plus fully connected layer(s) 220, which are also included in the second portion described herein.  The convolution layer(s) are configured to determine features and generate feature map 208 for image(s) 204, which may include any of the image(s) and inputs described herein.  The proposal network is configured for defect detection as described above, and the results generated by the proposal network may include any of the defect detection results described herein.

[0099] In one such embodiment, the second portion includes one or more ROI pooling layers followed by one or more fully connected layers, the one or more ROI pooling layers are configured for generating fixed length representations of the generated bounding boxes, the fixed length representations are input to the one or more fully connected layers, and the one or more fully connected layers are configured for selecting one or more of the determined features and classifying the detected defects based on the one or more selected features. In this manner, the proposal network is used to generate bounding box detection.  The bounding boxes are sent to ROI pooling layer(s) 218, which construct fixed length fully connected layer inputs.  For example, the proposal network may advantageously generate bounding boxes having different dimensions or sizes (which depend on the size of the defects detected in the images).  The ROI pooling layer(s) are able to accept input images of different sizes.  

[0100] As shown in FIG. 6, bounding box level labels 202 and defect images 204 may be used as the input to train the detection and classification CNN.  Reference images and/or design can be inserted as the second (and possibly third) channel of the input, but the reference images and/or design are not required.  When this neural network is being used for production or runtime (after the neural network is trained), then bounding box level labels 202 would not be input to the neural network.  Instead, the only input would be image(s) 204.

[0102] In summary, therefore, the embodiments described herein provide a unified deep learning framework for defect classification and inspection (detection).  The classification network portion of the unified deep learning framework can be trained using class per image label and defect images and will output class per image with classification confidence.  The detection network portion of the unified deep learning framework can be trained using class per bounding box and defect images and will output bounding box location and detection score per bounding box.  The detection and classification network portion of the unified deep learning framework can be trained using class per bounding box and defect images and will output bounding box location and class per bounding box with classification confidence. 
 
 [0105] The unified deep learning classification and detection framework embodiments described herein have therefore multiple advantages over previously used approaches.  For example, the embodiments described herein have a number of ease of use and cost advantages.  In one such example, the embodiments described herein significantly reduce the burden on the user to annotate defects at the pixel level for the detection network to learn.  In other words, the embodiments described herein enable bounding box labeling for detection, which significantly reduces user annotation burden.

2) Applicant makes similar allegations presented supra and argues “Thus, HE teaches "pre-trained weights are obtained by training an image classification network". However, subsequent to this initial summarization, HE fails to teach or suggest "inputting ... the annotated image to train the detector", (emphasis added), as is required by claim 1”. Remarks at 9.
In response the He reference clearly teaches inputting, by one or more processors, the annotated image to train the detector (i.e., image level labels 200 and/or bounding box level labels 202 may only be input to the neural network during a training phase) see for example [0078].
The He reference is replete with the alleged missing limitation as follows:
[0100] As shown in FIG. 6, bounding box level labels 202 and defect images 204 may be used as the input to train the detection and classification CNN.  Reference images and/or design can be inserted as the second (and possibly third) channel of the input, but the reference images and/or design are not required.  When this neural network is being used for production or runtime (after the neural network is trained), then bounding box level labels 202 would not be input to the neural network.  Instead, the only input would be image(s) 204.

[0102] In summary, therefore, the embodiments described herein provide a unified deep learning framework for defect classification and inspection (detection).  The classification network portion of the unified deep learning framework can be trained using class per image label and defect images and will output class per image with classification confidence.  The detection network portion of the unified deep learning framework can be trained using class per bounding box and defect images and will output bounding box location and detection score per bounding box.  The detection and classification network portion of the unified deep learning framework can be trained using class per bounding box and defect images and will output bounding box location and class per bounding box with classification confidence. 
 
[0105] The unified deep learning classification and detection framework embodiments described herein have therefore multiple advantages over previously used approaches.  For example, the embodiments described herein have a number of ease of use and cost advantages.  In one such example, the embodiments described herein significantly reduce the burden on the user to annotate defects at the pixel level for the detection network to learn.  In other words, the embodiments described herein enable bounding box labeling for detection, which significantly reduces user annotation burden.  In this manner, the embodiments provide ease of use for annotation, training, and testing.  In another such example, pre-trained weights can be created directly using existing image level labeled data.  In this manner, the embodiments described herein can make immediate use of prior knowledge by directly using image level labeled data to generate pre-trained weights.  The embodiments described herein also enable easy building of a dataset for generating pre-trained weights.  In an additional example, fine tuning the embodiments described herein from pre-trained weights reduces the training time by about 90% and training defects requirements for both networks.  In this manner, the embodiments described herein may share the same pre-trained weights thereby reducing training time and training defect requirements for both detection and classification, which can reduce the time to recipe (i.e., the time involved in setting up a recipe). 

3) Applicant argues independent claims 8 and 15, and dependent claims 2, 4- 7, 9- 11, 13- 14 and 16- 20 using similar arguments. Remarks at 9.
Accordingly, the examiner responds using the same rationale.

Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Manuchehr Rahmjoo whose telephone number is 571-272-7789.  The examiner can normally be reached from 8 AM to 5 PM.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

Manuchehr Rahmjoo
/Manuchehr Rahmjoo/
Primary Examiner, AU 2667
Manuchehr.Rahmjoo@uspto.gov