United States Patent and Trademark Office

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 16/838,037
Filing Date: 2 April 2020
Appellant(s): Jing Zhang et al.



__________________
Ann Marie Mewherter (Reg. No. 50,484)
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 23 August 2021.

(1) Grounds of Rejection to be Reviewed on Appeal
The ground(s) of rejection set forth in the Office action dated 1 April 2021 from which the appeal is taken have been modified by the Appeal brief dated 23 August 2021.  A list of rejections withdrawn by the examiner (if any) is included under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”  
(2) Restatement of Rejection	
The following ground(s) of rejection are applicable to the appealed claims. The appellant did not address them, and they are maintained.
Claims 1 and 66-67 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 10,181,185 B2 in view of Zhang et al. (US 2017/0351952 A1).
The following ground(s) of rejection are applicable to the appealed claims, which the appellant has addressed.
Claims 1-36 and 66-67 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US 2017/0351952 A1) in view of Moniwa et al. (US 2003/0106037 A1).
	
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 and 66-67 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 10,181,185 B2 in view of Zhang et al. (US 2017/0351952 A1).
a.	Regarding claim 1, claim 13 of the instant patent discloses a system configured to detect defects on a specimen, comprising: 
one or more computer systems (claim 1, limitation 2); and
one or more components executed by the one or more computer systems, wherein the one or more components comprise a deep metric learning defect detection model configured for (claim 1, limitation 2): 
projecting a test image generated for a specimen and a corresponding reference image into latent space (claim 1, limitations 3-4);
detecting defects in the one or more different portions of the test image based on the determined distances for the one or more different portions of the test image, respectively (claim 13).
However, claim 1 of the instant patent does not disclose for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image.
Zhang discloses, for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image (Zhang discloses “various distance measures (e.g., L1, L2, L_inf, Manhattan, etc.)” at ¶0089).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to apply the distance measures of Zhang to the system of claim 13.
The suggestion/motivation would have been “to make sure that the design will be formed on the specimen in an acceptable manner[,] … provide a reference for the design, which illustrates how the design is meant to be formed on the specimen, that can be used for one or more functions performed for the specimen … for defect detection so that any differences between the design formed on the specimen and the reference can be detected and identified as defects or potential defects” (Zhang; ¶0009).
b.	Regarding claims 66-67, claims 66-67 are analogous and correspond to claim 1. See rejection of claim 1 for further explanation.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-36 and 66-67 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US 2017/0351952 A1) in view of Moniwa et al. (US 2003/0106037 A1).
a.	Regarding claim 1, Zhang discloses a system configured to detect defects on a specimen, comprising: 
one or more computer systems (Zhang discloses two computer subsystems at Fig. 1-36 and 1-102 and ¶0029); and
one or more components executed by the one or more computer systems, wherein the one or more components (Zhang discloses components executed by computer subsystems at Fig. 1-100 and ¶0029) comprise a deep metric learning defect detection model configured for (Zhang discloses a neural network at Fig. 1-104 and ¶0058): 
projecting a test image generated for a specimen and a corresponding reference image into latent space (Zhang discloses “determining inverted features of input images in a training set for a specimen input to the neural network” at ¶0058);
detecting defects in the one or more different portions of the test image based on the determined distances for the one or more different portions of the test image, respectively (Zhang discloses that “[d]efect review typically involves re-detecting defects detected as such by an inspection process and generating additional information about the defects at a higher resolution using either a high magnification optical system or a scanning electron microscope (SEM). Defect review is therefore performed at discrete locations on specimens where defects have been detected by inspection. The higher resolution data for the defects generated by defect review is more suitable for determining attributes of the defects such as profile, roughness, more accurate size information” at ¶0005 and ¶0056-0058).
However, Zhang does not disclose for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image.
Moniwa discloses, for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to apply the proximity effect correction process of Moniwa to Zhang’s two computer subsystems.
The suggestion/motivation would have been “to provide a method of … capable of forming fine patterns with high accuracy by implementing proximity effect correction for a phase shift mask with high accuracy” (Moniwa; ¶0014).
b.	Regarding claim 2, the combination applied in claim 1 discloses wherein the test image and the corresponding reference image are for corresponding locations in different dies on the specimen (Zhang discloses “determining inverted features of input images in a training set for a specimen input to the neural network” at ¶0058).
c.	Regarding claim 3, the combination applied in claim 1 discloses wherein the test image and the corresponding reference image are for corresponding locations in different cells on the specimen (Zhang discloses “determining inverted features of input images in a training set for a specimen input to the neural network” at ¶0058).
d.	Regarding claim 4, the combination applied in claim 1 discloses wherein the test image and the corresponding reference image are generated for the specimen without using design data for the specimen (Zhang discloses “determining inverted features of input images in a training set for a specimen input to the neural network” at ¶0058).
e.	Regarding claim 5, the combination applied in claim 1 discloses wherein the test image is generated for the specimen by an imaging system that directs energy to and detects energy from the specimen (Zhang discloses that “[e]lectrons returned from the specimen (e.g., secondary electrons) may be focused by one or more elements 132 to detector 134” at Fig. 1a and ¶0048), and wherein the corresponding reference image is generated without using the specimen (Zhang discloses that “[a] "runtime" image as that term is used herein simply means a test image that is input to the trained neural network. As such, in one deployment situation, only the first model of the INN (the trained neural network) is deployed, which will generate the inverted features (or the inverted images) alone at prediction time” at Fig. 3 and ¶0093).
f.	Regarding claim 6, the combination applied in claim 1 discloses wherein the corresponding reference image is acquired from a database containing design data for the specimen (Zhang discloses that “the computer subsystem(s) may be configured to acquire the input images in the training set from an imaging subsystem or system described herein and/or from a storage medium in which the images have been stored by an imaging subsystem or system” at ¶0075).
g.	Regarding claim 7, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for inputting design data for the specimen into the deep metric learning defect detection model, and wherein the deep metric learning defect detection model is further configured for performing said detecting using the design data (Zhang discloses “determining inverted features of input images in a training set for a specimen input to the neural network” at ¶0058).
h.	Regarding claim 8, the combination applied in claim 1 discloses wherein said detecting is performed with one or more parameters determined from care areas for the specimen (Zhang discloses “the trained neural network during deployment, which generates inverted features 404” at Fig. 4-404 and ¶0094).
i.	Regarding claim 9, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for inputting information for the care areas into the deep metric learning defect detection model (Zhang discloses that “Inverted features 404 may be input to forward physical model 406” at Fig. 4-406 and ¶0094).
j.	Regarding claim 10, the combination applied in claim 1 discloses wherein said detecting is performed without information for care areas for the specimen (Zhang discloses that “[e]lectrons returned from the specimen (e.g., secondary electrons) may be focused by one or more elements 132 to detector 134” at Fig. 1a and ¶0048).
k.	Regarding claim 11, the combination applied in claim 1 discloses wherein the test image is generated in a logic area of the specimen (Zhang discloses that “Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network” at Fig. 3-300 and ¶0093).
l.	Regarding claim 12, the combination applied in claim 1 discloses wherein the test image is generated in an array area of the specimen (Zhang discloses that “Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network” at Fig. 3-300 and ¶0093).
m.	Regarding claim 13, the combination applied in claim 1 discloses wherein the different portions of the test image comprise different pixels in the test image (Zhang discloses that “The pixel values in the residue image can therefore be used to identify any catastrophic failure of the INN model” at ¶0095).
n.	Regarding claim 14, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model is further configured for projecting an additional corresponding reference image into the latent space and determining an average of the corresponding reference image and the additional corresponding reference image and a reference region in the latent space, and wherein the one or more portions of the corresponding reference image used for determining the distance comprise the reference region (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
o.	Regarding claim 15, the combination applied in claim 1 discloses wherein the corresponding reference image comprises a non-defective test image for the specimen, wherein projecting the corresponding reference image comprises learning a reference region in the latent space, and wherein the one or more portions of the corresponding reference image used for determining the distance comprise the reference region (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
p.	Regarding claim 16, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model has a Siamese network architecture (Zhang discloses that “trained neural network 402 may be deployed with forward physical model 406 and residue layer 410. Forward physical model 406 may or may not be trained as described herein. In this manner, runtime image 400 is input to the trained neural network during deployment, which generates inverted features 404. As such, during runtime (or deployment), runtime image 400 may be input to trained neural network 402 by the computer subsystem(s) (not shown in FIG. 4, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 404, which is the output of the trained neural network. Runtime image 400 may be input to trained neural network 402 along with the imaging parameters (not shown in FIG. 4) that were used to generate the runtime image. Runtime image 400 may also have an arbitrary size. Inverted features 404 may be input to forward physical model 406, which generates model transformed features 408. The model transformed features may then be input to residue layer 410 in combination with image 400, which may use those inputs to generate residue image 412” at Fig. 4 and ¶0094).
q.	Regarding claim 17, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model has a triplet network architecture (Zhang discloses that “trained neural network 402 may be deployed with forward physical model 406 and residue layer 410. Forward physical model 406 may or may not be trained as described herein. In this manner, runtime image 400 is input to the trained neural network during deployment, which generates inverted features 404. As such, during runtime (or deployment), runtime image 400 may be input to trained neural network 402 by the computer subsystem(s) (not shown in FIG. 4, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 404, which is the output of the trained neural network. Runtime image 400 may be input to trained neural network 402 along with the imaging parameters (not shown in FIG. 4) that were used to generate the runtime image. Runtime image 400 may also have an arbitrary size. Inverted features 404 may be input to forward physical model 406, which generates model transformed features 408. The model transformed features may then be input to residue layer 410 in combination with image 400, which may use those inputs to generate residue image 412” at Fig. 4 and ¶0094).
r.	Regarding claim 18, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model has a quadruplet network architecture (Zhang discloses that “trained neural network 402 may be deployed with forward physical model 406 and residue layer 410. Forward physical model 406 may or may not be trained as described herein. In this manner, runtime image 400 is input to the trained neural network during deployment, which generates inverted features 404. As such, during runtime (or deployment), runtime image 400 may be input to trained neural network 402 by the computer subsystem(s) (not shown in FIG. 4, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 404, which is the output of the trained neural network. Runtime image 400 may be input to trained neural network 402 along with the imaging parameters (not shown in FIG. 4) that were used to generate the runtime image. Runtime image 400 may also have an arbitrary size. Inverted features 404 may be input to forward physical model 406, which generates model transformed features 408. The model transformed features may then be input to residue layer 410 in combination with image 400, which may use those inputs to generate residue image 412” at Fig. 4 and ¶0094).
s.	Regarding claim 19, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model comprises one or more deep learning convolution filters, and wherein the one or more computer systems are configured for determining a configuration of the one or more deep learning convolution filters based on physics involved in generating the test image (Zhang discloses that “Runtime image 400 may be input to trained neural network 402 along with the imaging parameters (not shown in FIG. 4) that were used to generate the runtime image. Runtime image 400 may also have an arbitrary size. Inverted features 404 may be input to forward physical model 406, which generates model transformed features 408” at Fig. 4 and ¶0094).
t.	Regarding claim 20, the combination applied in claim 1 discloses wherein the deep metric learning defect detection model comprises one or more deep learning convolution filters, and wherein the one or more computer systems are configured for determining a configuration of the one or more deep learning convolution filters based on imaging hardware used for generating the test image (Zhang discloses that “The embodiments described herein may also be configured for deployment of the neural network in a variety of ways and for generating a variety of outputs after it has been trained as described further herein. For example, as shown in FIG. 3, in one manner of deployment, trained neural network 302 may be deployed itself (without the forward physical model and without the residue layer). In this manner, during runtime (or deployment), runtime image 300 may be input to trained neural network 302 by the computer subsystem(s) (not shown in FIG. 3, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 304, which is the output of the trained neural network. Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network. As such, in one deployment situation, only the first model of the INN (the trained neural network) is deployed, which will generate the inverted features (or the inverted images) alone at prediction time” at Fig. 3 and ¶0093).
u.	Regarding claim 21, the combination applied in claim 1 discloses wherein determining the configuration comprises determining one or more parameters of the one or more deep learning convolution filters based on a point spread function of the imaging hardware (Zhang discloses that “The embodiments described herein may also be configured for deployment of the neural network in a variety of ways and for generating a variety of outputs after it has been trained as described further herein. For example, as shown in FIG. 3, in one manner of deployment, trained neural network 302 may be deployed itself (without the forward physical model and without the residue layer). In this manner, during runtime (or deployment), runtime image 300 may be input to trained neural network 302 by the computer subsystem(s) (not shown in FIG. 3, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 304, which is the output of the trained neural network. Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network. As such, in one deployment situation, only the first model of the INN (the trained neural network) is deployed, which will generate the inverted features (or the inverted images) alone at prediction time” at Fig. 3 and ¶0093).
v.	Regarding claim 22, the combination applied in claim 1 discloses wherein the one or more parameters of the one or more deep learning convolution filters comprise one or more of filter size, filter symmetry, and filter depth (Zhang discloses that “The embodiments described herein may also be configured for deployment of the neural network in a variety of ways and for generating a variety of outputs after it has been trained as described further herein. For example, as shown in FIG. 3, in one manner of deployment, trained neural network 302 may be deployed itself (without the forward physical model and without the residue layer). In this manner, during runtime (or deployment), runtime image 300 may be input to trained neural network 302 by the computer subsystem(s) (not shown in FIG. 3, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 304, which is the output of the trained neural network. Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network. As such, in one deployment situation, only the first model of the INN (the trained neural network) is deployed, which will generate the inverted features (or the inverted images) alone at prediction time” at Fig. 3 and ¶0093).
w.	Regarding claim 23, the combination applied in claim 1 discloses wherein determining the one or more parameters of the one or more deep learning convolution filters comprises learning the one or more parameters by optimizing a loss function (Zhang discloses that “the neural network does not have to be defined by a unique topology to implement the functions described herein. Instead, the neural network may be application specific, and its layer type and number of layers are undefined. The neural network may include two or more encoder layers configured for determining the inverted features of an image for a specimen. The term "encoder" generally refers to a neural network or part of a neural network that "encodes" the information content of input data to a more compact representation. The encode process may be effectively lossy or lossless. In addition, the encode process may or may not be human interpretable. The encoded representation can be a vector of scalar values or distributions” at ¶0059).
x.	Regarding claim 24, the combination applied in claim 1 discloses wherein determining the configuration comprises selecting the one or more deep learning convolution filters from a predetermined set of deep learning convolution filters based on a point spread function of the imaging hardware (Zhang discloses that “aperture "A" can be used to estimate a crude point spread function (PSF) for the tool” at ¶0088).
y.	Regarding claim 25, the combination applied in claim 1 discloses wherein one or more parameters of the one or more deep learning convolution filters in the predetermined set are fixed (Zhang discloses that “aperture "A" can be used to estimate a crude point spread function (PSF) for the tool” at ¶0088).
z.	Regarding claim 26, the combination applied in claim 1 discloses wherein determining the configuration further comprises fine tuning one or more initial parameters of the one or more deep learning convolution filters by optimizing a loss function (Zhang discloses that “the neural network does not have to be defined by a unique topology to implement the functions described herein. Instead, the neural network may be application specific, and its layer type and number of layers are undefined. The neural network may include two or more encoder layers configured for determining the inverted features of an image for a specimen. The term "encoder" generally refers to a neural network or part of a neural network that "encodes" the information content of input data to a more compact representation. The encode process may be effectively lossy or lossless. In addition, the encode process may or may not be human interpretable. The encoded representation can be a vector of scalar values or distributions” at ¶0059).
aa.	Regarding claim 27, the combination applied in claim 1 discloses wherein the one or more components further comprise a learnable low-rank reference image generator configured for generating the corresponding reference image, wherein the one or more computer systems are configured for inputting one or more test images generated for the specimen into the learnable low-rank reference image generator, wherein the one or more test images are generated for different locations on the specimen corresponding to the same location in a design for the specimen, and wherein the learnable low-rank reference image generator is further configured for removing noise from the one or more test images thereby generating the corresponding reference image (Zhang discloses that “The term "low resolution image" of a specimen, as used herein, is generally defined as an image in which all of the patterned features formed in the area of the specimen at which the image was generated are not resolved in the image. For example, some of the patterned features in the area of the specimen at which a low resolution image was generated may be resolved in the low resolution image if their size is large enough to render them resolvable. However, the low resolution image is not generated at a resolution that renders all patterned features in the image resolvable. In this manner, a "low resolution image," as that term is used herein, does not contain information about patterned features on the specimen that is sufficient for the low resolution image to be used for applications such as defect review, which may include defect classification and/or verification, and metrology. In addition, a "low resolution image" as that term is used herein generally refers to images generated by inspection systems, which typically have relatively lower resolution (e.g., lower than defect review and/or metrology systems) in order to have relatively fast throughput” at ¶0100).
bb.	Regarding claim 28, the combination applied in claim 1 discloses wherein the test image and an additional test image are generated for the specimen with different modes of an imaging system, respectively (Zhang discloses two computer subsystems at Figs. 1-36 and 102 and ¶0029); 
wherein the deep metric learning defect detection model is further configured for projecting the test image and the corresponding reference image into a first latent space, projecting the additional test image and an additional corresponding reference image into a second latent space, and combining the first and second latent spaces into a joint latent space; and wherein the latent space used for determining the distance is the joint latent space (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
cc.	Regarding claim 29, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for inputting design data for the specimen into the deep metric learning defect detection model (Zhang discloses that “Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network” at Fig. 3-300 and ¶0093); 
wherein the test image and an additional test image are generated for the specimen with different modes of an imaging system, respectively (Zhang discloses two computer subsystems at Figs. 1-36 and 102 and ¶0029);  
wherein the deep metric learning defect detection model is further configured for projecting the test image and the corresponding reference image into a first latent space, projecting the additional test image and an additional corresponding reference image into a second latent space, projecting the design data into a third latent space, and combining the first, second, and third latent spaces into a joint latent space; and wherein the latent space used for determining the distance is the joint latent space (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
dd.	Regarding claim 30, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for inputting design data for the specimen into the deep metric learning defect detection model (Zhang discloses two computer subsystems at Figs. 1-36 and 102 and ¶0029);  
wherein the deep metric learning defect detection model is further configured for projecting the test image and the corresponding reference image into a first latent space, projecting the design data into a second latent space, and combining the first and second latent spaces into a joint latent space; and wherein the latent space used for determining the distance is the joint latent space (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
ee.	Regarding claim 31, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for inputting design data for the specimen into the deep metric learning defect detection model (Zhang discloses that “Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network” at Fig. 3-300 and ¶0093); 
wherein the test image and an additional test image are generated for the specimen with different modes of an imaging system, respectively (Zhang discloses two computer subsystems at Figs. 1-36 and 102 and ¶0029);  
wherein the deep metric learning defect detection model is further configured for projecting a first set comprising one or more of the test image and the corresponding reference image, the additional test image and an additional corresponding reference image, and the design data into a first latent space, projecting a second set comprising one or more of the test image and the corresponding reference image, the additional test image and the additional corresponding reference image, and the design data into a second latent space, and combining the first and second latent spaces into a joint latent space; and wherein the latent space used for determining the distance is the joint latent space (Moniwa discloses that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶0064).
ff.	Regarding claim 32, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for training the deep metric learning defect detection model with one or more training images and pixel-level ground truth information for the one or more training images (Zhang discloses that “The neural network is configured for determining inverted features of input images in a training set for a specimen input to the neural network. The inverted features determined by the neural network may include any suitable features described further herein or known in the art that can be inferred from the input and used to generate the output described further herein. For example, the features may include a vector of intensity values per pixel. The features may also include any other types of features described herein, e.g., vectors of scalar values, vectors of independent distributions, joint distributions, or any other suitable feature types known in the art” at ¶0074).
gg.	Regarding claim 33, the combination applied in claim 1 discloses wherein the one or more training images and pixel-level ground truth information are generated from a process window qualification wafer (Zhang discloses that “The neural network is configured for determining inverted features of input images in a training set for a specimen input to the neural network. The inverted features determined by the neural network may include any suitable features described further herein or known in the art that can be inferred from the input and used to generate the output described further herein. For example, the features may include a vector of intensity values per pixel. The features may also include any other types of features described herein, e.g., vectors of scalar values, vectors of independent distributions, joint distributions, or any other suitable feature types known in the art” at ¶0074).
hh.	Regarding claim 34, the combination applied in claim 1 discloses wherein the one or more computer systems are configured for performing active learning for training the deep metric learning defect detection model (Zhang discloses that “The embodiments described herein may also be configured for deployment of the neural network in a variety of ways and for generating a variety of outputs after it has been trained as described further herein. For example, as shown in FIG. 3, in one manner of deployment, trained neural network 302 may be deployed itself (without the forward physical model and without the residue layer). In this manner, during runtime (or deployment), runtime image 300 may be input to trained neural network 302 by the computer subsystem(s) (not shown in FIG. 3, which may be the same computer subsystem(s) that trained the neural network or may be different computer subsystem(s)), which generates inverted features 304, which is the output of the trained neural network. Runtime image 300 may be input to trained neural network 302 along with the imaging parameters (not shown in FIG. 3) that were used to generate the runtime image. Runtime image 300 may also have an arbitrary size. A "runtime" image as that term is used herein simply means a test image that is input to the trained neural network. As such, in one deployment situation, only the first model of the INN (the trained neural network) is deployed, which will generate the inverted features (or the inverted images) alone at prediction time” at Fig. 3 and ¶0093).
ii.	Regarding claim 35, the combination applied in claim 1 discloses wherein the specimen is a wafer (Zhang discloses that “The neural network described herein may be generated for specific specimens (e.g., specific wafers or reticles)” at ¶0092).
jj.	Regarding claim 36, the combination applied in claim 1 discloses wherein the specimen is a reticle (Zhang discloses that “The neural network described herein may be generated for specific specimens (e.g., specific wafers or reticles)” at ¶0092).
kk.	Regarding claims 66-67, claims 66-67 are analogous and correspond to claim 1. See rejection of claim 1 for further explanation.
	
(3) Response to Argument
A-1.	Appellant’s argument at page 6, line 13 to page 9, line 19
The appellant argues that the prior art does not teach or suggest the limitations of claims 1-6, 8, 10-13, 16-26, 32-36, and 66-67, based on the appellant’s understanding of “latent space” in two ways: (1) “the space into which a test image for a specimen and a corresponding reference image are projected by a deep metric learning defect detection model” and (2) “term is used herein refers to the hidden layer in the DML defect detection model that contains the hidden representation of the input.” 
	B-1.	Examiner’s response to the argument 
	The examiner does not agree with the appellant’s argument. First, the appellant’s two definitions of “latent space” are not correct when the claims are interpreted under the “Broadest Reasonable Interpretation (“BRI”),” which is the fundamental standard examiners apply in reviewing applications. It does not appear that the appellant could rebut or argue against this standard.
	Second, the first definition of “latent space,” namely “the space into which a test image for a specimen and a corresponding reference image are projected by a deep metric learning defect detection model,” is not a clear definition considering the interrelation between the claim limitations. Instead, “latent space” should be interpreted as a set with different types of variables or parameters, which is the correct interpretation under the BRI standard. This interpretation was used for understanding and rejecting the claims in the previous Office Action (“OA”). The same interpretation applies in this examiner’s answer. 
Third, the appellant admits that the latent space refers to a “hidden layer in the DML defect detection model that contains the hidden representation of the input” particularly “set forth in the instant Specification” (emphasis added). In other words, the appellant is reading the specification into the claims to interpret the meaning of “latent space” in the claim limitations. This is an erroneous understanding of claim interpretation during patent prosecution. The correct way to interpret a claim is the BRI in the light of the specification, not reading in the specification to define certain terms (emphasis added). If the appellant’s way of defining “latent space” were allowed, then any claim could be allowed merely by giving a term a specific meaning defined or explained only in the specification, which would violate the fairness and correctness of examining or reviewing patent applications. For example, if a certain claim recites “semantic space” and defines it, only in the specification, as a relationship of parameters in machine language programming to propose a list, then should the examiner interpret “semantic space” as what the specification defined? The answer is a definite no. A person of ordinary skill in the art or a sensible person would not read in the specification to interpret a term in the claims. Here, the appellant’s interpretation of “latent space” as a term referring to “the hidden layer in the DML defect detection model that contains the hidden representation of the input” was an error because “latent space” could only be construed as the appellant asserts by reading in the specification for claim interpretation. So, the correct interpretation is a set with different types of variables or parameters under BRI in the light of the specification (emphasis added).  
Fourth, the claim rejection based on the combination of Zhang and Moniwa was correct under the correct interpretation of latent space as a set with different types of variables or parameters. Zhang discloses projecting a test image generated for a specimen and a corresponding reference image into latent space by teaching “determining inverted features of input images in a training set for a specimen input to the neural network” (¶0058), in which the training set of a specimen input comprises variables or parameters in the neural network process. Moreover, Moniwa teaches, for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image by disclosing that “in step S7 of implementing proximity effect correction for patterns which are imaged with shifter edges, an edge of the respective patterns P2 is extended outward by 10 nm in case that a distance from the edge to an adjacent pattern becomes 1000 nm or more. In step S8 of implementing phase assignment for the aperture patterns for phase shift patterns, the aperture patterns for present phase shift patterns P7 and the aperture pattern for dummy shifter pattern P8 that are disposed so as to sandwich the pattern P2 after correction of the proximity effect or the dummy gate pattern P9 there between are assigned with phases opposite to each other, thereby obtaining 0-degree phase assigned shifter pattern P3 and 180-degree phase assigned shifter pattern P4 as shown in FIG. 5F” at Fig. 5F, Fig. 10-S7 and ¶¶0064 and 0094, and that “In step S14 of pattern operation using a result of phase assignment and a result of proximity effect correction, the result 101 of proximity effect correction is subtracted in pattern operation from the result 102 of phase assignment, the result being phase shift mask pattern data” (Fig. 10-S14 and ¶0094). 
In short, Moniwa manipulates parameters from different patterns and target data to calculate the distance or error. Therefore, the rejection of claims 1-6, 8, 10-13, 16-26, 32-36, and 66-67 under 35 U.S.C. §103 as being unpatentable over the combination of Zhang and Moniwa was valid.
A-2.	Appellant’s argument at page 9, line 20 to page 13, line 2
The appellant argues the examiner failed to establish a prima facie case of obviousness for several reasons.
B-2.	Examiner’s response to the argument 
The examiner disagrees with the appellant. The prima facie case of obviousness was established with the correct interpretation of the term “latent space.” The correct understanding or definition was discussed in the previous section, which addresses the appellant’s arguments. In sum, the correct interpretation of latent space is a set with different types of variables or parameters under BRI in the light of the specification (emphasis added). Zhang teaches that the training set of a specimen input comprises variables or parameters in the neural network process (¶0058), and Moniwa calculates parameters or variables of patterns and target data to determine the distance or error. Hence, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claims 1-6, 8, 10-13, 16-26, 32-36, and 66-67.
A-3.	Appellant’s argument at page 13, line 2 to page 15, line 8
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claim 7 because neither Zhang nor Moniwa teaches inputting design data.
B-3.	Examiner’s response to the argument 
The examiner disagrees with the appellant. It is questionable what “design data” means. The appellant did not specify or limit the design data to have some special meaning in claim 7. Merely arguing that a broad term is not disclosed by the prior art is not persuasive. Even though design data is a broad term, Zhang teaches that the training set of a specimen input comprises variables or parameters in the neural network process, which are used for “semiconductor applications such as inspection, metrology, and defect review, the neural network described herein can be used to solve inverse problems in imaging formation (e.g., diffraction, interference, partial coherence, blurring, etc.) to regenerate optically-corrected features … [,] “Inverted features” (where inverted is related to the context of an inversion neural network) [being] generally defined herein as features after inverting a physical process and "features" are defined as generally referring to measurable properties including, but not limited to, intensity, amplitude, phase, edge, gradients” (¶0058), which is some type of processing for design. Hence, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claim 7.
A-4.	Appellant’s argument at page 15, line 9 to page 17, line 8
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claim 7 because neither Zhang nor Moniwa teaches care areas.
B-4.	Examiner’s response to the argument 
The examiner disagrees with the appellant. It is questionable what “care areas” means. This appears to be another argument by the appellant similar to the one regarding design data. The appellant did not specify or limit the care areas to have some special meaning in claim 7. Merely arguing that a broad term is not disclosed by the prior art is not persuasive. The care areas could be interpreted as target areas or interest areas for the processing. Here, Zhang discloses “Inverted features” (where inverted is related to the context of an inversion neural network) [being] generally defined herein as features after inverting a physical process and "features" are defined as generally referring to measurable properties including, but not limited to, intensity, amplitude, phase, edge, gradients,” which has target areas or interest areas for the processing. Hence, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claim 7.
A-4.	Appellant’s argument at page 17, line 9 to page 19, line 16
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claim 14.
B-4.	Examiner’s response to the argument 
The examiner disagrees with the appellant. The appellant’s argument is not persuasive because it lacks any clear evidence or explanation. The only argument is that neither the combination of Zhang and Moniwa nor either reference alone teaches the limitation of claim 14. The appellant has to give a good reason why the prior art does not teach the limitation of claim 14. It appears that the appellant is looking for verbatim words in the prior art in order to consider the rejection valid. If that is the case, then that is undue reasoning. It is clear that the citation of the prior art for rejecting claim 14 was correct and valid. Therefore, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claim 14.
A-5.	Appellant’s argument at page 19, line 17 to page 21, line 21
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claim 15.
B-5.	Examiner’s response to the argument 
The examiner disagrees with the appellant. The appellant’s argument is not persuasive because it lacks any clear evidence or explanation. The only argument is that neither the combination of Zhang and Moniwa nor either reference alone teaches the limitation of claim 15. The appellant has to give a good reason why the prior art does not teach the limitation of claim 15. It appears that the appellant is looking for verbatim words in the prior art in order to consider the rejection valid. If that is the case, then that is undue reasoning. It is clear that the citation of the prior art for rejecting claim 15 was correct and valid. Therefore, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claim 15.
A-6.	Appellant’s argument at page 21, line 22 to page 24, line 4
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claim 27 because neither Zhang nor Moniwa teaches a low-rank reference image.
B-6.	Examiner’s response to the argument 
The examiner disagrees with the appellant. It is questionable what “low-rank reference image” means. This appears to be another argument by the appellant similar to those regarding design data and care areas. The appellant did not specify or limit the low-rank reference image to have some special meaning in claim 27. Zhang discloses that “The term "low resolution image" of a specimen, as used herein, is generally defined as an image in which all of the patterned features formed in the area of the specimen at which the image was generated are not resolved in the image. For example, some of the patterned features in the area of the specimen at which a low resolution image was generated may be resolved in the low resolution image if their size is large enough to render them resolvable. However, the low resolution image is not generated at a resolution that renders all patterned features in the image resolvable. In this manner, a "low resolution image," as that term is used herein, does not contain information about patterned features on the specimen that is sufficient for the low resolution image to be used for applications such as defect review, which may include defect classification and/or verification, and metrology. In addition, a "low resolution image" as that term is used herein generally refers to images generated by inspection systems, which typically have relatively lower resolution (e.g., lower than defect review and/or metrology systems) in order to have relatively fast throughput” at ¶0100. Here, interpreting the low-rank reference image as a low resolution image was valid because it has a lower rank than the high resolution image. Therefore, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claim 27.
A-7.	Appellant’s argument at page 24, line 5 to page 26, line 19
The appellant argues the examiner failed to establish a prima facie case of obviousness with respect to claims 28-31.
B-7.	Examiner’s response to the argument 
The examiner disagrees with the appellant. The appellant’s argument is not persuasive because it lacks any clear evidence or explanation. The only argument is that neither the combination of Zhang and Moniwa nor either reference alone teaches the limitations of claims 28-31. The appellant has to give a good reason why the prior art does not teach the limitations of claims 28-31. It appears that the appellant is looking for verbatim words in the prior art in order to consider the rejection valid. If that is the case, then that is undue reasoning. It is clear that the citation of the prior art for rejecting claims 28-31 was correct and valid. Therefore, the combination of Zhang and Moniwa establishes a prima facie case of obviousness regarding claims 28-31.



	(4) Conclusion of Examiner Answer
For the foregoing reasons, it is believed that the rejections should be sustained.

Respectfully submitted,
/JOHN W LEE/Primary Examiner, Art Unit 2664
Conferees:
/EDWARD F URBAN/Supervisory Patent Examiner, Art Unit 2665
/NAY A MAUNG/Supervisory Patent Examiner, Art Unit 2664
Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.