DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 11-30 are pending.
Claims 11-30 are rejected.
Priority
Continuation:
The instant application is a continuation of parent application no. 15821553, filed on 11/22/2017.
Provisional:
Acknowledgment is made of applicant’s claim for priority to provisional application no. 62428410 filed on 11/30/2016.
Response to Amendment
In the claim amendments filed on 08/03/2020, claims 1-10 were canceled and new claims 11-30 were presented. Accordingly, the new claims are acknowledged and have been fully considered.
Information Disclosure Statement
The information disclosure statement (IDS) dated 10/08/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Drawings
The drawings are objected to under 37 CFR 1.83(a).  The drawings must show every feature of the invention specified in the claims.  Therefore, the method of claims 21-26 with all the steps must be shown or the feature(s) canceled from the claim(s).  No new matter should be entered.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent. See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets
Claims 11-14, 18, 21-23 of the instant application are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 of the U.S. Patent No. US10732607B2 and further in view of Sahu et al. (US20180100810A1) [hereinafter Sahu].
Claim(s) 11:
Regarding claim 11, claim 1 of the U.S. Patent No. US10732607B2 discloses, “A computer program product for controlling processing of a substrate, the computer program product tangibly embodied in a non-transitory computer readable media and comprising instructions for causing a processor to:” [See claim 1 of the U.S. Patent No. US10732607B2: “A computer program product for controlling processing of a substrate, the compute program product tangibly embodied in a non-transitory computer readable media and comprising instructions for causing a processor to:”];
“receive, from an in-situ optical monitoring system, a measured spectrum of light reflected from a substrate undergoing processing that modifies a thickness of an outer layer of the substrate;” [See claim 1 of the U.S. Patent No. US10732607B2: “receive, from an in-situ optical monitoring system, a measured spectrum of light reflected from a substrate undergoing processing that modifies a thickness of an outer layer of the substrate;”];
“reduce a dimensionality of the measured spectrum to generate a plurality of component values” [See claim 1 of the U.S. Patent No. US10732607B2: “reduce a dimensionality of the measured spectrum to generate a plurality of component values,”];
“generate a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, and a plurality of hidden nodes connecting the input nodes to the output node; and” [See claim 1 of the U.S. Patent No. US10732607B2: “generate a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, and a plurality of hidden nodes connecting the input nodes to the output node; and”];
“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [See claim 1 of the U.S. Patent No. US10732607B2: “determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.”], but does not explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;” [See that the dimensionality of the measured spectrum is reduced to generate a plurality of components by performing a transformation of the measured spectrum (e.g., orthogonal decomposition, spectral mixture resolution, etc.) based on a plurality of reference spectra (e.g., predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].
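For illustration only, the comparison described in the cited passage of Sahu (¶32), in which a Euclidean distance is computed and compared against predetermined spectrum data or a threshold, might be sketched as follows. This is a minimal sketch under stated assumptions: the function name `nearest_reference`, the four-point sample spectra, and the threshold of 0.5 are hypothetical and do not appear in Sahu or in the U.S. Patent No. US10732607B2.

```python
import numpy as np

def nearest_reference(measured, references):
    """Return (index, distance) of the reference spectrum closest to `measured`."""
    measured = np.asarray(measured, dtype=float)
    references = np.asarray(references, dtype=float)
    # Euclidean distance from the measured spectrum to every reference spectrum.
    distances = np.linalg.norm(references - measured, axis=1)
    best = int(np.argmin(distances))
    return best, float(distances[best])

# Hypothetical predetermined spectra (one per row) and a hypothetical measurement.
references = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
index, distance = nearest_reference([0.9, 1.0, 1.1, 1.0], references)

# Hypothetical threshold comparison of the calculated value.
is_match = distance < 0.5
```

The sketch reduces each measured spectrum to a single distance value per reference, which is one simple way a calculated value could be compared with predetermined data.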

Claim(s) 12:
	Regarding claim 12, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 11 of the instant application.
	Regarding claim 12, claim 2 of the U.S. Patent No. US10732607B2 discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See claim 2 of the U.S. Patent No. US10732607B2: “perform feature extraction on a plurality of reference spectra to generate a plurality of components.”], but does not explicitly disclose, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.”
However, Sahu discloses, “the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See that the transformation is based on extracting spectral features from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 11.

Claim(s) 13:
	Regarding claim 13, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claims 11-12 of the instant application.
	Regarding claim 13, claim 2 of the U.S. Patent No. US10732607B2 further discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral” “components” [See claim 2 of the U.S. Patent No. US10732607B2: “perform feature extraction on a plurality of reference spectra to generate a plurality of components.”], but does not explicitly disclose, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.”
However, Sahu discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.” [See the instructions to perform feature extraction on the reference spectra to generate plurality of spectral feature component (e.g.; spectral feature values): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].


Claim(s) 14:
	Regarding claim 14, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claims 11-13 of the instant application.
	Regarding claim 14, claim 3 of the U.S. Patent No. US10732607B2 discloses, “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis, or autoencoding.” [See claim 3 of the U.S. Patent No. US10732607B2: “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis, or autoencoding.”].
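For illustration only, feature extraction on a plurality of reference spectra by principal component analysis (computed here through a singular value decomposition), followed by reduction of a measured spectrum to a small number of component values, might be sketched as follows. This is a minimal sketch under stated assumptions: the function names `fit_components` and `reduce_spectrum`, the random sample data, and the choice of five components are hypothetical and are not taken from the claims or from any cited reference.

```python
import numpy as np

def fit_components(reference_spectra, n_components):
    """Extract principal components from reference spectra via SVD."""
    X = np.asarray(reference_spectra, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reduce_spectrum(measured, mean, components):
    """Project a measured spectrum onto the components (dimensionality reduction)."""
    return components @ (np.asarray(measured, dtype=float) - mean)

rng = np.random.default_rng(0)
# Hypothetical data: 20 reference spectra sampled at 200 wavelengths.
reference_spectra = rng.random((20, 200))
mean, components = fit_components(reference_spectra, n_components=5)

# A hypothetical 200-point measured spectrum becomes 5 component values.
component_values = reduce_spectrum(rng.random(200), mean, components)
```

In such a sketch the component values, rather than the full spectrum, would be the quantities supplied to the input nodes of a neural network.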

Claim(s) 18:
	Regarding claim 18, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 11 of the instant application.
Regarding claim 18, claim 9 of the U.S. Patent No. US10732607B2 discloses, “wherein neural network comprises at least one input node configured to receive at least [See claim 9 of the U.S. Patent No. US10732607B2: “neural network comprises at least one input node configured to receive at least one of a prior measurement of the substrate, a measurement of a prior substrate, a measurement from another sensor in a processing system in which the substrate undergoes processing, a measurement from a sensor that is outside the processing system, a value from a processing recipe stored by a controller of the processing system, or a value of a variable tracked by the controller.”].

Claim(s) 21:
Regarding claim 21, claim 11 of the U.S. Patent No. US10732607B2 discloses, “A method of processing a substrate, comprising: subjecting a substrate to processing that modifies a thickness of an outer layer of the substrate;” [See claim 11 of the U.S. Patent No. US10732607B2: “A method of processing a substrate, comprising: subjecting a substrate to processing that modifies a thickness of an outer layer of the substrate;”];
“measuring during the processing with an in-situ optical monitoring system a measured spectrum of light reflected from the substrate undergoing processing;” [See claim 11 of the U.S. Patent No. US10732607B2: “measuring during the processing with an in-situ optical monitoring system a measured spectrum of light reflected from the substrate undergoing processing;”];
“reducing a dimensionality of the measured spectrum to generate a plurality of component values” [See claim 11 of the U.S. Patent No. US10732607B2: “reducing a dimensionality of the measured spectrum to generate a plurality of component values,”];
“generating a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, a plurality of hidden nodes connecting the input nodes to the output node;” [See claim 11 of the U.S. Patent No. US10732607B2: “generating a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, and a plurality of hidden nodes connecting the input nodes to the output node; and”];
“determining at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [See claim 11 of the U.S. Patent No. US10732607B2: “determining at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.”], but does not explicitly disclose, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the method taught by the U.S. Patent No. US10732607B2 as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].

Claim(s) 22:
	Regarding claim 22, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 21 of the instant application.
Regarding claim 22, claim 12 of the U.S. Patent No. US10732607B2 discloses, “the processing comprises chemical mechanical polishing.” [See claim 12 of the U.S. Patent No. US10732607B2: “processing comprises chemical mechanical polishing.”].

Claim(s) 23:
	Regarding claim 23, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 21 of the instant application.
	Regarding claim 23, claim 13 of the U.S. Patent No. US10732607B2 discloses, “performing feature extraction on the plurality of reference spectra” [See claim 13 of the U.S. Patent No. US10732607B2: “performing feature extraction on a plurality of reference spectra”], but does not explicitly disclose, “performing feature extraction on the plurality of reference spectra to generate the transformation.”
However, Sahu discloses, “performing feature extraction on the plurality of reference spectra to generate the transformation.” [See the instructions to perform feature extraction on the reference spectra to generate plurality of spectral feature component (e.g.; spectral feature values): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the method taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 21.

Claims 15-17, 19-20, 24-26 are rejected under 35 U.S.C. 103 as being unpatentable over the U.S. Patent No. US10732607B2 and Sahu, and further in view of Gnojewski et al. (US20100094790A1) [hereinafter Gnojewski].
Claim(s) 15:
	Regarding claim 15, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 11 of the instant application.
	Regarding claim 15, Gnojewski discloses, “instructions to perform dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See that dimensional reduction is performed on multiple data (e.g., 2048 data points), where the data have known characteristics (i.e., known structures), and the reduced data are used as an input to train the neural network, as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to take advantage of the faster measuring technique with lower operational cost [Gnojewski: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶22)… “The end result may be a faster measuring technique with lower operational cost.” (¶25)].
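For illustration only, compressing a spectral intensity response into a handful of weighted averages might be sketched as follows. This is a minimal sketch under stated assumptions: it uses a generic textbook definition of central moments (intensity-weighted mean wavelength, then weighted central moments up to a chosen order), which is not necessarily the definition used in Gnojewski and does not reproduce the eight-moments-per-polarization count quoted above. The function name `spectral_central_moments` and the three-point sample spectrum are hypothetical.

```python
import numpy as np

def spectral_central_moments(wavelengths, intensities, max_order=3):
    """Compress a spectrum into [mean, 2nd central moment, ..., max_order-th]."""
    w = np.asarray(wavelengths, dtype=float)
    p = np.asarray(intensities, dtype=float)
    # Normalize intensities so they act as weights that sum to one.
    p = p / p.sum()
    mean = float(np.sum(p * w))
    moments = [mean]
    for k in range(2, max_order + 1):
        # k-th central moment: weighted average of (wavelength - mean)^k.
        moments.append(float(np.sum(p * (w - mean) ** k)))
    return moments

# Hypothetical symmetric three-point spectrum.
moments = spectral_central_moments([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
```

For the symmetric sample spectrum, the third central moment is zero, which reflects the absence of skew in the intensity distribution.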
Claim(s) 16:
	Regarding claim 16, the U.S. Patent No. US10732607B2, Gnojewski, and Sahu disclose all the elements of claims 11 and 15 of the instant application.
	Regarding claim 16, Gnojewski discloses, “instructions to train the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2, Gnojewski, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 15.
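For illustration only, training by back-propagation with the three kinds of exit condition quoted from Gnojewski (¶33), namely a maximum number of iterations, a cost below a predefined limit, and weight changes below a predefined threshold, might be sketched as follows. This is a minimal sketch under stated assumptions: the network size, tanh hidden activation, learning rate, mean-squared-error cost, function name `train`, and synthetic data are hypothetical and are not taken from Gnojewski; only the example values of 100 iterations and a 0.001 cost limit follow the quoted passage.

```python
import numpy as np

def train(X, y, hidden=4, lr=0.1, max_iter=100, cost_limit=1e-3,
          min_step=1e-6, seed=0):
    """Train a one-hidden-layer network; return the cost at each iteration."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    costs = []
    for _ in range(max_iter):              # exit condition: iteration cap
        # Forward pass through hidden nodes to the single output node.
        H = np.tanh(X @ W1 + b1)
        err = (H @ W2 + b2) - y
        costs.append(float(0.5 * np.mean(err ** 2)))
        if costs[-1] < cost_limit:         # exit condition: cost below limit
            break
        # Backward pass: propagate the output error to every weight.
        d_out = err / len(X)
        gW2, gb2 = H.T @ d_out, d_out.sum(axis=0)
        d_hid = (d_out @ W2.T) * (1.0 - H ** 2)
        gW1, gb1 = X.T @ d_hid, d_hid.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
        largest_step = lr * max(np.abs(g).max() for g in (gW1, gb1, gW2, gb2))
        if largest_step < min_step:        # exit condition: tiny weight change
            break
    return costs

# Hypothetical training data: 3 component values per sample, known target values.
rng = np.random.default_rng(1)
X = rng.random((50, 3))
y = 0.5 * X.sum(axis=1, keepdims=True)
costs = train(X, y)
```

The returned list records the cost before each update, so its length shows which exit condition ended training.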

Claim(s) 17:
	Regarding claim 17, the U.S. Patent No. US10732607B2, Gnojewski, and Sahu disclose all the elements of claims 11 and 15-16 of the instant application.
	Regarding claim 17, Gnojewski further discloses, “the two or more spectra includes fewer spectra than the plurality of spectra.” [See the spectra 850, 860, and 870 as shown in figure 8, where two or more spectra are less than the rest of the spectra (e.g.; spectra having wavelength less than 400 and less than 300 such that they are smaller than the rest of the spectra that have wavelength more than 400): “The reflectometer 130 measurements may be made at certain points on the sample. The result of the reflectometer 130 measurements may include SIRs such as the spectra shown in FIG. 8 (see spectra 850, 860, and 870). A SIR is an intensity versus wavelength function that defines an intensity at a certain wavelength. The intensity may be provided as a reflection coefficient at the certain wavelength.” (¶21)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2, Gnojewski, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 15.

Claim(s) 19:
	Regarding claim 19, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 11 of the instant application.
Regarding claim 19, Gnojewski discloses, “instructions to cause a chemical mechanical polishing system to polish the substrate,” [See the processing include chemical mechanical polishing (i.e.; CMP) to polish a substrate: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “Post-polish cleans may be used as part of the CMP technique to remove the slurry from the surface of the sample after removal of a silicon dioxide layer, before subjecting the sample to reflectometry or other metrology measurements” (¶20)].
“instructions” “to receive the measured spectrum of light from the in-situ optical monitoring system during polishing of the substrate.” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)].
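Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to take advantage of the faster measuring technique with lower operational cost [Gnojewski: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶22)… “The end result may be a faster measuring technique with lower operational cost.” (¶25)].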


Claim(s) 20:
	Regarding claim 20, the U.S. Patent No. US10732607B2, Gnojewski, and Sahu disclose all the elements of claims 11 and 19 of the instant application.
Regarding claim 20, Gnojewski discloses, “the instructions to determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter comprise instructions to modify a pressure in a carrier head of the chemical mechanical polishing system.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining” “endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2, Gnojewski, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 19.

Claim(s) 24:
	Regarding claim 24, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claims 23 and 21 of the instant application.
	Regarding claim 24, Gnojewski further discloses, “wherein the feature extraction comprises” “autoencoding” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. Gnojewski discloses, autoencoding: See the autoencoding process, where an artificial neural network performs efficient data encoding automatically: “at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750. The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples. In other words, the trained ANN 750 may simulate the reference metrology tool.” (¶39)… “the trained ANN 750 measurement results may also be free from human and system errors involved in typical reference tool measurements.” (¶40)… “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,”
	However, Sahu discloses, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. See the instructions to perform feature extraction comprise principal component analysis (e.g.; principal component analysis) and singular value decomposition (e.g.; orthogonal decomposition): “a combination of various data reduction and analysis techniques like principal component analysis, orthogonal decomposition, spectral mixture resolution, etc. could be used as a first step to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski and Sahu with the polishing monitoring method taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)], and in order to take advantage of the faster measuring technique with lower operational cost [Gnojewski: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶22)… “The end result may be a faster measuring technique with lower operational cost.” (¶25)].

Claim(s) 25:
	Regarding claim 25, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claims 21-22 of the instant application.
	Regarding claim 25, Gnojewski further discloses, “performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See dimensional reduction is performed in more than one data (e.g.; 2048 data points), where the data is known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring method taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to take advantage of the faster measuring technique with lower operational cost [Gnojewski: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶22)… “The end result may be a faster measuring technique with lower operational cost.” (¶25)].
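For illustration only, the kind of data compression quoted from Gnojewski above (a 2048-point spectral intensity response reduced to a handful of weighted-average values) can be sketched as follows. This is a minimal sketch, not Gnojewski's implementation: the function name, the synthetic spectrum, and the conventional definition of central moments used here are all assumptions, and the number of values returned (three) is not intended to reproduce the eight moments per polarization mentioned in the reference.

```python
import math

def spectral_central_moments(wavelengths, intensities, max_order=3):
    """Hypothetical helper: compress a spectrum into a few weighted averages.

    Assumes the conventional definition: the intensities are normalized into
    weights, the first value is the weighted mean wavelength, and each later
    value is the k-th central moment about that mean.
    """
    total = sum(intensities)
    weights = [s / total for s in intensities]
    mean = sum(p * w for p, w in zip(weights, wavelengths))
    moments = [mean]
    for k in range(2, max_order + 1):
        moments.append(sum(p * (w - mean) ** k
                           for p, w in zip(weights, wavelengths)))
    return moments

# Synthetic 2048-point spectrum (assumed data, for illustration only).
n = 2048
wl = [400.0 + 400.0 * i / (n - 1) for i in range(n)]
spec = [math.exp(-0.5 * ((w - 600.0) / 40.0) ** 2) for w in wl]

features = spectral_central_moments(wl, spec)
print(len(spec), "->", len(features))  # 2048 -> 3
```

The reduced values, rather than the full spectrum, would then serve as the inputs to the neural network, which is the sense in which the cited passage speaks of reducing the amount of data to be used.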

Claim(s) 26:
	Regarding claim 26, the U.S. Patent No. US10732607B2, Gnojewski, and Sahu disclose all the elements of claims 25 and 21-22 of the instant application.
	Regarding claim 26, Gnojewski further discloses, “training the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2, Gnojewski, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 25.
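For illustration only, the back-propagation training loop with the three exit conditions quoted from Gnojewski ¶33 above (maximum number of iterations, cost below a limit, weight changes below a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the function name, network size, tanh hidden nodes, learning rate, weight-change threshold, and training data are all hypothetical; only the example limits of 100 iterations and 0.001 come from the quoted passage.

```python
import math
import random

def train_ann(samples, targets, n_hidden=3, lr=0.05,
              max_iters=100, cost_limit=1e-3, weight_tol=1e-6, seed=0):
    """Hypothetical one-hidden-layer network trained by back-propagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # w_ji: input node i -> hidden node j; w_kj: hidden node j -> output node.
    w_ji = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_hidden)]
    w_kj = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    history = []
    reason = "max_iterations"          # exit condition 1 (default)
    for _ in range(max_iters):
        grad_ji = [[0.0] * n_in for _ in range(n_hidden)]
        grad_kj = [0.0] * n_hidden
        cost = 0.0
        for x, y in zip(samples, targets):
            # Feed-forward pass.
            hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                      for row in w_ji]
            out = sum(w * h for w, h in zip(w_kj, hidden))
            err = out - y
            cost += 0.5 * err * err
            # Back-propagate the output error to both weight layers.
            for j in range(n_hidden):
                grad_kj[j] += err * hidden[j]
                delta_j = err * w_kj[j] * (1.0 - hidden[j] ** 2)
                for i in range(n_in):
                    grad_ji[j][i] += delta_j * x[i]
        cost /= len(samples)
        history.append(cost)
        if cost < cost_limit:          # exit condition 2
            reason = "cost_converged"
            break
        max_change = 0.0
        for j in range(n_hidden):
            step = lr * grad_kj[j] / len(samples)
            w_kj[j] -= step
            max_change = max(max_change, abs(step))
            for i in range(n_in):
                step = lr * grad_ji[j][i] / len(samples)
                w_ji[j][i] -= step
                max_change = max(max_change, abs(step))
        if max_change < weight_tol:    # exit condition 3
            reason = "weights_converged"
            break
    return reason, history

# Assumed training data: two reduced component values per sample, one target.
data_rng = random.Random(1)
samples = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
           for _ in range(20)]
targets = [0.5 * a - 0.3 * b for a, b in samples]
reason, history = train_ann(samples, targets)
print(reason, len(history))
```

Here the inputs stand in for the reduced component values and the targets stand in for the known characteristic values; whichever exit condition is met first ends training.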

Claims 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over the U.S. Patent No. US10732607B2, and further in view of Singh et al. (US6594024B1) [hereinafter Singh] and Sahu.
Claim(s) 27:
Regarding claim 27, claim 16 of the U.S. Patent No. US10732607B2 discloses, “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head; an in-situ optical monitoring system to measure a spectrum of light reflected from the substrate during polishing;” [See claim 16 of the U.S. Patent No. US10732607B2: “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head; an in-situ optical monitoring system to measure a spectrum of light reflected from the substrate during polishing; and”];
“a controller configured to receive, from then optical in-situ monitoring system, a measured spectrum of light reflected from the substrate undergoing processing,” [See claim 16 of the U.S. Patent No. US10732607B2: “a controller configured to receive, from then optical in-situ monitoring system, a measured spectrum of light reflected from the substrate undergoing processing,”];
“reduce a dimensionality of the measured spectrum to generate a plurality of component values” [See claim 16 of the U.S. Patent No. US10732607B2: “reduce a dimensionality of the measured spectrum to generate a plurality of component values,”];
“generate a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, and a plurality of hidden nodes connecting the input nodes to the output node,” [See claim 16 of the U.S. Patent No. US10732607B2: “generate a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes to receive the plurality of component values, an output node to output the characterizing value, and a plurality of hidden nodes connecting the input nodes to the output node, and”];
“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [See claim 16 of the U.S. Patent No. US10732607B2: “determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.”], but doesn’t explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
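Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing the dimensionality of the measured spectrum using a transformation based on a plurality of reference spectra taught by Sahu with the polishing system taught by the U.S. Patent No. US10732607B2. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].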

Claim(s) 28:
	Regarding claim 28, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 27 of the instant application.
	Regarding claim 28, claim 17 of the U.S. Patent No. US10732607B2 discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See claim 17 of the U.S. Patent No. US10732607B2: “perform feature extraction on a plurality of reference spectra to generate a plurality of components.”], but doesn’t explicitly disclose, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.”
However, Sahu discloses, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.” [See the transformation is based on extracting spectral features from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 27.
Claim(s) 29:
	Regarding claim 29, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 27 of the instant application.
	Regarding claim 29, Gnojewski further discloses, “train the artificial neural network by backpropagation using training data” “and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
	“training data generated by performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values,” [See dimensional reduction is performed in more than one data (e.g.; 2048 data points), where the data is known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Gnojewski with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to take advantage of the faster measuring technique with lower operational cost [Gnojewski: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶22)… “The end result may be a faster measuring technique with lower operational cost.” (¶25)].
Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over the U.S. Patent No. US10732607B2 and Sahu, and further in view of Yoshida (US20180264619A1) [hereinafter Yoshida].
Claim(s) 30:
	Regarding claim 30, the U.S. Patent No. US10732607B2 and Sahu disclose all the elements of claim 27 of the instant application.
	Regarding claim 30, Yoshida discloses, “modify a pressure in the carrier head based on the characterizing value.” [Examiner notes that applicant’s specification paragraph 25 describes characterizing value as “a thickness of the layer” as such Yoshida describes characterizing values as “film-thickness.” See the pressure of the head is modified/adjusted (e.g.; controlled/adjusted to meet target pressure values) based on characterizing value (e.g.; determining target pressure values based on film thickness): “The operation controller 7 is configured to set target pressure values for the pressure chambers P1 to P4, respectively, based on the film-thickness profile and the film-thickness distribution that have been produced from the film-thickness data, and operate the pressure regulators R1 to R4 so that the pressures in the pressure chambers P1 to P4 are maintained at the corresponding target pressure values.” (¶98)… “the polishing head including: a head body;” “and at least three pressure regulators configured to control pressures in the at least three actuating chambers” (¶23)].
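Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of adjusting pressure based on characterizing values by Yoshida with the polishing monitoring system taught by the U.S. Patent No. US10732607B2 and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to eliminate the variation in film thickness along a circumferential direction of a substrate [Yoshida: “can eliminate the variation in film thickness along a circumferential direction of a substrate” (¶26)].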

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 11-26 are rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski et al. (US20100094790A1) [hereinafter Gnojewski], and further in view of Sahu et al. (US20180100810A1) [hereinafter Sahu].
Claim(s) 11:
Regarding claim 11, Gnojewski discloses, “A computer program product for controlling processing of a substrate, the computer program product tangibly embodied in a non-transitory computer readable media and comprising instructions for causing a processor to:” [See the processor includes main memory, computer readable medium, and set of instructions, where the instructions are executed by the processor to perform functions for controlling processing of a substrate (i.e.; semiconductor processes): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes.” (¶19)… “The machine 1200 may be a server computer, a client computer, a personal computer (PC), a tablet PC,” “or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.” (¶46)… “The example machine 1200 may include a processor 1260 (e.g., a central processing unit (CPU),” “a main memory 1270 and a static memory 1280, all of which communicate with each other via a bus 1208.” “the machine 1200 also may include” “a disk drive unit 1240,” (¶47)… “The disk drive unit 1240 may include a machine-readable medium 1222 on which is stored one or more sets of instructions (e.g., software) 1224 embodying any one or more of the methodologies or functions described herein.” (¶48)];
“receive, from an in-situ optical monitoring system, a measured spectrum of light reflected from a substrate undergoing processing that modifies a thickness of an outer layer of the substrate;” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
 “reduce the dimensionality of the measured spectrum to generate a plurality of component values;” [See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data, such that a transformation is performed on the measured/collected data based on reference spectra: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generate a characterizing value using an artificial neural network,” [See using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value): “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)];
“the artificial neural network having a plurality of input nodes to receive the plurality of component values,” [See plurality of input nodes 652 as shown in figure 6 receiving input data 630 including plurality of component values as shown in figure 6, also see 710 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“an output node to output the characterizing value,” [See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“and a plurality of hidden nodes connecting the input nodes to the output node; and” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
	“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
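Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the teachings of Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].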

Claim(s) 12:
	Regarding claim 12, Gnojewski and Sahu disclose all the elements of claim 11.
	Regarding claim 12, Gnojewski further discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra”
However, Sahu discloses, “the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See the transformation is based on extracting spectral features extraction from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].


Claim(s) 13:
	Regarding claim 13, Gnojewski and Sahu disclose all the elements of claims 11-12.
	Regarding claim 13, Gnojewski further discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral” “components” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.”
However, Sahu discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.” [See the instructions to perform feature extraction on the reference spectra to generate plurality of spectral feature component (e.g.; spectral feature values): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of 

Claim(s) 14:
	Regarding claim 14, Gnojewski and Sahu disclose all the elements of claims 11-13.
	Regarding claim 14, Gnojewski further discloses, “wherein the instructions to perform feature extraction comprise instructions to perform” “autoencoding” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. Gnojewski discloses, autoencoding: See the autoencoding process, where an artificial neural network performs efficient data encoding automatically: “at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750. The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples. In other words, the trained ANN 750 may simulate the reference metrology tool.” (¶39)… “the trained ANN 750 measurement results may also be free from human and system errors involved in typical reference tool measurements.” (¶40)… “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis,”
However, Sahu discloses, “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis,” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. See the instructions to perform feature extraction comprise principal component analysis (e.g.; principal component analysis) and singular value decomposition (e.g.; orthogonal decomposition): “a combination of various data reduction and analysis techniques like principal component analysis, orthogonal decomposition, spectral mixture resolution, etc. could be used as a first step to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
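For illustration only, the principal component analysis cited from Sahu (¶32) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of either reference: the function names `fit_pca_basis` and `reduce_spectrum`, the synthetic reference spectra, and the choice of two components are all hypothetical. The sketch obtains the principal directions from a singular value decomposition of the mean-centered reference spectra, then projects a measured spectrum onto those directions to produce a small set of component values.

```python
import numpy as np


def fit_pca_basis(reference_spectra, n_components):
    """Return (mean, components) computed from reference spectra (one spectrum per row)."""
    mean = reference_spectra.mean(axis=0)
    centered = reference_spectra - mean
    # Rows of vt are orthonormal principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def reduce_spectrum(measured, mean, components):
    """Project one measured spectrum onto the principal directions."""
    return components @ (measured - mean)


# Hypothetical synthetic data: 50 reference spectra of 200 points each,
# built as noisy mixtures of two underlying shapes.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
shapes = np.stack([np.sin(2 * np.pi * axis), np.cos(2 * np.pi * axis)])
weights = rng.normal(size=(50, 2))
reference_spectra = weights @ shapes + 0.01 * rng.normal(size=(50, 200))

mean, components = fit_pca_basis(reference_spectra, n_components=2)

# A "measured" spectrum of 200 points is reduced to 2 component values.
measured = 0.5 * shapes[0] - 0.3 * shapes[1]
component_values = reduce_spectrum(measured, mean, components)
reconstructed = mean + component_values @ components
```

In this sketch the component values, rather than the full 200-point spectrum, would be the quantities supplied to the input nodes of a neural network.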


Claim(s) 15:
	Regarding claim 15, Gnojewski and Sahu disclose all the elements of claim 11.
	Regarding claim 15, Gnojewski further discloses, “instructions to perform dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See dimensional reduction is performed in more than one data (e.g.; 2048 data points), where the data is known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].

Claim(s) 16:
	Regarding claim 16, Gnojewski and Sahu disclose all the elements of claims 11 and 15.
	Regarding claim 16, Gnojewski further discloses, “instructions to train the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
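For illustration only, the back-propagation training and exit conditions quoted from Gnojewski (¶33) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the single hidden layer with tanh activation, the mean-squared-error cost, plain gradient descent, the learning rate, the function names, and the synthetic training data are all hypothetical. Only the weight names (Wji, Wkj) and the three exit conditions (a maximum of 100 iterations, cost below 0.001, or weight changes below a threshold) follow the quoted passages.

```python
import numpy as np


def forward(inputs, w_ji, w_kj):
    """Feed-forward pass: inputs -> hidden nodes (tanh) -> linear output node."""
    hidden = np.tanh(inputs @ w_ji.T)
    return hidden, hidden @ w_kj.T


def cost_of(outputs, targets):
    """Half mean-squared error between network outputs and known target values."""
    return 0.5 * float(np.mean((outputs - targets) ** 2))


def train_ann(inputs, targets, n_hidden=4, learning_rate=0.05,
              max_iterations=100, cost_limit=1e-3,
              min_weight_change=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = inputs.shape
    w_ji = rng.normal(scale=0.5, size=(n_hidden, n_inputs))  # hidden j <- input i
    w_kj = rng.normal(scale=0.5, size=(1, n_hidden))         # output k <- hidden j
    reason = "maximum iterations reached"
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        hidden, outputs = forward(inputs, w_ji, w_kj)
        if cost_of(outputs, targets) < cost_limit:
            reason = "cost below limit"
            break
        # Back-propagate the output error to both weight sets.
        error = outputs - targets
        grad_kj = error.T @ hidden / n_samples
        delta_hidden = (error @ w_kj) * (1.0 - hidden ** 2)
        grad_ji = delta_hidden.T @ inputs / n_samples
        step_kj = learning_rate * grad_kj
        step_ji = learning_rate * grad_ji
        w_kj -= step_kj
        w_ji -= step_ji
        if max(np.abs(step_kj).max(), np.abs(step_ji).max()) < min_weight_change:
            reason = "weight change below threshold"
            break
    final_cost = cost_of(forward(inputs, w_ji, w_kj)[1], targets)
    return {"w_ji": w_ji, "w_kj": w_kj, "cost": final_cost,
            "iterations": iteration, "reason": reason}


# Hypothetical training data: 40 samples of 3 reduced inputs with known targets.
rng = np.random.default_rng(1)
train_inputs = rng.normal(size=(40, 3))
train_targets = 0.5 * train_inputs[:, :1] - 0.2 * train_inputs[:, 1:2]

untrained = train_ann(train_inputs, train_targets, max_iterations=0)
result = train_ann(train_inputs, train_targets)
```

The returned `reason` field records which of the three exit conditions ended training.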

Claim(s) 17:
	Regarding claim 17, Gnojewski and Sahu disclose all the elements of claims 11 and 15-16.
	Regarding claim 17, Gnojewski further discloses, “the two or more spectra includes fewer spectra than the plurality of spectra.” [See the spectra 850, 860, and 870 as shown in figure 8, where two or more spectra are less than the rest of the spectra (e.g.; spectra having wavelength less than 400 and less than 300 such that they are smaller than the rest of the spectra that have wavelength more than 400): “The reflectometer 130 measurements may be made at certain points on the sample. The result of the reflectometer 130 measurements may include SIRs such as the spectra shown in FIG. 8 (see spectra 850, 860, and 870). A SIR is an intensity versus wavelength function that defines an intensity at a certain wavelength. The intensity may be provided as a reflection coefficient at the certain wavelength.” (¶21)].

Claim(s) 18:
	Regarding claim 18, Gnojewski and Sahu disclose all the elements of claim 11.
Regarding claim 18, Gnojewski further discloses, “wherein neural network comprises at least one input node configured to receive at least one of a prior measurement of the substrate, a measurement of a prior substrate, a measurement from another sensor in the processing system, a measurement from a sensor that is outside the processing system, a value from a processing recipe stored by the controller, or a value of a variable tracked by the controller.” [Examiner notes that the claim requires receiving only one of a prior measurement of the substrate, a measurement of a prior substrate, a measurement from another sensor in the processing system, a measurement from a sensor that is outside the processing system, a value from a processing recipe stored by the controller, or a value of a variable tracked by the controller. Gnojewski discloses, receiving a value of a variable tracked by the controller: See the controller collects data related to the variable spectrum using the optical monitoring system such as the reflectometer 130, where 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer), and the data is input to the artificial neural network at input nodes 652: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” (¶19)… “The reflectometer 130 measurements may be made at certain points on the sample. The result of the reflectometer 130 measurements may include SIRs such as the spectra shown in FIG. 8 (see spectra 850, 860, and 870).” (¶21)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)… “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)].

Claim(s) 19:
	Regarding claim 19, Gnojewski and Sahu disclose all the elements of claim 11.
Regarding claim 19, Gnojewski further discloses, “instructions to cause a chemical mechanical polishing system to polish the substrate,” [See the processing include chemical mechanical polishing (i.e.; CMP) to polish a substrate: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “Post-polish cleans may be used as part of the CMP technique to remove the slurry from the surface of the sample after removal of a silicon dioxide layer, before subjecting the sample to reflectometry or other metrology measurements” (¶20)].
“instructions” “to receive the measured spectrum of light from the in-situ optical monitoring system during polishing of the substrate.” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)].

Claim(s) 20:
	Regarding claim 20, Gnojewski and Sahu disclose all the elements of claims 11 and 19.
Regarding claim 20, Gnojewski further discloses, “the instructions to determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter comprise instructions to modify a pressure in a carrier head of the chemical mechanical polishing system.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining” “endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)].

Claim(s) 21:
Regarding claim 21, Gnojewski discloses, “A method of processing a substrate, comprising: subjecting a substrate to processing that modifies a thickness of an outer layer of the substrate;” [See the method for controlling processing of a substrate (i.e.; semiconductor processes): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes.” (¶19)… “An artificial neural network may be trained to generate an output, using the spectral central moments as inputs. The output may include simulated metrological data associated with the sample. The term metrology shall be taken to include theoretical and practical aspects of measurement (e.g., measurement of semiconductor samples).” (¶17)… “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)];
[See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
“reducing a dimensionality of the measured spectrum to generate a plurality of component values” [See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generating a characterizing value using an artificial neural network,” [See using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value): “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)];
“the artificial neural network having a plurality of input nodes to receive the plurality of component values,” [See plurality of input nodes 652 as shown in figure 6 receiving input data 630 including plurality of component values as shown in figure 6: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“an output node to output the characterizing value,” [See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“a plurality of hidden nodes connecting the input nodes to the output node;” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“ The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
“determining at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the method taught by Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].
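
For illustration only, the limitation at issue (reducing a measured spectrum to a small set of component values using a transformation derived from reference spectra) can be sketched as a projection onto an orthonormal basis built from the reference spectra. This is a minimal sketch under stated assumptions, not the method of Gnojewski, Sahu, or the applicant: Gram-Schmidt is used here as one possible form of "orthogonal decomposition", and the function names and the sample data are hypothetical.

```python
import math

def orthonormal_basis(reference_spectra, tol=1e-12):
    """Gram-Schmidt: build orthonormal vectors spanning the reference spectra.

    Hypothetical helper; stands in for 'a transformation based on a
    plurality of reference spectra'.
    """
    basis = []
    for spectrum in reference_spectra:
        residual = list(spectrum)
        for b in basis:
            coeff = sum(r * x for r, x in zip(residual, b))
            residual = [r - coeff * x for r, x in zip(residual, b)]
        norm = math.sqrt(sum(r * r for r in residual))
        if norm > tol:  # skip references that add no new direction
            basis.append([r / norm for r in residual])
    return basis

def reduce_spectrum(measured, basis):
    """Project a measured spectrum onto the basis: one component value per basis vector."""
    return [sum(m * x for m, x in zip(measured, b)) for b in basis]

# Made-up data: two 6-point reference spectra and one measured spectrum.
reference_spectra = [
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
]
basis = orthonormal_basis(reference_spectra)
measured = [2.0, 3.0, 2.0, 3.0, 2.0, 3.0]
components = reduce_spectrum(measured, basis)  # 6 intensity values reduced to 2 component values
```

In this sketch the component values, rather than the raw spectrum, would be what is applied to the input nodes of the neural network.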

Claim(s) 22:
	Regarding claim 22, Gnojewski and Sahu disclose all the elements of claim 21.
Regarding claim 22, Gnojewski further discloses, “the processing comprises chemical mechanical polishing.” [See the processing include chemical mechanical polishing (i.e.; CMP): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “Post-polish cleans may be used as part of the CMP technique to remove the slurry from the surface of the sample after removal of a silicon dioxide layer, before subjecting the sample to reflectometry or other metrology measurements” (¶20)].

Claim(s) 23:
	Regarding claim 23, Gnojewski and Sahu disclose all the elements of claim 21.
	Regarding claim 23, Gnojewski further discloses, “performing feature extraction on the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “performing feature extraction on the plurality of reference spectra to generate the transformation.”
However, Sahu discloses, “performing feature extraction on the plurality of reference spectra to generate the transformation.” [See the instructions to perform feature extraction on the reference spectra to generate a plurality of spectral features: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the method taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 21.


Claim(s) 24:
	Regarding claim 24, Gnojewski and Sahu disclose all the elements of claims 23 and 21.
	Regarding claim 24, Gnojewski further discloses, “wherein the feature extraction comprises” “autoencoding” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. Gnojewski discloses, autoencoding: See the autoencoding process, where an artificial neural network performs efficient data encoding automatically: “at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750. The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples. In other words, the trained ANN 750 may simulate the reference metrology tool.” (¶39)… “the trained ANN 750 measurement results may also be free from human and system errors involved in typical reference tool measurements.” (¶40)… “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,”
However, Sahu discloses, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. See the instructions to perform feature extraction comprise principal component analysis (e.g.; principal component analysis) and singular value decomposition (e.g.; orthogonal decomposition): “a combination of various data reduction and analysis techniques like principal component analysis, orthogonal decomposition, spectral mixture resolution, etc. could be used as a first step to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the method taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 21.



Claim(s) 25:
	Regarding claim 25, Gnojewski and Sahu disclose all the elements of claims 21-22.
	Regarding claim 25, Gnojewski further discloses, “performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See dimensional reduction is performed in more than one data (e.g.; 2048 data points), where the data is known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. 
Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
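
For illustration only, the compression of a spectral intensity response into a few moments can be sketched as follows. The quoted passages describe the spectral central moments only as "certain weighted averages" of the data points, so the intensity-weighted definition below is an assumption, and the sketch does not attempt to reproduce how the reference arrives at eight moments per polarization. The function name and the sample data are hypothetical.

```python
def spectral_central_moments(wavelengths, intensities, max_order=3):
    """Return [weighted mean, 2nd central moment, ..., max_order-th central moment].

    Assumed definition: the intensities act as weights over the wavelength axis.
    """
    total = sum(intensities)
    mean = sum(w * s for w, s in zip(wavelengths, intensities)) / total
    moments = [mean]
    for order in range(2, max_order + 1):
        moment = sum(s * (w - mean) ** order
                     for w, s in zip(wavelengths, intensities)) / total
        moments.append(moment)
    return moments

# Made-up symmetric 3-point spectrum (wavelengths in nm).
wavelengths = [400.0, 500.0, 600.0]
intensities = [1.0, 2.0, 1.0]
moments = spectral_central_moments(wavelengths, intensities)
# → [500.0, 5000.0, 0.0]
```

However many data points the spectrum holds, the output length depends only on `max_order`, which is the sense in which the moments act as a compressed form of the data.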

Claim(s) 26:
	Regarding claim 26, Gnojewski and Sahu disclose all the elements of claims 25 and 21-22.
	Regarding claim 26, Gnojewski further discloses, “training the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. 
The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
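
For illustration only, the training loop described in the quoted ¶33 (iterating until a maximum iteration count, a cost limit, or a minimum weight change is reached) can be sketched as below. This is a minimal sketch under stated assumptions, not the implementation of Gnojewski or the applicant: the network has a single hidden layer with tanh activation and a linear output, batch gradient descent is used, and the data, learning rate, and function name are hypothetical. The names `W_ji` and `W_kj` follow the weighting functions named in the quoted ¶24, and the limits 100 and 0.001 are the examples given in ¶33.

```python
import math
import random

def train(samples, targets, n_hidden=3, lr=0.05, max_iterations=100,
          cost_limit=1e-3, min_weight_change=1e-9, seed=0):
    """Train a one-hidden-layer network by backpropagation (batch gradient descent)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    W_ji = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W_kj = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    history, reason = [], "max_iterations"
    for _ in range(max_iterations):
        grad_ji = [[0.0] * n_in for _ in range(n_hidden)]
        grad_kj = [0.0] * n_hidden
        cost = 0.0
        for x, t in zip(samples, targets):
            # Feed-forward pass.
            hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_ji]
            y = sum(w * h for w, h in zip(W_kj, hidden))
            err = y - t
            cost += 0.5 * err * err
            # Back-propagate the output error to both weight layers.
            for j in range(n_hidden):
                grad_kj[j] += err * hidden[j]
                delta_j = err * W_kj[j] * (1.0 - hidden[j] ** 2)
                for i in range(n_in):
                    grad_ji[j][i] += delta_j * x[i]
        history.append(cost)
        if cost < cost_limit:  # exit: cost below the predefined limit
            reason = "cost_limit"
            break
        largest_change = 0.0
        for j in range(n_hidden):
            step = lr * grad_kj[j]
            W_kj[j] -= step
            largest_change = max(largest_change, abs(step))
            for i in range(n_in):
                step = lr * grad_ji[j][i]
                W_ji[j][i] -= step
                largest_change = max(largest_change, abs(step))
        if largest_change < min_weight_change:  # exit: weights no longer changing
            reason = "weight_change"
            break
    return W_ji, W_kj, history, reason

# Made-up training data: inputs stand in for component values, targets for known thicknesses.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.0, 0.5, 0.5, 1.0]
W_ji, W_kj, history, reason = train(samples, targets)
```

The returned `reason` records which of the three exit conditions ended training, and `history` holds the cost evaluated at each iteration.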




Claims 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski in view of Singh et al. (US6594024B1) [hereinafter Singh] and Sahu.
Claim(s) 27:
Regarding claim 27, Gnojewski discloses, “an in-situ optical monitoring system to measure a spectrum of light reflected from the substrate” “a controller configured to receive, from the optical in-situ monitoring system, a measured spectrum of light reflected from the substrate undergoing processing,” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
[See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generate a characterizing value using an artificial neural network,” [See using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value): “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)];
“the artificial neural network having a plurality of input nodes to receive the plurality of component values,” [See plurality of input nodes 652 as shown in figure 6 receiving input data 630 including plurality of component values as shown in figure 6, also see 710 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“an output node to output the characterizing value,” [See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“a plurality of hidden nodes connecting the input nodes to the output node,” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“ The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head;” measure a spectrum of light reflected from the substrate “during polishing;” “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,”
However, Singh discloses, “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head;” [See “a CMP driving system 170 which operate cooperatively in order to control a CMP device 171” “The CMP driving system 170 selectively controls the CMP device 171. The CMP device 171 performs the polishing process and includes one or more CMP process components such as, for example, a spindle 173, a polishing pad 175, an optical wave guide 177 (optical fiber), and/or a polishing liquid 179.” (col 8, lines 1-12)… “The CMP device 171 components such as, for example, the spindle 173, the polishing pad 175 and the optical wave guide 177 are positioned to begin the polishing process.” (col 11, lines 3-6)];
measure a spectrum of light reflected from the substrate “during polishing;” [See the measurement of the spectrum of the light is performed during the polishing process: “the system 100 being employed to monitor a chemical mechanical polishing process” “the wafer 110 (top layer 115 and substrate 113) is undergoing a CMP process;” “The CMP device 171 components such as, for example, the spindle 173, the polishing pad 175 and the optical wave guide 177 are positioned to begin the polishing process.” “during the polishing process, the CMP monitoring system 150 may be employed. The target light source 185 projects one or more beams of light 205 onto the layer 115 (top surface of wafer 110). Reflected light 210 is detected by the one or more light detectors 187 and collected by the CMP monitoring system 150 according to scatterometry techniques.” (col 10, lines 64-67 and col 11, lines 3-13)], but doesn’t explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the polishing system, having a support that holds a polishing pad, a carrier head that holds the substrate in contact with the polishing pad, and a motor that generates relative motion for polishing, and the capability of measuring the spectrum of the light during the polishing process taught by Singh, and the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu, with the system taught by Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)], and in order to effectively optimize the on-going polishing process [Singh: “Monitoring the subject wafer as it proceeds through the polishing process effectively allows one to visualize the wafer's appearance and progress in order to effectively optimize the on-going polishing process.” (col 4, lines 58-62)].

Claim(s) 28:
	Regarding claim 28, Gnojewski, Singh, and Sahu disclose all the elements of claim 27.
	Regarding claim 28, Gnojewski further discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.”
However, Sahu discloses, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.” [See the transformation is based on spectral features extracted from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by Gnojewski, Singh, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 27.

Claim(s) 29:
	Regarding claim 29, Gnojewski, Singh, and Sahu disclose all the elements of claim 27.
	Regarding claim 29, Gnojewski further discloses, “train the artificial neural network by backpropagation using training data” “and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. 
The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
	“training data generated by performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values,” [See dimensional reduction is performed in more than one data (e.g.; 2048 data points), where the data is known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. 
Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
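For illustration only, the compression of a spectrum into a few moment values (¶21-22) can be sketched as below. The sketch assumes the textbook definition of central moments, with the normalized intensities used as weights over wavelength. The exact definition used by Gnojewski, including how eight moments per polarization are obtained, is not reproduced in this action, so the function name `spectral_central_moments`, its parameters, and the sample values are hypothetical.

```python
def spectral_central_moments(intensities, wavelengths, max_order=3):
    """Compress a spectrum into its weighted mean plus central moments.

    Assumption: intensities are non-negative and act as weights over
    wavelength. Returns [mean, 2nd central moment, ..., max_order-th].
    """
    if len(intensities) != len(wavelengths):
        raise ValueError("intensities and wavelengths must have equal length")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    weights = [value / total for value in intensities]
    mean = sum(w * x for w, x in zip(weights, wavelengths))
    moments = [mean]
    for order in range(2, max_order + 1):
        moments.append(
            sum(w * (x - mean) ** order for w, x in zip(weights, wavelengths))
        )
    return moments


# A symmetric three-point spectrum: mean 500, variance 5000, zero skew term.
small = spectral_central_moments([1.0, 2.0, 1.0], [400.0, 500.0, 600.0])

# A 2048-point spectrum reduces to three numbers under this definition.
wide = spectral_central_moments([1.0] * 2048, [float(i) for i in range(2048)])
```

Under this assumed definition, the moment values rather than the raw data points would serve as the neural network inputs, which is the sense in which the cited passage calls the moments “a compressed form of the SIR data.”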
Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski, Singh, and Sahu, and further in view of Yoshida (US20180264619A1) [hereinafter Yoshida].
Claim(s) 30:
	Regarding claim 30, Gnojewski, Singh, and Sahu disclose all the elements of claim 27, but they do not explicitly disclose, “modify a pressure in the carrier head based on the characterizing value.”
	Regarding claim 30, Yoshida discloses, “modify a pressure in the carrier head based on the characterizing value.” [Examiner notes that paragraph 25 of applicant’s specification describes the characterizing value as “a thickness of the layer”; Yoshida likewise describes the characterizing value as “film-thickness.” See that the pressure of the head is modified/adjusted (e.g., controlled/adjusted to meet target pressure values) based on the characterizing value (e.g., determining target pressure values based on film thickness): “The operation controller 7 is configured to set target pressure values for the pressure chambers P1 to P4, respectively, based on the film-thickness profile and the film-thickness distribution that have been produced from the film-thickness data, and operate the pressure regulators R1 to R4 so that the pressures in the pressure chambers P1 to P4 are maintained at the corresponding target pressure values.” (¶98)… “the polishing head including: a head body;” “and at least three pressure regulators configured to control pressures in the at least three actuating chambers” (¶23)].

[Yoshida: “can eliminate the variation in film thickness along a circumferential direction of a substrate” (¶26)].
Conclusion
	The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed in the PTO-892 Notice of References Cited.
	
US20130129256A1 - Spectral image dimensionality reduction system and method:
	A method for reducing dimensionality of hyperspectral image data having a number of spatial pixels, each associated with a number of spectral dimensions, includes receiving sets of coefficients associated with each pixel of the hyperspectral image data, a set of basis vectors utilized to generate the sets of coefficients, and a maximum error value (¶8).

US20090275265A1 - Endpoint detection in chemical mechanical polishing using multiple spectra:
	A computer implemented method includes obtaining at least one current spectrum with an in-situ optical monitoring system, comparing the current spectrum to a 

US20100094790A1 - Machine learning of dimensions using spectral intensity response of a reflectometer:
	Methods and systems for determining critical dimensions using an artificial neural network, where the artificial neural network is trained based on spectral intensity response of a reflectometer (¶16).

US20070214098A1 - Neural network for determining the endpoint in a process:
	A system and method for determining an endpoint of an etching process by utilizing a neural network. By learning the features of a group of endpoint curves containing normal and abnormal features for an etch process, the neural network may determine the endpoint of the etch process through pattern recognition (¶14).

US20070042675A1 - Spectrum based endpointing for chemical mechanical polishing:
	A computer-implemented method that includes selecting a reference spectrum. The reference spectrum is a spectrum of white light reflected from a film of interest that is on a first substrate and that has a thickness greater than a target thickness. The method includes obtaining a current spectrum. The current spectrum is a spectrum of white light reflected from a film of interest that is on a second substrate and that has a 


	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED SHAFAYET whose telephone number is (571)272-8239.  The examiner can normally be reached on M-F 8:30 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth M Lo, can be reached at (571)272-9774.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 


/M.S./
Examiner
Art Unit 2116



/KENNETH M LO/
Supervisory Patent Examiner, Art Unit 2116