DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 02/23/2018, 03/16/2018, and 03/29/2019 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Amendments
This office action is in response to arguments filed on 10/28/2020. As per applicant’s request, claims 1, 5, 7, 11, 13-14, and 16 have been amended. No claims have been cancelled and no new claims have been added. Claims 1-20 remain pending.
Response to Arguments
Applicant's arguments filed 10/28/2020 have been fully considered but they are not persuasive.

On page 5 of remarks, applicant argues the following: “Amended claims 1 and 16 recite that “the fixed parameter represents dimensions of a measured structure.” This is not taught or suggested by Izikson or Pandev. The Office Action refers to Izikson’s See Office Action at p. 6. Izikson teaches that the weights are used to show relationships. See Izikson at col. 8, lines 43-45. Izikson is silent about using measurements or dimensions in this manner. Pandev fails to provide this missing teaching of Izikson.”

Examiner's response:
	The examiner respectfully disagrees. The prior art mapping has been changed, as necessitated by the applicant's amendments, in order to teach the claims. Izikson discloses a neural network that takes as input measured parameters of a wafer (as the critical parameters) in order to predict the critical dimensions of the wafer (as the fixed parameters that represent dimensions of a measured structure). The rejection is therefore maintained.

On page 6 of remarks, applicant argues the following: “Furthermore, claim 15 recites features of “the gradient-based searching.” Applicant respectfully submits that not all these features are taught or suggested by Izikson or Pandev. Claim 15 recites “iv) determining the fixed parameters of the critical parameters from the gradient-based search with one iteration” and “v) updating the fixed parameters with the fixed parameters using the gradient-based search with one iteration.” However, the Office Action refers to step 826 in Izikson for this teaching. See Office Action at p. 10. Izikson learns “[t]hrough many iterations and adjustments.” See Izikson at col. 10, lines 18-20. Thus, any update is after many iterations and not one iteration as recited in Applicant’s claim 15.”


Examiner's response:
	The examiner respectfully disagrees. Although it is true that Izikson learns through many iterations and adjustments, each iteration in Izikson results in updating of the fixed parameters. Izikson discloses in Fig. 8 and Col. 9, lines 1-67 that backpropagation is used to update the neural network and that if the error exceeds a certain threshold in block 828, the training process will begin anew from block 820 (i.e., a new iteration). Block 826 discloses that weights are adjusted, and with each adjustment, new values for the hidden nodes (820) and the output (822) are produced. Therefore, each iteration results in an updated output using the updated (adjusted) weights and updated hidden nodes.
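For illustration only, the iterative behavior relied upon above can be sketched as follows. This is a minimal toy example under stated assumptions and is not a reproduction of Izikson's implementation: the network size, initial weights, learning rate, threshold, and the function name `train_until_threshold` are all hypothetical. The comments map each step to the Fig. 8 block numbers as the examiner reads them. The sketch shows that every pass through the loop adjusts the weights and therefore produces a new output value.

```python
import math

def train_until_threshold(inputs, target, threshold=1e-3, lr=0.5, max_iters=10000):
    """Toy network (two inputs, one hidden node, one output) trained by
    backpropagation. Returns the output produced at every iteration.
    All numeric values here are hypothetical and chosen for illustration."""
    w_hidden = [0.2, 0.1]   # assumed initial input-to-hidden weights
    w_out = 0.3             # assumed initial hidden-to-output weight
    outputs = []
    for _ in range(max_iters):
        # Hidden node value from inputs and current weights (cf. block 820).
        z = sum(w * x for w, x in zip(w_hidden, inputs))
        h = math.tanh(z)
        # Output value from hidden node value (cf. block 822).
        y = w_out * h
        outputs.append(y)
        # Compare error against a threshold (cf. block 828).
        err = y - target
        if 0.5 * err * err <= threshold:
            break
        # Adjust weights by backpropagation (cf. block 826).
        grad_out = err * h
        grad_hidden = [err * w_out * (1.0 - h * h) * x for x in inputs]
        w_out -= lr * grad_out
        w_hidden = [w - lr * g for w, g in zip(w_hidden, grad_hidden)]
    return outputs

outputs = train_until_threshold([1.0, 0.5], target=0.4)
```

In this sketch, `outputs[1]` already differs from `outputs[0]`, i.e., a single iteration of weight adjustment yields an updated output, while the loop as a whole continues until the error criterion is satisfied.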

Applicant's arguments with regard to the 35 U.S.C. 103 rejections of the dependent claims have been fully considered but are unpersuasive as they rely upon the allowability of the independent claims.

Applicant's arguments with regard to the 35 U.S.C. 112(b) rejections of claims 4-11 and 13-15 have been fully considered and are persuasive. The 112(b) rejections have been withdrawn.
	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3 and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Izikson (US 7,873,585 B2), in view of Pandev (US 2016/0282105 A1).

Regarding Claim 1,
	Izikson teaches a method comprising: predicting, using a processor, (Col. 2, lines 59-67 and Col. 3, lines 1-2, disclose using a processor to perform the operations) a value of a fixed parameter with a neural network based on a value of a critical parameter of a semiconductor wafer, (Col. 2, lines 1-20, discloses predicting critical dimension values (as the fixed parameters) using inputs that comprise known parameter values measured from a plurality of targets at measurement locations of a wafer (as the critical parameters; Fig. 8 and Col. 9, lines 1-27 disclose the inputs) using a neural network) wherein the neural network is trained based on one or more of the critical parameters (Col. 2, lines 1-36, discloses that the neural network is trained based on the measured known parameter values of the wafer)…and wherein the fixed parameter represents dimensions of a measured structure. (Col. 2, lines 1-20, critical dimensions (the fixed parameters) are dimensions of the wafer (measured structure) that are predicted from known parameters measured from specific locations on the wafer.)
	Izikson does not explicitly teach training a neural network based on…a low-dimensional real-valued vector associated with a spectrum.
	However, Pandev teaches, in an analogous system, training a neural network based on…a low-dimensional real-valued vector associated with a spectrum. 
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural network, as taught by Izikson, to include the training of the neural network using the vector data associated with a spectrum, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])
Regarding Claim 2,
	Izikson/Pandev teach the method of claim 1(and thus the rejection of claim 1 is incorporated herein.).
	Pandev further teaches wherein the spectrum is a spectroscopic ellipsometry spectrum… ([0032], Fig. 1, spectroscopic ellipsometer is equipped with illuminator 102 and a spectrometer 104, and therefore the spectrum must be a spectroscopic ellipsometry spectrum).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural network, as taught by Izikson/Pandev, to further include the spectroscopic ellipsometry spectrum, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to measure characteristics of semiconductor wafers and allow for spectral analysis of radiation. (Pandev, [0032])

Regarding Claim 3,
	Izikson/Pandev teach the method of claim 1(and thus the rejection of claim 1 is incorporated herein.).
	Pandev further teaches wherein the low-dimensional real-valued vector is mapped by another neural network based on the spectrum.([0057], discloses that the trained parameter isolation model 162 (the neural network) defines a mapping from raw measurement signals based on measurements of targets with a range of values of incidental parameters and one or more parameters of interest to reduced raw measurement signals.[0093] discloses that the data processed by the parameter isolation model is in vector form and can be two dimensional, one dimensional, or single point.)
	One of ordinary skill in the art would have been motivated to make the modification in order to measure characteristics of semiconductor wafers and allow for spectral analysis of radiation. (Pandev, [0032])

Regarding Claim 14,
	Izikson/Pandev teach the method of claim 1(and thus the rejection of claim 1 is incorporated herein.)
	Izikson further teaches gradient-based searching of the fixed parameters with the neural network. (Col. 9, lines 64-67, and Col. 10, lines 1-16 disclose training a 

Regarding Claim 15,
	Izikson/Pandev teach the method of claim 1(and thus the rejection of claim 1 is incorporated herein.) 
	Izikson further teaches wherein the gradient-based searching includes:
i)    setting nominal values of the fixed parameters;(Fig. 8, step 822, Col. 9, lines 1-67, and Col. 10, lines 1-26, discloses determining values of the output (i.e. nominal values of the fixed parameters) based on values of the hidden nodes and weights.)
 ii)    determining the critical parameters of the fixed parameters at the nominal values; (Fig. 8, Col. 9, lines 1-67, and Col. 10, lines 1-26, discloses that measured target metrics (808 and 812 as the fixed parameters) from an analyzed wafer (804) are input into the neural network to determine values for hidden nodes, and those values are used to determine the output (fixed parameters). Therefore, the critical parameters are already known (i.e. must have been determined) at the nominal values of the fixed parameters, as the measured inputs are used to calculate the output of the neural network.)
iii)    determining the fixed parameters of the critical parameters using the neural network; (Fig. 8, step 822, Col. 9, lines 1-67, and Col. 10, lines 1-26, discloses 
iv)    determining the fixed parameters of the critical parameters from the gradient-based search with one iteration; (Fig. 8, step 822, Col. 9, lines 1-67, and Col. 10, lines 1-26, step 826 reiterates until an error threshold is met, the hidden node values at 820 are determined based on inputs and the adjusted weights aii. The critical parameters are included in the inputs of process metrics as disclosed in Col. 7, lines 35-60, and therefore are determined after each iteration of weight adjustment (i.e. the adjusted nominal weight values using backpropagation are determined (step 826) and used in step 820 with the parameter inputs to determine hidden node values and use those values to determine output values in 822). Therefore, every iteration will include the adjusted weights (826) and inputs (critical parameters) used in the calculations (820) to calculate the output values (the fixed parameters in 822))
v)    updating the fixed parameters with the fixed parameters using the gradient-based search with one iteration; and (Fig. 8, step 826, Col. 9, lines 1-67, and Col. 10, lines 1-26, step 826 reiterates until an error threshold is met, the hidden node values at 820 are determined based on inputs and the adjusted weights aii, and then used to determine the values for the new output in 822. Since the adjusted weights are based on the error between a previously generated predicted output and measured output, the fixed parameters (outputs in block 822 generated from the adjusted weights) are updated with the fixed parameters (the error between the predicted output and 
vi)    repeating the steps i) through v) until a stopping criteria is achieved, wherein the stopping criteria is one of a specification…(Fig. 8, step 826, Col. 9, lines 64-67, and Col. 10, lines 1-26, backpropagation training continues until the specification error threshold in 828 is met.)
Regarding Claim 16,
	Izikson teaches a system comprising: a primary neural network in electronic communication with a wafer metrology tool, (Col. 9, lines 8-42, discloses the neural network receives inputs related to the photolithography tool in block 810, and therefore must be in electronic communication with the tool) wherein the primary neural network is configured to predict a value of a fixed parameter based on a value of a critical parameter of a semiconductor wafer (Col. 2, lines 1-20, discloses predicting critical dimension values (as the fixed parameters) using inputs that comprise known parameter values measured from a plurality of targets at measurement locations of a wafer (as the critical parameters; Fig. 8 and Col. 9, lines 1-27 disclose the inputs) using a neural network)…and wherein the fixed parameter represents dimensions of a measured structure. (Col. 2, lines 1-20, critical dimensions (the fixed parameters) are dimensions of the wafer (measured structure) that are predicted from known parameters measured from specific locations on the wafer.)
	Izikson does not explicitly teach predicting…based on a low-dimensional real valued vector derived from a spectrum of the semiconductor wafer.
	However, Pandev teaches, in an analogous system, predicting…based on a low-dimensional real valued vector derived from a spectrum of the semiconductor wafer. ([0073], discloses predicting corresponding parameter values. [0093], discloses training the parameter isolation model to make predictions based on vectors of data that can be two dimensional, one dimensional, or even single point data. [0057] discloses that the parameter isolation model is implemented as a neural network. [0032] - [0033] discloses that the received data is from spectrometer 104 used on a semiconductor wafer.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural network, as taught by Izikson, to include the training of the neural network to make predictions using the vector data associated with a spectrum of a wafer, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])


Regarding Claim 17,
	Izikson/Pandev teach the system of claim 16. (and thus the rejection of claim 16 is incorporated herein.)
	Pandev further teaches wherein the spectrum is a spectroscopic ellipsometry spectrum… ([0032], Fig. 1, spectroscopic ellipsometer is equipped with illuminator 102 and a spectrometer 104, and therefore the spectrum must be a spectroscopic ellipsometry spectrum).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural network, as taught by Izikson/Pandev, to further include the spectroscopic ellipsometry spectrum, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to measure characteristics of semiconductor wafers and allow for spectral analysis of radiation. (Pandev, [0032])

Claims 4-10 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Izikson (US 7,873,585 B2), in view of Pandev (US 2016/0282105 A1), in view of Socher (US 2017/0140240 A1).

Regarding Claim 4,
	Izikson/Pandev teach the method of claim 1 (and thus the rejection of claim 1 is incorporated herein). Izikson/Pandev further teach… and training, using the processor, the neural network to predict the fixed parameter based on one or more of the critical parameters and the low-dimensional real-valued vector. (see the rejection of claim 1)
	Pandev further teaches training, using the processor, an initial neural network by mapping one or more of the spectrum to one or more of the low-dimensional real-valued vectors; ([0093], discloses training the parameter isolation model (as the initial neural network) based on vectors of data that can be two dimensional, one dimensional, or even single point data. [0057] discloses that the parameter isolation model is implemented as a neural network. [0033] discloses that the received data is from spectrometer 104.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction system containing a neural network, as taught by Izikson/Pandev, to include the initial neural network, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])
	Although Izikson and Pandev disclose an initial neural network and the other neural network, they do not disclose a connection between the two wherein the real-valued vectors of the neural network are from the initial neural network.
	However, Socher teaches, in an analogous system …vectors of the neural network are from the initial neural network[s] (Fig. 1, [0023], discloses that the top output vectors from trained models 104 and 105 are applied as inputs to the trained model 112. [0023] also discloses that these trained models are neural networks.).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev, to include the initial neural network output vectors being used as inputs to the other neural network, as taught by Socher. One of ordinary skill in the art would have been motivated to make the modification in order to process the top vectors output from trained models and make a prediction using the top vectors. (Socher, [0023])
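For illustration only, the combined arrangement discussed for claim 4 (an initial neural network that maps a spectrum to a low-dimensional real-valued vector, whose output is supplied as an input to a second neural network together with the critical parameters) can be sketched as below. This is a hedged toy example and does not reproduce any of the cited references: the layer sizes, the fixed weights, the sample values, and the names `dense`, `initial_network`, and `primary_network` are all hypothetical.

```python
import math

def dense(x, weights, biases, activation=math.tanh):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def initial_network(spectrum):
    """Hypothetical encoder: maps a spectrum (any length) to a 2-element
    real-valued vector. The weights are arbitrary placeholders."""
    weights = [[0.1 * ((i + j) % 3 - 1) for j in range(len(spectrum))]
               for i in range(2)]
    return dense(spectrum, weights, [0.0, 0.0])

def primary_network(critical_params, low_dim_vector):
    """Hypothetical predictor: consumes the critical parameters together
    with the vector produced by the initial network and returns one
    predicted fixed-parameter value (linear output layer)."""
    features = list(critical_params) + list(low_dim_vector)
    weights = [[0.05 * (k + 1) for k in range(len(features))]]
    return dense(features, weights, [0.0], activation=lambda v: v)[0]

# Assumed sample data: an 8-point spectrum and two critical parameters.
spectrum = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
vector = initial_network(spectrum)            # low-dimensional vector
prediction = primary_network([1.2, 0.4], vector)
```

The only point of the sketch is the data flow: the second network never sees the raw spectrum, only the shorter vector emitted by the first network.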

Regarding Claim 5,
	Izikson/Pandev/Socher teach the method of claim 4. (and thus the rejection of claim 4 is incorporated herein.)
	Pandev further teaches generating the spectrum by simulating profiles sampled within a range of the floating parameters. ([0045] – [0049] discloses simulating measurement signals (generated spectrums of simulated profiles) wherein the measurement model is simulated with the known parameter values of interest. [0068] discloses simulating reference signals to include a range of values of the parameters of interest ([0010] discloses that parameters of interest are floated))
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include generated spectrum based on simulating profiles sampled within a range of floating parameters, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])

Regarding Claim 6,
	Izikson/Pandev/Socher teach the method of claim 5. (and thus the rejection of claim 5 is incorporated herein.)
	Pandev further teaches generating profiles of an optical critical dimension model associated with the spectrum. ([0086], discloses measurement techniques applied to determine profile geometry parameters that include critical dimensions. [0094] discloses that the exemplary techniques can be beam profile ellipsometry (as the profile of an optical critical dimension model associated with the spectrum))
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the profiles of optical critical dimensions, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide reduced signals as output, which leads to streamlined measurement processes and predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])

Regarding Claim 7,
	Izikson/Pandev/Socher teach the method of claim 6(and thus the rejection of claim 6 is incorporated herein.). Izikson/Pandev/Socher further teach generating the profiles of the optical critical dimension.
	Pandev further teaches determining a set of the fixed parameter by sampling the fixed parameters to within a range of the fixed parameters; and for each pair of spectra and fixed parameters, determining a corresponding critical parameter. ([0070] discloses that the computing system 130 generates reference raw signals synthetically from the same range of known parameter values (parameters which include critical dimension parameters) and corresponding nominal parameter values.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include generated corresponding critical parameters from spectra and fixed parameters, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide reduced signals as output, which leads to streamlined measurement processes and predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])

Regarding Claim 8,
	Izikson/Pandev/Socher teach the method of claim 6(and thus the rejection of claim 6 is incorporated herein.). 
	Pandev further teaches determining a corresponding floating parameter…with the corresponding critical parameter. ([0030], discloses isolating measurement signal information ([0033] discloses that measurement data is associated with specific measurements of specimen 112 (wafer)) associated with a specific parameter of interest (corresponding floating parameter, [0010] discloses that parameters of interest are floated))
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include determining corresponding critical and floating parameters, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide sufficiently accurate model-based measurement results with dramatically reduced computational effort. (Pandev, [0030])

Regarding Claim 9,
	Izikson/Pandev/Socher teach the method of claim 4. (and thus the rejection of claim 4 is incorporated herein.)
	Pandev further teaches wherein training data for the initial neural network includes a plurality of samples,([0013], discloses using raw measurement data collected from a number of measurement sites across the wafer for training of the parameter isolation module.) wherein each of the samples includes one or more of the spectrum([0033], measurement data includes one or more sampling processes from spectrometer 104) and profiles of an optical critical dimension model associated with one or more of the spectrum. ([0086], discloses measurement techniques applied to determine profile geometry parameters that include critical dimensions. [0094] discloses that the exemplary techniques can be beam profile ellipsometry (as the profile of an optical critical dimension model associated with the spectrum))
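	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the plurality of samples, one or more spectrum, and profiles of optical critical dimension models, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide sufficiently accurate model-based measurement results with dramatically reduced computational effort. (Pandev, [0030])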

Regarding Claim 10,
	Izikson/Pandev/Socher teach the method of claim 4. (and thus the rejection of claim 4 is incorporated herein.)
	Pandev further teaches wherein training the initial neural network determines a parameter that minimizes a difference in the low-dimensional real-valued vector for a same one of one or more of the spectrum. ([0043]-[0044], discloses a regression process that is employed to determine specimen parameter values that minimize the differences between the model output values and the experimentally measured values (measured values from the spectrometer, which include the vectors as disclosed in [0093]) for a fixed set of machine parameter values.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the parameter that minimizes a difference in the vector value for a spectrum, as further taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in Pandev, [0045])

Regarding Claim 18,
	Izikson/Pandev teach the system of claim 16 (and thus the rejection of claim 16 is incorporated herein.). Izikson/Pandev teach… wherein the primary neural network receives the low-dimensional real-valued vector.
	Pandev further teaches the secondary neural network is configured to receive the spectrum of the semiconductor wafer and to derive the low-dimensional real-valued vector based on the received spectrum… ([0093], discloses training the parameter isolation model to make predictions (as the initial neural network that receives the spectrum) based on vectors of data that can be two dimensional, one dimensional, or even single point data. [0056] - [0057] discloses that the parameter isolation model is implemented as a neural network that is used to map reference signals to reduced signals (i.e. derives reduced signals from received reference input signals). [0033] discloses that the received data is from spectrometer 104.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction system containing a neural network, as taught by Izikson/Pandev, to include the secondary neural network, as taught by Pandev. One of ordinary skill in the art would have been motivated to make the modification in order to provide predictive results that are improved along with a reduction in computation and user time. (Pandev, [0089])
	Although Izikson/Pandev disclose a secondary neural network and the primary neural network, they do not disclose a secondary neural network in electronic communication with the primary neural network…wherein the primary neural network receives the low-dimensional real-valued vector from the secondary neural network.
	However, Socher teaches, in an analogous prediction system, a secondary neural network in electronic communication with the primary neural network wherein the primary neural network receives the low-dimensional real-valued vector from the secondary neural network. (Fig. 1, [0023], discloses that the top output vectors from trained models 104 and 105 (secondary neural networks) are applied as inputs to the trained model 112. (primary neural network) [0023] also discloses that these trained models are neural networks.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev, to include the secondary neural network output vectors being used as inputs to the primary neural network, as taught by Socher. One of ordinary skill in the art would have been motivated to make the modification in order to process the top vectors output from trained models and make a prediction using the top vectors. (Socher, [0023])


Regarding Claim 19,
	Izikson/Pandev/Socher teach the system of claim 18. (and thus the rejection of claim 18 is incorporated herein.) 
	Izikson further teaches wherein the primary neural network includes a processor. (Col. 2, lines 59-67 and Col 3, lines 1-2, disclose using a processor to perform the operations)

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Izikson (US 7,873,585 B2), in view of Pandev (US 2016/0282105 A1), in view of Socher (US 2017/0140240 A1), in view of Lee et al. (US 2013/0158957 A1), herein referred to as Lee.


Regarding Claim 11,
	Izikson/Pandev/Socher teach the method of claim 4(and thus the rejection of claim 4 is incorporated herein.). Izikson/Pandev/Socher further teach the fixed parameters.
	Izikson/Pandev/Socher does not explicitly teach wherein the neural network minimizes a mean squared error of the fixed parameters relative to training data.
	However, Lee teaches wherein the neural network minimizes a mean squared error of the… [output parameter]… relative to training data. ([0037], discloses that given a set of training data in a neural network, the training may be viewed as solving an optimization problem for minimizing a mean squared error. See 
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the minimizing of a squared error of the outputs relative to training data, as taught by Lee. One of ordinary skill in the art would have been motivated to make the modification in order to very quickly determine corresponding spectra based on a given profile. (Lee, [0037])
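For illustration only, training viewed as an optimization problem that minimizes a mean squared error relative to training data can be sketched as follows. This is a minimal example under stated assumptions and is not Lee's implementation: a one-input linear model stands in for the neural network for brevity, and the training data, learning rate, step count, and function names are hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error between predictions and training targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def fit_linear(xs, ys, lr=0.05, steps=1000):
    """Gradient descent on the MSE of the model y = w * x + b.
    The linear model is a stand-in for a neural network (assumption)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        preds = [w * x + b for x in xs]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2.0 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2.0 * (p - y) for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Assumed training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

initial_error = mse([0.0 for _ in xs], ys)          # error before training
w, b = fit_linear(xs, ys)
final_error = mse([w * x + b for x in xs], ys)      # error after training
```

Here the fitted parameters approach the values used to generate the data, and `final_error` is far smaller than `initial_error`, which is the sense in which the output parameter's mean squared error relative to the training data is minimized.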

Claims 12 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Izikson (US 7,873,585 B2), in view of Pandev (US 2016/0282105 A1), in view of Socher (US 2017/0140240 A1), in view of Graepel et al. (US 2018/0032864 A1), herein referred to as Graepel.

Regarding Claim 12,
	Izikson/Pandev/Socher teach the method of claim 4(and thus the rejection of claim 4 is incorporated herein.).  
	Izikson/Pandev/Socher do not explicitly teach wherein the initial neural network and the neural network have different architectures.
	However, Graepel teaches, in an analogous prediction system, wherein the…first…neural network and the…second…neural network have different architectures. ([0036], discloses that the value neural network 160 has a different type of output layer from the SL neural network as the value neural network results in an output value being a single score.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the different architectures for each neural network, as taught by Graepel. One of ordinary skill in the art would have been motivated to make the modification in order to have neural networks that generate different kinds of outputs and that can be  trained to determine different values of the parameters of the neural network. (Graepel, [0035] – [0037])

Regarding Claim 20,
	Izikson/Pandev/Socher teach the system of claim 18(and thus the rejection of claim 18 is incorporated herein.).  
	Izikson/Pandev/Socher does not explicitly teach wherein the primary neural network and the secondary neural network have different architectures.
	However, Graepel teaches, in an analogous prediction system, wherein the…first…neural network and the…second…neural network have different architectures. ([0036], discloses that the value neural network 160 has a different type of output layer from the SL neural network as the value neural network results in an output value being a single score.)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural networks, as taught by Izikson/Pandev/Socher, to include the different architectures for each neural network, as taught by Graepel. One of ordinary skill in the art would have been motivated to make the modification in order to have neural networks that generate different kinds of outputs and that can be trained to determine different values of the parameters of the neural network. (Graepel, [0035] – [0037])


Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Izikson (US 7,873,585 B2), in view of Pandev (US 2016/0282105 A1), in view of Poslavsky et al. (US 9,347,872 B1), herein referred to as Poslavsky.

Regarding Claim 13,
	Izikson/Pandev teach the method of claim 1 (and thus the rejection of claim 1 is incorporated herein). Izikson/Pandev teach the fixed parameters and the spectrum. (see the rejection of claim 1)
	Izikson/Pandev does not explicitly teach averaging the fixed parameters over samples used to generate the spectrum.
	However, Poslavsky teaches, in an analogous system, averaging the fixed parameters over samples used to generate the spectrum. (Col. 8, lines 30-67, disclose that reference parameter values (as the fixed parameters) are an average of the parameter values determined based on measurements from each measurement system in the fleet (as the samples used to generate the spectrum, since the ellipsometer 101 is disclosed as a measurement system).)
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to modify the wafer parameter prediction neural network, as taught by Izikson/Pandev, to include the averaging of fixed parameters over samples used to generate the spectrum, as taught by Poslavsky. One of ordinary skill in the art would have been motivated to make the modification in order to have a more accurate measurement model that is used to estimate parameter values. (Poslavsky, Col. 8, lines 30-67)


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
	Hao et al. “Levenberg-Marquardt Training”. Discloses a Levenberg-Marquardt backpropagation training algorithm that is used to train a neural network throughout repeated training iterations.
	Dhandapani et al. (US 2019/0095797 A1). Discloses a method of semiconductor fabrication using a neural network to generate a set of control parameters that would achieve a target profile.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  


Any inquiry concerning this communication or earlier communications from the examiner should be directed to VASYL DYKYY whose telephone number is (571)270-5019.  The examiner can normally be reached on M-F 7:30 - 4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  






/V.D./Examiner, Art Unit 2122

/ERIC NILSSON/Primary Examiner, Art Unit 2122