DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 11-31 and 33 are pending.
Claim 32 is canceled.
Claims 11-31 and 33 are rejected.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 07/07/2022 has been entered.
Response to Amendment
This Office Action is responsive to the RCE filed on 07/07/2022.
Claims 11, 18, 21, 27, and 31 are amended. Accordingly, the amended claims are being fully considered by the examiner.
In response to the drawings submitted on 07/07/2022, the drawing objections as set forth in the previous office action have been withdrawn.
In response to the applicant’s amendments to claims 11, 18, 21, 27, and 31, and cancelation of claim 32, all the 35 U.S.C. 112(b) rejections as set forth in the previous office action are withdrawn. However, upon further consideration of the amended claims, new grounds of 35 U.S.C. 112 rejections have been introduced in the current office action.
Information Disclosure Statement
The information disclosure statements (IDS) dated 07/19/2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings filed on 07/07/2022 are acceptable for examination purposes.

Claim Rejections - 35 USC § 112
35 U.S.C. 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 11-31 and 33 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention.
Claim 11:
	Claim 11 recites “receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 7-8. 
	This limitation describes receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate”.
	The specification does not disclose receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate.”
	The specification does not provide any description of:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Regarding a), the specification describes only that other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details about “one or more values” of the one or more processing or environmental variables.
	Regarding b), the specification does not describe receiving one or more values that are values of the one or more processing or environmental variables.
	Regarding c), the specification does not disclose that the received one or more values, being values of the one or more processing or environmental variables, “do not depend on a thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details describing limitations a)-c) above. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.”
	One of ordinary skill in the art would not understand the following from the description in the specification:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Based on the description in the specification, one of ordinary skill in the art would understand that the other processing or environmental variables are taken into account in calculating the characterizing value.
	Appropriate correction is required.


Claim 11:
	Claim 11 recites “generate a characterizing value using a trained artificial neural network, the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values and one or more input nodes of the plurality of input nodes to receive the values of the one or more processing or environmental variables, an output node to output the characterizing value, and a plurality of hidden nodes connecting the plurality of input nodes to the output node, wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” in lines 12-20. 
	The specification does not describe using “a trained” artificial neural network to “generate a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	The specification describes that the neural network is trained and that a two-step process is used to generate a characterizing value: the dimensionality of the measured spectrum is reduced, and the reduced-dimensionality data is then input to an artificial neural network, which outputs the characterizing value. However, nowhere does the specification describe in any detail using “a trained” artificial neural network to “generate a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	See specification ¶59: “As of the configuration procedure for the neural network 120, the neural network 120 is trained using the component values and characteristic value for each reference spectrum.” ¶39: “The controller 90 can use a two-step process to generate a characterizing value from a measured spectrum from the in-situ spectrographic monitoring system 70. First, the dimensionality of the measured spectrum is reduced, and then the reduced dimensionality data is input to an artificial neural network, which will output the characterizing value. By performing this process for each measured spectrum, the artificial neural network can generate a sequence of characterizing values.”
	One of ordinary skill in the art would not understand, based on the description in the specification, that “a trained” artificial neural network is used to “generate a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	Based on the description in the specification, one of ordinary skill in the art would understand that the neural network is trained and that a two-step process is used to generate a characterizing value: the dimensionality of the measured spectrum is reduced, and the reduced-dimensionality data is then input to an artificial neural network, which outputs the characterizing value.
	Appropriate correction is required.



Claim 21:
	Claim 21 recites “receiving one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 6-7. 
	This limitation describes receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate”.
	The specification does not disclose receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate.”
	The specification does not provide any description of:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Regarding a), the specification describes only that other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details about “one or more values” of the one or more processing or environmental variables.
	Regarding b), the specification does not describe receiving one or more values that are values of the one or more processing or environmental variables.
	Regarding c), the specification does not disclose that the received one or more values, being values of the one or more processing or environmental variables, “do not depend on a thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details describing limitations a)-c) above. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.”
	One of ordinary skill in the art would not understand the following from the description in the specification:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Based on the description in the specification, one of ordinary skill in the art would understand that the other processing or environmental variables are taken into account in calculating the characterizing value.
	Appropriate correction is required.


Claim 21:
	Claim 21 recites “generating a characterizing value using a trained artificial neural network, the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values and one or more input nodes of the plurality of input nodes to receive the values of the one or more processing or environmental variables, an output node to output the characterizing value, and a plurality of hidden nodes connecting the plurality of input nodes to the output node, wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” in lines 10-18. 
	The specification does not describe using “a trained” artificial neural network for “generating a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	The specification describes that the neural network is trained and that a two-step process is used to generate a characterizing value: the dimensionality of the measured spectrum is reduced, and the reduced-dimensionality data is then input to an artificial neural network, which outputs the characterizing value. However, nowhere does the specification describe in any detail using “a trained” artificial neural network for “generating a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	See specification ¶59: “As of the configuration procedure for the neural network 120, the neural network 120 is trained using the component values and characteristic value for each reference spectrum.” ¶39: “The controller 90 can use a two-step process to generate a characterizing value from a measured spectrum from the in-situ spectrographic monitoring system 70. First, the dimensionality of the measured spectrum is reduced, and then the reduced dimensionality data is input to an artificial neural network, which will output the characterizing value. By performing this process for each measured spectrum, the artificial neural network can generate a sequence of characterizing values.”
	One of ordinary skill in the art would not understand, based on the description in the specification, that “a trained” artificial neural network is used for “generating a characterizing value,” “wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network.”
	Based on the description in the specification, one of ordinary skill in the art would understand that the neural network is trained and that a two-step process is used to generate a characterizing value: the dimensionality of the measured spectrum is reduced, and the reduced-dimensionality data is then input to an artificial neural network, which outputs the characterizing value.
	Appropriate correction is required.



Claim 27:
	Claim 27 recites “receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 10-11. 
	This limitation describes receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate”.
	The specification does not disclose receiving “one or more values” of “one or more processing or environmental variables,” where the received one or more values “do not depend on a thickness of a layer in the substrate.”
	The specification does not provide any description of:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Regarding a), the specification describes only that other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details about “one or more values” of the one or more processing or environmental variables.
	Regarding b), the specification does not describe receiving one or more values that are values of the one or more processing or environmental variables.
	Regarding c), the specification does not disclose that the received one or more values, being values of the one or more processing or environmental variables, “do not depend on a thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account in calculating the characterizing value. Nowhere does the specification provide any details describing limitations a)-c) above. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.”
	One of ordinary skill in the art would not understand the following from the description in the specification:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	Based on the description in the specification, one of ordinary skill in the art would understand that the other processing or environmental variables are taken into account in calculating the characterizing value.
	Appropriate correction is required.


Claim 27:
	Claim 27 recites “generate a characterizing value using an artificial neural network in an inference mode, the artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values in the inference mode and one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables in the inference mode, an output node to output the characterizing value when the neural network operates in the inference mode, and a plurality of hidden nodes connecting the plurality of input nodes to the output node,” in lines 14-21. 
	The specification does not provide any description of:
a) “generate a characterizing value using an artificial neural network in an inference mode,” such that the characterizing value is generated when the artificial neural network is in an inference mode.
b) The plurality of input nodes receiving the plurality of component values when the artificial neural network is in an inference mode.
c) Input nodes receiving the one or more values of the one or more processing or environmental variables when the artificial neural network is in an inference mode. Further, as described above, the specification does not provide any description of input nodes receiving the one or more values of the one or more processing or environmental variables.
d) An output node outputting the characterizing value when the neural network operates in the inference mode.

	Further, it is not clear from the specification what is meant by inference mode, how inference mode is defined, or how the processes described in a)-d) are performed in inference mode.
	Applicant’s specification describes in ¶37, “the controller 90 can derive a characterizing value for each zone of the substrate based on the signal from the in-situ monitoring system.”
	One of ordinary skill in the art would not understand the limitations described in a)-d) from the description in the specification.
	Based on the description in the specification, one of ordinary skill in the art would understand generating a characterizing value using an artificial neural network, the artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values and one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables, an output node to output the characterizing value, and a plurality of hidden nodes connecting the plurality of input nodes to the output node, where the other processing or environmental variables are taken into account in calculating the characterizing value.
	Appropriate correction is required.

	

Claims 12-20, 31 and 33:
	Based on their dependency from claim 11, claims 12-20, 31 and 33 include the same deficiencies as claim 11. Therefore, for the same reasons as described above for claim 11, claims 12-20, 31 and 33 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention.

Claims 22-26:
	Based on their dependency from claim 21, claims 22-26 include the same deficiencies as claim 21. Therefore, for the same reasons as described above for claim 21, claims 22-26 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention.

Claims 28-30:
	Based on their dependency from claim 27, claims 28-30 include the same deficiencies as claim 27. Therefore, for the same reasons as described above for claim 27, claims 28-30 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention.

35 U.S.C. 112(b)

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 11-31 and 33 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Insufficient antecedent basis and unclear limitations:
Claim 11:
	Claim 11 recites “receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 7-8.
	From the description in the specification, it is not clear what is meant by:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	The following are not clear:
a) What these processing or environmental variables are.
b) How the one or more values of one or more processing or environmental variables are collected. It is not clear whether they are stored, measured, detected, or input.
c) What the relation is between the “thickness of a layer in the substrate” and the “one or more values of one or more processing or environmental variables” such that the receiving of “one or more values of one or more processing or environmental variables” does not depend on the “thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account in calculating the characterizing value. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.”
	For examination purposes, the above limitation cannot be clearly construed. Thus, the limitation is given its broadest reasonable interpretation: the values can be any values of any processing or environmental parameter, irrespective of the thickness of a layer in the substrate.
	Appropriate correction is required.

Claim 11:
	Claim 11 recites “generate a characterizing value using a trained artificial neural network, the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values and one or more input nodes of the plurality of input nodes to receive the values of the one or more processing or environmental variables, an output node to output the characterizing value, and a plurality of hidden nodes connecting the plurality of input nodes to the output node, wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” in lines 12-20.
	From the description in the specification, it is not clear:
a) How “a trained” artificial neural network is used to “generate a characterizing value,” because there is no clear description that the neural network is first trained and that the “trained” neural network is then used to “generate a characterizing value.”
b) How the values of the processing or environmental variables are received by “the one or more input nodes of the plurality of input nodes” after “training of the artificial neural network.” It is not clear how the artificial neural network is trained such that, after that training, the values of the processing or environmental variables are received by “the one or more input nodes of the plurality of input nodes.” Further, as described above, it is not clear from the specification what the values of the processing or environmental variables are.
	The only description the specification provides is that the neural network is trained and that a two-step process is used to generate a characterizing value: the dimensionality of the measured spectrum is reduced, and the reduced-dimensionality data is then input to an artificial neural network, which outputs the characterizing value.
	See specification ¶59: “As of the configuration procedure for the neural network 120, the neural network 120 is trained using the component values and characteristic value for each reference spectrum.” ¶39: “The controller 90 can use a two-step process to generate a characterizing value from a measured spectrum from the in-situ spectrographic monitoring system 70. First, the dimensionality of the measured spectrum is reduced, and then the reduced dimensionality data is input to an artificial neural network, which will output the characterizing value. By performing this process for each measured spectrum, the artificial neural network can generate a sequence of characterizing values.”
	For examination purposes, under the broadest reasonable interpretation, it is construed that:
a) A trained neural network can be any neural network that has been trained, has had some training, or is being trained.
b) The values of the one or more processing or environmental variables can be any values related to any processing or environmental variable, and a processing or environmental variable can be any variable related to the process or the environment of the process.
c) The one or more input nodes of the plurality of input nodes receive any values related to the environment or process at any time, including after the neural network is trained, after the neural network has had some training, or while the neural network is being trained.
	Appropriate correction is required.

Claim 21:
	Claim 21 recites “receiving one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 6-7.
	From the description in the specification, it is not clear what is meant by:
a) one or more values of one or more processing or environmental variables,
b) receiving one or more values of one or more processing or environmental variables,
c) the received one or more values “do not depend on a thickness of a layer in the substrate.”
	The following are not clear:
a) What these processing or environmental variables are.
b) How the one or more values of one or more processing or environmental variables are collected. It is not clear whether they are stored, measured, detected, or input.
c) What the relation is between the “thickness of a layer in the substrate” and the “one or more values of one or more processing or environmental variables” such that the receiving of “one or more values of one or more processing or environmental variables” does not depend on the “thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account in calculating the characterizing value. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.”
	For examination purposes, the above limitation cannot be clearly construed. Thus, the limitation is given its broadest reasonable interpretation: the values can be any values of any processing or environmental parameter, irrespective of the thickness of a layer in the substrate.
	Appropriate correction is required.

Claim 21:
	Claim 21 recites “generating a characterizing value using a trained artificial neural network, the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values and one or more input nodes of the plurality of input nodes to receive the values of the one or more processing or environmental variables, an output node to output the characterizing value, and a plurality of hidden nodes connecting the plurality of input nodes to the output node, wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” in lines 10-18.
	From the description of the specification it’s not clear:
How “a trained” artificial neural network is used to “generate a characterizing value,” because there is no clear description that the neural network is trained first then that “trained” neural network is used to “generate a characterizing value,”
How the values of the processing or environmental variables are received by “the one or more input nodes of the plurality of input nodes” after “training of the artificial neural network.” It’s not clear how the artificial neural network is trained such that, after that training, the values of the processing or environmental variables are received by “the one or more input nodes of the plurality of input nodes.” Further, as described above, from the description of the specification it’s not clear what the values of the processing or environmental variables are.
	The only description the specification provides is that the neural network is being trained and a two-step process is used to generate a characterizing value such that the dimensionality of the measured spectrum is reduced, and then the reduced dimensionality data is input to an artificial neural network, which will output the characterizing value.
	See specification ¶59: “As of the configuration procedure for the neural network 120, the neural network 120 is trained using the component values and characteristic value for each reference spectrum.” ¶39: “The controller 90 can use a two-step process to generate a characterizing value from a measured spectrum from the in-situ spectrographic monitoring system 70. First, the dimensionality of the measured spectrum is reduced, and then the reduced dimensionality data is input to an artificial neural network, which will output the characterizing value. By performing this process for each measured spectrum, the artificial neural network can generate a sequence of characterizing values.”
	For examination purposes, under the broadest reasonable interpretation, it is construed that:
Trained neural network can be any neural network that has been trained, or has some training, or is being trained.
The values of the one or more processing or environmental variables can be any values related to any processing or environmental variable, and processing or environmental variable can be any variables related to process or environment of the process.
The one or more input nodes of the plurality of input nodes receive any values related to the environment or process at any time, including after the neural network is trained, after the neural network gets some training, or while the neural network is being trained.
	Appropriate correction is required.

Claim 27:
	Claim 27 recites “receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” in lines 10-11.
	From the description of the specification it’s not clear:
one or more values of one or more processing or environmental variables, 
receiving one or more values of one or more processing or environmental variables,
the received one or more values “do not depend on a thickness of a layer in the substrate”
	The following are not clear:
What these processing or environmental variables are
How the one or more values of one or more processing or environmental variables are collected. It’s not clear if they were stored, measured, detected, or inputted.
What are the relations between the “thickness of a layer in the substrate” and the “one or more values of one or more processing or environmental variables” such that the receiving of “one or more values of one or more processing or environmental variables” doesn’t depend on “thickness of a layer in the substrate.”
	The only description the specification provides regarding processing or environmental variables is that the other processing or environmental variables are taken into account for the purpose of calculating the characterizing value. See specification ¶48: “This permits the neural network 120 to take into account these other processing or environmental variables in calculation of the characterizing value.” As such, the specification describes that other techniques can be used. However, the specification doesn’t limit the shifting operation to being performed using tool length compensation only.
	For examination purposes, the above-described limitation cannot be construed. Thus, the limitation will be interpreted under the broadest reasonable interpretation such that the values can be any values of any processing or environmental parameter, irrespective of the thickness of a layer in the substrate.
	Appropriate correction is required.

Claim 27:
	Claim 27 recites “generate a characterizing value using an artificial neural network in an inference mode, the artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values in the inference mode and one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables in the inference mode, an output node to output the characterizing value when the neural network operates in the inference mode, and a plurality of hidden nodes connecting the plurality of input nodes to the output node,” in lines 14-21.
	From the description of the specification, it’s not clear:
How the system generates “a characterizing value” using “an artificial neural network in an inference mode” such that the characterizing value is generated when the artificial neural network is in an inference mode. The meaning of the “inference mode” is not clear.
How the plurality of input nodes receive the plurality of component values when the artificial neural network is in an inference mode, given that the meaning of the “inference mode” is not clear.
How the input nodes receive the one or more values of the one or more processing or environmental variables when the artificial neural network is in an inference mode. Further, as described above, the specification doesn’t provide any description of input nodes receiving the one or more values of the one or more processing or environmental variables.
How output node outputs the characterizing value when the neural network operates in the inference mode.
	Further, it’s not clear from the description of the specification what is meant by inference mode, what the definition of inference mode is, and how the processes described in a)-d) are performed in inference mode.
	Applicant’s specification describes in ¶37, “the controller 90 can derive a characterizing value for each zone of the substrate based on the signal from the in-situ monitoring system.”
	For examination purposes, under the broadest reasonable interpretation, it is construed that:
The other processing or environmental variables are taken into account in the purpose of calculation of the characterizing value such that the value can be any values of any processing or environmental parameter irrespective of thickness of a layer in the substrate such that the values can be any values
The values of the one or more processing or environmental variables can be any values related to any processing or environmental variable, and processing or environmental variable can be any variables related to process or environment of the process.
The one or more input nodes of the plurality of input nodes receive any values related to the environment or process at any time, including after the neural network is trained, after the neural network gets some training, or while the neural network is being trained.
Generate a characterizing value using an artificial neural network when the artificial neural network can be in an operation mode (e.g.; after training or during training).
	Appropriate correction is required.


Claims 12-20, 31 and 33:
	Based on their dependencies on claim 11, claims 12-20, 31 and 33 also include the same deficiencies as claim 11; therefore, for the same reasons as described above in claim 11, claims 12-20, 31 and 33 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Claims 22-26:
	Based on their dependencies on claim 21, claims 22-26 also include the same deficiencies as claim 21; therefore, for the same reasons as described above in claim 21, claims 22-26 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Claims 28-30:
	Based on their dependencies on claim 27, claims 28-30 also include the same deficiencies as claim 27; therefore, for the same reasons as described above in claim 27, claims 28-30 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 11-26 are rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski et al. (US20100094790A1) [hereinafter Gnojewski], and further in view of Sahu et al. (US20180100810A1) [hereinafter Sahu].
Claim(s) 11 (amended):
Regarding claim 11, Gnojewski discloses, “A computer program product for controlling processing of a substrate, the computer program product tangibly embodied in a non-transitory computer readable media and comprising instructions for causing one or more computers to:” [See the processor includes main memory, computer readable medium, and set of instructions, where the instructions are executed by a processor to perform functions for controlling processing of a substrate (i.e.; semiconductor processes): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes.” (¶19)… “The machine 1200 may be a server computer, a client computer, a personal computer (PC), a tablet PC,” “or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.” (¶46)… “The example machine 1200 may include a processor 1260 (e.g., a central processing unit (CPU),” “a main memory 1270 and a static memory 1280, all of which communicate with each other via a bus 1208.” “the machine 1200 also may include” “a disk drive unit 1240,” (¶47)… “The disk drive unit 1240 may include a machine-readable medium 1222 on which is stored one or more sets of instructions (e.g., software) 1224 embodying any one or more of the methodologies or functions described herein.” (¶48)];
“receive, from an in-situ optical monitoring system, a measured spectrum of light reflected from a substrate undergoing processing that modifies a thickness of an outer layer of the substrate;” [See optical monitoring system such as the reflectometer 130 collects a measured (e.g.; 610) spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “a set of N SIR data (see block 610) corresponding to N reflectometer measurements (e.g., measurement of dimensions) at N areas of interest on one or more samples are converted, at block 620, to N sets of spectral central moments by the processor 160 (see FIG. 1).” (¶37)… “the spectral central moments of the reflectometer measured SIR (e.g., inputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)… “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
“receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, the one received value does not depend or rely on a thickness of a layer in the substrate.
Gnojewski teaches: See the system receives at least one value of processing or environmental variables (e.g.; from 640 to 630), where the value can be a value that is not related to thickness of a layer in the substrate: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
 “reduce the dimensionality of the measured spectrum to generate a plurality of component values;” [See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data, such that a transformation is performed on the measured/collected data based on reference spectra: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generate a characterizing value using a trained artificial neural network,” [Examiner notes that, based on the broadest reasonable interpretation as described in the 112(b) section, the characterizing value is generated using a trained artificial neural network, where the trained neural network can be any neural network that has been trained, or has some training, or is being trained.
See using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value), where the artificial neural network is a trained artificial neural network: “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)… “artificial neural network is trained based on the spectral intensity response of a reflectometer” (¶4)… “In the measurement phase, the trained ANN (e.g., with pre-determined weighting functions, as trained ANN 750 in FIG. 7) is used to determine the expected metrology tool measurement results using the reflectometer measured SIRs, without using the actual metrology tool.” (¶24)];
“the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values,” [See the plurality of input nodes 652 including a multiplicity of input nodes (e.g.; more than one input nodes). See plurality of input nodes 652 as shown in figure 6 receiving input data 610>>630 including plurality of component values as shown in figure 6, also see 710 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables,” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, the one received value does not depend or rely on a thickness of a layer in the substrate.
 Gnojewski teaches: See plurality of input nodes 652 as shown in figure 6 receiving input data including processing or environmental parameter values 640>>630: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
“an output node to output the characterizing value,” [See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“and a plurality of hidden nodes connecting the plurality of input nodes to the output node; and” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“ The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
	“wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, the one or more input nodes of the plurality of input nodes receive any values related to the environment or process at any time, including after the neural network is trained, after the neural network gets some training, or while the neural network is being trained.
	Gnojewski teaches: See the system receives at least one value of processing or environmental variables (e.g.; from 640 to 630), where the artificial neural network is a trained artificial neural network: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)… “artificial neural network is trained based on the spectral intensity response of a reflectometer” (¶4)… “In the measurement phase, the trained ANN (e.g., with pre-determined weighting functions, as trained ANN 750 in FIG. 7) is used to determine the expected metrology tool measurement results using the reflectometer measured SIRs, without using the actual metrology tool.” (¶24)];
	“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values by performing, on the measured spectrum, a transformation based on a plurality of reference spectra;” [See the dimensionality of the measured spectrum is reduced to generate a plurality of component values by performing a transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on a plurality of reference spectra (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the capability of reducing the dimensionality of measured data to generate components by performing a transformation on the measured spectrum data based on predetermined spectrum data, as taught by Sahu, with the polishing monitoring system taught by Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such a combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].
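For illustration only, the arrangement recited in claim 11 (dimensionality reduction of a measured spectrum based on reference spectra, followed by a neural network whose input nodes receive both the component values and a process variable value) can be sketched as below. This sketch is not taken from Gnojewski, Sahu, or the applicant's specification. It assumes principal component analysis as the transformation, a single hidden layer, synthetic random spectra, and random placeholder weights standing in for trained weights; all function and variable names are hypothetical.

```python
import numpy as np

def fit_reference_basis(reference_spectra, n_components):
    """Derive a PCA basis from reference spectra (one spectrum per row)."""
    mean = reference_spectra.mean(axis=0)
    # Rows of vt are the principal directions of the mean-centered references.
    _, _, vt = np.linalg.svd(reference_spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def reduce_spectrum(measured, mean, basis):
    """Project a measured spectrum onto the basis to obtain component values."""
    return basis @ (measured - mean)

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Input nodes -> hidden nodes -> single output node."""
    hidden = np.tanh(w_hidden @ inputs + b_hidden)
    return float(w_out @ hidden + b_out)

rng = np.random.default_rng(0)

# Synthetic stand-ins: 50 reference spectra sampled at 200 wavelengths.
reference_spectra = rng.random((50, 200))
mean, basis = fit_reference_basis(reference_spectra, n_components=8)

# Measured spectrum reduced from 200 values to 8 component values.
measured = rng.random(200)
components = reduce_spectrum(measured, mean, basis)

# Hypothetical placeholder for one processing or environmental variable value.
process_vars = np.array([0.3])

# 8 input nodes for component values plus 1 input node for the process variable.
inputs = np.concatenate([components, process_vars])

# Placeholder weights; a real system would use weights obtained by training.
n_hidden = 5
w_hidden = rng.normal(size=(n_hidden, inputs.size))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = 0.0

characterizing_value = forward(inputs, w_hidden, b_hidden, w_out, b_out)
```

In this sketch the process variable enters the network through its own input node alongside the component values, which is one possible reading of the claimed input-node arrangement rather than a statement of how either reference implements it.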

Claim(s) 12:
	Regarding claim 12, Gnojewski and Sahu disclose all the elements of claim 11.
	Regarding claim 12, Gnojewski further discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.”
However, Sahu discloses, “the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See the transformation is based on spectral features extracted from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the above-described teachings of Sahu with the polishing monitoring system taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such a combination for the same reasons as described above in claim 11.

Claim(s) 13:
	Regarding claim 13, Gnojewski and Sahu disclose all the elements of claims 11-12.
	Regarding claim 13, Gnojewski further discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral” “components” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.”
However, Sahu discloses, “instructions to perform feature extraction on the plurality of reference spectra to generate the plurality of spectral feature components.” [See the instructions to perform feature extraction on the reference spectra to generate plurality of spectral feature component (e.g.; spectral feature values): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 11.

Claim(s) 14:
	Regarding claim 14, Gnojewski and Sahu disclose all the elements of claims 11-13.
	Regarding claim 14, Gnojewski further discloses, “wherein the instructions to perform feature extraction comprise instructions to perform” “autoencoding” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. Gnojewski discloses, autoencoding: See the autoencoding process, where an artificial neural network performs efficient data encoding automatically: “at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750. The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples. In other words, the trained ANN 750 may simulate the reference metrology tool.” (¶39)… “the trained ANN 750 measurement results may also be free from human and system errors involved in typical reference tool measurements.” (¶40)… “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis,”
However, Sahu discloses, “wherein the instructions to perform feature extraction comprise instructions to perform principal component analysis, singular value decomposition, independent component analysis,” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. See the instructions to perform feature extraction comprise principal component analysis (e.g.; principal component analysis) and singular value decomposition (e.g.; orthogonal decomposition): “a combination of various data reduction and analysis techniques like principal component analysis, orthogonal decomposition, spectral mixture resolution, etc. could be used as a first step to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 11.
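For illustration only, the kind of feature extraction named in the Sahu passage (principal component analysis used to reduce the dimensionality of spectral data) can be sketched as follows. This is a minimal sketch and is not taken from either reference: the function names, the power-iteration approach, and the four-point example spectra are all assumptions chosen to keep the example self-contained.

```python
import math

def leading_component(reference_spectra, iters=200):
    """Estimate the first principal component of the reference spectra
    by power iteration on the (unnormalized) covariance matrix."""
    n = len(reference_spectra)
    dim = len(reference_spectra[0])
    mean = [sum(s[d] for s in reference_spectra) / n for d in range(dim)]
    centered = [[s[d] - mean[d] for d in range(dim)] for s in reference_spectra]
    # Assumption: the uniform start vector is not orthogonal to the component.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        scores = [sum(r[d] * v[d] for d in range(dim)) for r in centered]
        w = [sum(scores[k] * centered[k][d] for k in range(n)) for d in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v

def project(spectrum, mean, component):
    """Reduce one measured spectrum to a single component value."""
    return sum((s - m) * c for s, m, c in zip(spectrum, mean, component))

# Hypothetical reference spectra that vary along one direction only.
direction = [1.0, 2.0, 3.0, 4.0]
reference = [[10.0 + a * d for d in direction] for a in (0.0, 1.0, 2.0, 3.0)]
mean, component = leading_component(reference)

# Hypothetical measured spectrum, reduced from 4 values to 1 component value.
measured = [10.0 + 2.5 * d for d in direction]
score = project(measured, mean, component)
```

In this sketch the transformation (mean and component) is derived from the reference spectra and then applied to a measured spectrum, which mirrors the relationship between the two that the claim language recites.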

Claim(s) 15:
	Regarding claim 15, Gnojewski and Sahu disclose all the elements of claim 11.
	Regarding claim 15, Gnojewski further discloses, “instructions to perform dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See that dimensional reduction is performed on more than one data point (e.g.; 2048 data points), where the data has a known characteristic (i.e.; known structures), and the data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
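For illustration only, the compression of a 2048-point spectrum into a handful of moments described in the cited passages can be sketched as below. The exact moment definition used by Gnojewski is not reproduced in the quoted text, so the intensity-weighted definition here, the function name, and the synthetic triangular spectrum are assumptions; the sketch does not attempt to reproduce the "eight central moments" count.

```python
def spectral_central_moments(wavelengths, intensities, max_order=3):
    """Compress a spectrum into [mean, 2nd central moment, ..., max_order-th].

    Assumed definition: intensities are normalized into weights over
    wavelength, and each moment is a weighted average of (x - mean)**order.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    weights = [v / total for v in intensities]
    mean = sum(w * x for w, x in zip(weights, wavelengths))
    moments = [mean]
    for order in range(2, max_order + 1):
        moments.append(
            sum(w * (x - mean) ** order for w, x in zip(weights, wavelengths))
        )
    return moments

# Hypothetical 2048-point spectrum with a symmetric triangular peak.
n = 2048
wavelengths = [200.0 + 0.5 * i for i in range(n)]
intensities = [1.0 - abs(i - (n - 1) / 2) / n for i in range(n)]

# 2048 data points reduced to 3 values suitable as neural network inputs.
features = spectral_central_moments(wavelengths, intensities)
```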

Claim(s) 16:
	Regarding claim 16, Gnojewski and Sahu disclose all the elements of claims 11 and 15.
	Regarding claim 16, Gnojewski further discloses, “instructions to train the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
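For illustration only, the back-propagation loop and the three exit conditions quoted from ¶33 (maximum number of iterations, cost below a predefined limit, weight changes below a threshold) can be sketched as follows. The network size, tanh activation, learning rate, and training data are assumptions made for this sketch; only the example limits of 100 iterations and 0.001 come from the quoted passage.

```python
import math
import random

def train(samples, targets, hidden=3, lr=0.1, max_iters=100,
          cost_limit=1e-3, min_weight_change=1e-9, seed=0):
    """Train a one-hidden-layer network by batch back-propagation."""
    rng = random.Random(seed)
    n_in, n = len(samples[0]), len(samples)
    # w_ji: input -> hidden weights; w_kj: hidden -> single output weights.
    w_ji = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w_kj = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    history, reason = [], "max_iterations"
    for _ in range(max_iters):
        grad_ji = [[0.0] * n_in for _ in range(hidden)]
        grad_kj = [0.0] * hidden
        cost = 0.0
        for x, t in zip(samples, targets):
            # Feed-forward pass.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_ji]
            err = sum(w * hj for w, hj in zip(w_kj, h)) - t
            cost += 0.5 * err * err / n
            # Back-propagate the output error to both weight layers.
            for j in range(hidden):
                grad_kj[j] += err * h[j] / n
                back = err * w_kj[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    grad_ji[j][i] += back * x[i] / n
        history.append(cost)
        if cost < cost_limit:           # exit: cost below predefined limit
            reason = "cost_below_limit"
            break
        step = 0.0
        for j in range(hidden):
            w_kj[j] -= lr * grad_kj[j]
            step = max(step, abs(lr * grad_kj[j]))
            for i in range(n_in):
                w_ji[j][i] -= lr * grad_ji[j][i]
                step = max(step, abs(lr * grad_ji[j][i]))
        if step < min_weight_change:    # exit: weights no longer changing
            reason = "weights_converged"
            break
    return w_ji, w_kj, history, reason

# Hypothetical training data: two input features and a known target value.
samples = [[0.1, 0.9], [0.4, 0.6], [0.7, 0.3], [0.9, 0.1]]
targets = [0.26, 0.44, 0.62, 0.74]
w_ji, w_kj, history, reason = train(samples, targets)
```

The returned `reason` records which of the three exit conditions ended training, and `history` holds the cost at each iteration.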


Claim(s) 17:
	Regarding claim 17, Gnojewski and Sahu disclose all the elements of claims 11 and 15-16.
	Regarding claim 17, Gnojewski further discloses, “the two or more spectra includes fewer spectra than the plurality of spectra.” [See the spectra 850, 860, and 870 as shown in figure 8, where two or more spectra are less than the rest of the spectra (e.g.; spectra having wavelength less than 400 and less than 300 such that they are smaller than the rest of the spectra that have wavelength more than 400): “The reflectometer 130 measurements may be made at certain points on the sample. The result of the reflectometer 130 measurements may include SIRs such as the spectra shown in FIG. 8 (see spectra 850, 860, and 870). A SIR is an intensity versus wavelength function that defines an intensity at a certain wavelength. The intensity may be provided as a reflection coefficient at the certain wavelength.” (¶21)].

Claim(s) 18 (amended):
	Regarding claim 18, Gnojewski and Sahu disclose all the elements of claim 11.
Regarding claim 18, Gnojewski further discloses, “wherein the processing or environmental variables include at least one of a first measurement from another sensor in a processing system in which the substrate undergoes the processing, a second measurement from a sensor that is outside the processing system, a value from a processing recipe stored by a controller of the processing system, or a value of a variable tracked by the controller.”
“wherein the one or more processing or environmental variables include at least one of a measurement from another sensor in a processing system in which the substrate undergoes the processing, a value from a processing recipe stored by a controller of the processing system, or a value of a variable tracked by the controller.”
[Examiner notes that the claim requires only one of a value from a processing recipe stored by a controller of the processing system, or a value of a variable tracked by the controller.
Further, as described in the 112(b) section above, examiner notes that, in broadest reasonable interpretation, a processing or environmental variable can be any variable related to the process or to the environment of the process.
Gnojewski discloses, receiving a value of a variable tracked by the controller: See the controller collects data related to the variable spectrum using the optical monitoring system such as the reflectometer 130, where 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer), and the data is input to the artificial neural network at input nodes 652: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” (¶19)…  “The reflectometer 130 measurements may be made at certain points on the sample. The result of the reflectometer 130 measurements may include SIRs such as the spectra shown in FIG. 8 (see spectra 850, 860, and 870).” (¶21)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)… “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)].

Claim(s) 19:
	Regarding claim 19, Gnojewski and Sahu disclose all the elements of claim 11.
Regarding claim 19, Gnojewski further discloses, “instructions to cause a chemical mechanical polishing system to polish the substrate,” [See that the processing includes chemical mechanical polishing (i.e.; CMP) to polish a substrate: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “Post-polish cleans may be used as part of the CMP technique to remove the slurry from the surface of the sample after removal of a silicon dioxide layer, before subjecting the sample to reflectometry or other metrology measurements” (¶20)].
“instructions” “to receive the measured spectrum of light from the in-situ optical monitoring system during polishing of the substrate.” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)].

Claim(s) 20:
	Regarding claim 20, Gnojewski and Sahu disclose all the elements of claims 11 and 19.
Regarding claim 20, Gnojewski further discloses, “the instructions to determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter comprise instructions to modify a pressure in a carrier head of the chemical mechanical polishing system.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining” “endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)].

Claim(s) 21 (amended):
Regarding claim 21, Gnojewski discloses, “A method of processing a substrate, comprising: subjecting a substrate to processing that modifies a thickness of an outer layer of the substrate;” [See the method for controlling processing of a substrate (i.e.; semiconductor processes): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes.” (¶19)… “An artificial neural network may be trained to generate an output, using the spectral central moments as inputs. The output may include simulated metrological data associated with the sample. The term metrology shall be taken to include theoretical and practical aspects of measurement (e.g., measurement of semiconductor samples).” (¶17)… “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)];
“measuring during the processing with an in-situ optical monitoring system a measured spectrum of light reflected from the substrate undergoing processing;” [See optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
“receiving one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, in broadest reasonable interpretation, the received value does not depend on or rely on a thickness of a layer in the substrate.
Gnojewski teaches: See the system receives at least one value of processing or environmental variables (e.g.; from 640 to 630), where the value can be a value that is not related to thickness of a layer in the substrate: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
“reducing a dimensionality of the measured spectrum to generate a plurality of component values” [See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generating a characterizing value using a trained artificial neural network,” [Examiner notes that, based on the broadest reasonable interpretation as described in the 112(b) section, the trained artificial neural network used to generate the characterizing value can be any neural network that has been trained, has some training, or is being trained.
See that, using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value), where the artificial neural network is a trained artificial neural network: “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)… “artificial neural network is trained based on the spectral intensity response of a reflectometer” (¶4)… “In the measurement phase, the trained ANN (e.g., with pre-determined weighting functions, as trained ANN 750 in FIG. 7) is used to determine the expected metrology tool measurement results using the reflectometer measured SIRs, without using the actual metrology tool.” (¶24)];
“the trained artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values,” [See the plurality of input nodes 652 including a multiplicity of input nodes (e.g.; more than one input nodes). See plurality of input nodes 652 as shown in figure 6 receiving input data 610>>630 including plurality of component values as shown in figure 6, also see 710 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables,” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, in broadest reasonable interpretation, the received value does not depend on or rely on a thickness of a layer in the substrate.
 Gnojewski teaches: See plurality of input nodes 652 as shown in figure 6 receiving input data including processing or environmental parameter values 640>>630: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
“an output node to output the characterizing value,” [See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“a plurality of hidden nodes connecting the plurality of input nodes to the output node;” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“ The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
	“wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network;” [Examiner notes that the claim requires receiving at least one value of only one of 1. processing or 2. environmental variables.
Further, as described in the 112(b) section above, examiner notes that, in broadest reasonable interpretation, the one or more input nodes of the plurality of input nodes receive any values related to the environment or the process at any time, including after the neural network is trained, after the neural network gets some training, or while the neural network is being trained.
	Gnojewski teaches: See the system receives at least one value of processing or environmental variables (e.g.; from 640 to 630), where the artificial neural network is a trained artificial neural network: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)… “artificial neural network is trained based on the spectral intensity response of a reflectometer” (¶4)… “In the measurement phase, the trained ANN (e.g., with pre-determined weighting functions, as trained ANN 750 in FIG. 7) is used to determine the expected metrology tool measurement results using the reflectometer measured SIRs, without using the actual metrology tool.” (¶24)];
“determining at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value.” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;”
However, Sahu discloses, “reducing a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra;” [See dimensionality of measured spectrum is reduced to generate plurality of component by performing transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on plurality of reference spectrum (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the method taught by Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)].

Claim(s) 22:
	Regarding claim 22, Gnojewski and Sahu disclose all the elements of claim 21.
Regarding claim 22, Gnojewski further discloses, “the processing comprises chemical mechanical polishing.” [See that the processing includes chemical mechanical polishing (i.e.; CMP): “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “Post-polish cleans may be used as part of the CMP technique to remove the slurry from the surface of the sample after removal of a silicon dioxide layer, before subjecting the sample to reflectometry or other metrology measurements” (¶20)].

Claim(s) 23:
	Regarding claim 23, Gnojewski and Sahu disclose all the elements of claim 21.
	Regarding claim 23, Gnojewski further discloses, “performing feature extraction on the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “performing feature extraction on the plurality of reference spectra to generate the transformation.”
However, Sahu discloses, “performing feature extraction on the plurality of reference spectra to generate the transformation.” [See the instructions to perform feature extraction on the reference spectra to generate plurality of spectral feature component (e.g.; spectral feature values): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the method taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 21.

Claim(s) 24:
	Regarding claim 24, Gnojewski and Sahu disclose all the elements of claims 23 and 21.
	Regarding claim 24, Gnojewski further discloses, “wherein the feature extraction comprises” “autoencoding” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. Gnojewski discloses, autoencoding: See the autoencoding process, where an artificial neural network performs efficient data encoding automatically: “at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750. The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples. In other words, the trained ANN 750 may simulate the reference metrology tool.” (¶39)… “the trained ANN 750 measurement results may also be free from human and system errors involved in typical reference tool measurements.” (¶40)… “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,”
However, Sahu discloses, “wherein the feature extraction comprises principal component analysis, singular value decomposition, independent component analysis,” [Examiner notes that the claim requires only one of principal component analysis, singular value decomposition, independent component analysis, or autoencoding. See that the instructions to perform feature extraction comprise principal component analysis and singular value decomposition (e.g.; orthogonal decomposition): “a combination of various data reduction and analysis techniques like principal component analysis, orthogonal decomposition, spectral mixture resolution, etc. could be used as a first step to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the method taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 21.
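
For illustration only, the following minimal sketch shows one way that feature extraction on a set of reference spectra can generate a transformation that is then applied to a measured spectrum. It assumes principal component analysis computed by power iteration; the function names, the three-point spectra, and the single extracted component are hypothetical and are not taken from the claims, Gnojewski, or Sahu.

```python
import math

def mean_spectrum(spectra):
    """Point-by-point mean of the reference spectra."""
    n = len(spectra)
    return [sum(s[i] for s in spectra) / n for i in range(len(spectra[0]))]

def covariance(spectra, mean):
    """Sample covariance matrix of the mean-centered reference spectra."""
    d, n = len(mean), len(spectra)
    centered = [[s[i] - mean[i] for i in range(d)] for s in spectra]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Leading principal component, estimated by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(spectrum, mean, component):
    """Component value of a measured spectrum under the learned transformation."""
    return sum((spectrum[i] - mean[i]) * component[i] for i in range(len(mean)))

# Hypothetical reference spectra (three wavelengths each).
reference_spectra = [[1.0, 1.0, 5.0], [2.0, 2.0, 5.0], [3.0, 3.0, 5.0]]
mean = mean_spectrum(reference_spectra)
component = top_component(covariance(reference_spectra, mean))

# Hypothetical measured spectrum reduced to a single component value.
score = project([4.0, 4.0, 5.0], mean, component)
```

The sketch extracts only one component; a plurality of component values would require repeating the extraction for further components, which is omitted here for brevity.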
Claim(s) 25:
	Regarding claim 25, Gnojewski and Sahu disclose all the elements of claims 21-22.
	Regarding claim 25, Gnojewski further discloses, “performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values to generate training data.” [See that dimensional reduction is performed on more than one data point (e.g.; 2048 data points), where the data has known characteristics (i.e.; known structures), and the reduced data is used as an input to the neural network to train the neural network as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
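
For illustration only, the following minimal sketch shows how a spectrum of 2048 data points can be compressed into a few moment values. It assumes that a spectral central moment is an intensity-weighted central moment over wavelength; the passages quoted above do not give Gnojewski's exact definition or explain how eight moments per polarization are obtained, so the function name, the wavelength grid, and the synthetic intensities are hypothetical.

```python
import math

def central_moments(wavelengths, intensities, max_order=3):
    """Return [mean, 2nd central moment, ..., max_order-th central moment].

    Intensities are normalized to sum to one and used as weights
    over the wavelength axis (an assumed definition).
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    weights = [v / total for v in intensities]
    mean = sum(w * x for w, x in zip(weights, wavelengths))
    moments = [mean]
    for k in range(2, max_order + 1):
        moments.append(
            sum(w * (x - mean) ** k for w, x in zip(weights, wavelengths)))
    return moments

# Hypothetical 2048-point spectrum with synthetic, strictly positive intensities.
wavelengths = [400.0 + 0.25 * i for i in range(2048)]
intensities = [1.0 + 0.5 * math.sin(i / 100.0) for i in range(2048)]

# 2048 data points are reduced to three values under the assumed definition.
compressed = central_moments(wavelengths, intensities)
```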

Claim(s) 26:
	Regarding claim 26, Gnojewski and Sahu disclose all the elements of claims 25 and 21-22.
	Regarding claim 26, Gnojewski further discloses, “training the artificial neural network by backpropagation using the training data and the known characteristic values.” [See the back-propagation technique is used to train the neural network using the training data and the known characteristics: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
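
For illustration only, the following minimal sketch shows a back-propagation training loop with the three exit conditions named in the quoted ¶33: a maximum number of iterations (100), a cost below a predefined limit (0.001), and weight changes below a threshold. The network size, learning rate, initial weights, and training data are hypothetical, and the batch update rule is an assumption rather than Gnojewski's disclosed procedure.

```python
import math

def forward(x, w_ji, w_kj):
    """Feed-forward pass: tanh hidden nodes, one linear output node."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_ji]
    return hidden, sum(w * h for w, h in zip(w_kj, hidden))

def train(samples, targets, w_ji, w_kj, rate=0.1,
          max_iterations=100, cost_limit=0.001, min_weight_change=1e-6):
    """Batch gradient descent; updates the weight lists in place.

    Returns (w_ji, w_kj, cost, reason), where cost is the value measured
    at the start of the final iteration.
    """
    cost = float("inf")
    for _ in range(max_iterations):
        grad_ji = [[0.0] * len(row) for row in w_ji]
        grad_kj = [0.0] * len(w_kj)
        cost = 0.0
        for x, t in zip(samples, targets):
            hidden, y = forward(x, w_ji, w_kj)
            err = y - t
            cost += 0.5 * err * err  # cumulative squared error
            for j, h in enumerate(hidden):
                grad_kj[j] += err * h
                delta_j = err * w_kj[j] * (1.0 - h * h)  # error propagated back
                for i, xi in enumerate(x):
                    grad_ji[j][i] += delta_j * xi
        if cost < cost_limit:
            return w_ji, w_kj, cost, "cost below limit"
        largest_change = 0.0
        for j in range(len(w_kj)):
            step = rate * grad_kj[j]
            w_kj[j] -= step
            largest_change = max(largest_change, abs(step))
            for i in range(len(w_ji[j])):
                step = rate * grad_ji[j][i]
                w_ji[j][i] -= step
                largest_change = max(largest_change, abs(step))
        if largest_change < min_weight_change:
            return w_ji, w_kj, cost, "weights converged"
    return w_ji, w_kj, cost, "maximum iterations reached"

def initial_weights():
    """Hypothetical starting weights: two hidden nodes, two inputs."""
    return [[0.1, -0.2], [0.3, 0.1]], [0.2, -0.1]

# Hypothetical training set: reduced inputs and known characteristic values.
samples = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.2, 0.4, 0.5]
w_ji, w_kj = initial_weights()
w_ji, w_kj, cost, reason = train(samples, targets, w_ji, w_kj)
```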




Claims 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski, and further in view of Singh et al. (US6594024B1) [hereinafter Singh] and Sahu.
Claim(s) 27 (amended):
Regarding claim 27, Gnojewski discloses, “an in-situ optical monitoring system to measure a spectrum of light reflected from the substrate” “a controller configured to receive, from the optical in-situ monitoring system, a measured spectrum of light reflected from the substrate,” [See that an optical monitoring system such as the reflectometer 130 collects a measured spectrum of light that is reflected from the sample (e.g.; semiconductor wafer) that undergoes a process of modifying thickness of a layer: “ANN 150 is trained based on the spectral intensity response of a reflectometer 130.” “Reflectometer 130 may be a polarized reflectometer, a spectrometer, an optical time domain reflectometer, or other similar device. Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110. Sample 110 may be a semiconductor sample/wafer, a biological sample, a biochemical/chemical sample, or other sample for which dimensions are to be determined.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer,” (¶23)];
“receive one or more values of one or more processing or environmental variables that do not depend on a thickness of a layer in the substrate;” [Examiner notes that the claim requires receiving at least one value of only one of (1) a processing variable or (2) an environmental variable.
Further, as described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, the received value does not depend or rely on a thickness of a layer in the substrate.
Gnojewski teaches: See the system receives at least one value of processing or environmental variables (e.g.; from 640 to 630), where the value can be a value that is not related to thickness of a layer in the substrate: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
“reduce a dimensionality of the measured spectrum to generate a plurality of component values” [See the dimensionality is reduced such that the collected data is compressed to obtain a reduced number of component data: “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” (¶23)];
“generate a characterizing value using an artificial neural network in an inference mode,” [Examiner notes that, based on the broadest reasonable interpretation as described in the 112(b) section, this limitation is interpreted as generating a characterizing value using an artificial neural network, where the neural network can be any neural network that has been trained, has some training, or is being trained.
See that, using the artificial neural network ANN 150, an output is generated, where the output corresponds to a thickness of the layer in a structure (e.g.; characterizing value), and where the artificial neural network is a trained artificial neural network: “a method 300 for determining dimensions using an artificial neural network, where the artificial neural network is trained based on the spectral intensity response of a reflectometer.” “At operation 330, ANN 150 is trained using the spectral central moments as the inputs to the ANN 150 to generate an output. The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)… “artificial neural network is trained based on the spectral intensity response of a reflectometer” (¶4)… “In the measurement phase, the trained ANN (e.g., with pre-determined weighting functions, as trained ANN 750 in FIG. 7) is used to determine the expected metrology tool measurement results using the reflectometer measured SIRs, without using the actual metrology tool.” (¶24)];
“the artificial neural network having a plurality of input nodes, the plurality of input nodes including a multiplicity of input nodes to receive the plurality of component values in the inference mode,” [See the plurality of input nodes 652 including a multiplicity of input nodes (e.g.; more than one input nodes). See plurality of input nodes 652 as shown in figure 6 receiving input data 610>>630 including plurality of component values as shown in figure 6, also see 710 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “in the measurement phase 540, as shown in FIG. 7 at block 710, the spectral intensity responses of areas of interest on the samples having similar features as the samples used for training the ANN 150 may be measured and converted to spectral central moments (SCM) at block 712. The SCM may then be applied, block 715, to the inputs of the trained ANN 750.” (¶39)];
“one or more input nodes of the plurality of input nodes to receive the one or more values of the one or more processing or environmental variables,” [Examiner notes that the claim requires receiving at least one value of only one of (1) a processing variable or (2) an environmental variable.
Further, as described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, the received value does not depend or rely on a thickness of a layer in the substrate.
 Gnojewski teaches: See plurality of input nodes 652 as shown in figure 6 receiving input data including processing or environmental parameter values 640>>630: “At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “the metrology tool measurement results corresponding to the measured SIRs (e.g., target outputs)” “used to determine values for the weighting functions connecting input nodes (see, for example, reference number 652 in FIG. 6)” (¶24)];
“an output node to output the characterizing value when the neural network operates in the inference mode,” [As described in the 112(b) section above, examiner notes that, under the broadest reasonable interpretation, this limitation is interpreted as generating a characterizing value using an artificial neural network while the artificial neural network is in an operation mode (e.g.; after training or during training).
See the output nodes, 656 as shown in figure 6, and also see figure 7 inside 750: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“At block 630, N training sets including N sets of spectral central moments from the block 620 and N target outputs resulting from N measurement by a reference tool (at block 640) are formed and applied for training the ANN 150.” (¶37)… “The outputs of the trained ANN 750 (block 720) represent expected reference metrology tool measurement results on the same areas of the same samples” (¶39)];
“a plurality of hidden nodes connecting the plurality of input nodes to the output node,” [See plurality of hidden nodes 654 as shown in figure 6, also see inside 750 in figure 7: “A typical ANN may include a predefined number of input, output, and hidden nodes connected via synaptic interconnects (see arrows connecting input nodes 652 to hidden nodes 654 and hidden nodes 654 to output nodes 656 in ANN 150 in FIG. 6) represented by weighting functions (e.g., Wji connecting hidden node number j to input node number i and Wkj connecting output node number k to hidden node number j).” (¶24)…“ The feed-forward propagation may start with using the prepared set of inputs along with a set of initial values for the weighing functions relating the hidden nodes to the input nodes (Wji) and output nodes to the hidden nodes (Wkj) of the ANN 150 (see ANN 150 in FIG. 6).” (¶32)];
“determine at least one of whether to halt processing of the substrate or an adjustment for a processing parameter based on the characterizing value” [Examiner notes that the claim requires only one of 1. halt processing of the substrate or 2. an adjustment for a processing parameter. Gnojewski discloses: the neural network outputs a result, where the result is used for an adjustment to the thickness of the layer and for determining a halt (e.g.; endpoint) of the process: “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique. The silicon dioxide layer to be removed may cover a grating structure that includes active regions such as, but not limited to, transistors.” (¶19)… “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The output may be, but is not limited to a simulated metrology result such as a critical dimension or a thickness of a layer in a structure.” (¶28)], but doesn’t explicitly disclose, “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head;” measure a spectrum of light reflected from the substrate “during polishing;” “receive” “a measured spectrum of light reflected from the substrate undergoing polishing” “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,”
However, Singh discloses, “A polishing system, comprising: a support to hold a polishing pad; a carrier head to hold a substrate in contact with the polishing pad; a motor to generate relative motion between the support and the carrier head;” [See the polishing system 170 including CMP device 171 with carrier head and polishing pad 175 (e.g.; support that holds pad and includes a carrier head to push the pad against the layer), a spindle 173 (i.e.; motor that generates relative motion): “a CMP driving system 170 which operate cooperatively in order to control a CMP device 171” “The CMP driving system 170 selectively controls the CMP device 171. The CMP device 171 performs the polishing process and includes one or more CMP process components such as, for example, a spindle 173, a polishing pad 175, an optical wave guide 177 (optical fiber), and/or a polishing liquid 179.” (col 8, lines 1-12)… “The CMP device 171 components such as, for example, the spindle 173, the polishing pad 175 and the optical wave guide 177 are positioned to begin the polishing process.” (col 11, lines 3-6)];
 measure a spectrum of light reflected from the substrate “during polishing;” “receive” “a measured spectrum of light reflected from the substrate undergoing polishing” [See the measurement of the spectrum of the light is performed during the polishing process and the measured spectrum of the light is collected: “the system 100 being employed to monitor a chemical mechanical polishing process” “the wafer 110 (top layer 115 and substrate 113) is undergoing a CMP process;” “The CMP device 171 components such as, for example, the spindle 173, the polishing pad 175 and the optical wave guide 177 are positioned to begin the polishing process.” “during the polishing process, the CMP monitoring system 150 may be employed. The target light source 185 projects one or more beams of light 205 onto the layer 115 (top surface of wafer 110). Reflected light 210 is detected by the one or more light detectors 187 and collected by the CMP monitoring system 150 according to scatterometry techniques.” (col 10, lines 64-67 and col 11, lines 3-13)], but doesn’t explicitly disclose, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,”
However, Sahu discloses, “reduce a dimensionality of the measured spectrum to generate a plurality of component values using a transformation based on a plurality of reference spectra,” [See that the dimensionality of the measured spectrum is reduced to generate a plurality of components by performing a transformation of the measured spectrum (e.g.; orthogonal decomposition, spectral mixture resolution, etc.) based on a plurality of reference spectra (e.g.; predetermined spectrum data): “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the polishing system with a support with carrier head that holds a polishing pad and keeps it in contact with the substrate layer, and motor that generates relative motion for polishing, and the capability of performing the measurement and receiving/collecting of the spectrum of the light during the polishing process taught by Singh, and combined the capability of reducing dimensionality of measured data to generate components by performing transformation on the measured spectrum data based on predetermined spectrum data taught by Sahu with the polishing monitoring system taught by Gnojewski as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to enable methods to be performed more quickly and offer significant computational savings [Sahu: “systems and methods disclosed herein offer significant computational savings over pre-existing hyperspectral imaging systems, enable methods to be performed more quickly (e.g., systems may be operated with high conveyor belt speeds),” (¶40)], and in order to effectively optimize the on-going polishing process [Singh: “Monitoring the subject wafer as it proceeds through the polishing process effectively allows one to visualize the wafer's appearance and progress in order to effectively optimize the on-going polishing process.” (col 4, lines 58-62)].
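
For illustration only, the following minimal sketch shows the data flow recited in claim 27: input nodes receive the component values together with a value of a processing or environmental variable, hidden nodes connect the input nodes to a single output node, and the output characterizing value is used to decide whether to halt processing. The weights, the input values, the choice of a normalized platen temperature as the extra input, and the threshold test for halting are all hypothetical and are not taken from the claims or the cited references.

```python
import math

def infer_characterizing_value(component_values, process_values, w_ji, w_kj):
    """Forward pass of an already-trained network (inference only)."""
    # Input nodes: component values followed by process/environment values.
    inputs = list(component_values) + list(process_values)
    for row in w_ji:
        if len(row) != len(inputs):
            raise ValueError("each hidden node needs one weight per input node")
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w_ji]
    # Single output node producing the characterizing value.
    return sum(w * h for w, h in zip(w_kj, hidden))

def should_halt(characterizing_value, target_thickness):
    """Assumed endpoint rule: halt once the estimate reaches the target."""
    return characterizing_value <= target_thickness

# Hypothetical trained weights: two hidden nodes, three input nodes.
w_ji = [[0.4, -0.1, 0.2], [0.3, 0.5, -0.6]]
w_kj = [1.2, 0.8]

# Two component values plus one hypothetical normalized platen temperature.
estimate = infer_characterizing_value([0.5, -0.2], [0.3], w_ji, w_kj)
halt = should_halt(estimate, target_thickness=0.1)
```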

Claim(s) 28:
	Regarding claim 28, Gnojewski, Singh, and Sahu disclose all the elements of claim 27.
	Regarding claim 28, Gnojewski further discloses, “spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra” [See feature extraction process, where the system measures and generates spectrum data of samples, and then that data is used to extract features: “Reflectometer 130 may be used to measure spectral intensity response (SIR) of a sample 110.” “determining critical dimensions, feature sizes, layer thicknesses, and/or endpoints of semiconductor processes. The semiconductor processes may include, but are not limited to, removal of a silicon dioxide layer using a Chemical Mechanical Planarization (CMP) technique.” (¶19)… “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology).” (¶26)], but doesn’t explicitly disclose, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.”
However, Sahu discloses, “wherein the transformation is based on spectral feature components extracted from the plurality of reference spectra by feature extraction applied to the plurality of reference spectra.” [See that the transformation is based on spectral features extracted from reference/predetermined spectra: “a combination of various data reduction and analysis techniques like” “orthogonal decomposition, spectral mixture resolution, etc. could be used” “to reduce the dimensionality of the data and extract relevant features which could be used by a classifier. Preferably, Euclidean distance is implemented during processing to simplify processing of the scanned lines and/or one or more pixels of a scanned line. The Euclidean distance quickly provides calculated spectrum values that may be compared to predetermined spectrum data or threshold values such that the presence of foreign material may be accurately detected.” (¶32)… “In step 410, spectral feature values, e.g., a mean value of reflectance intensity, may be calculated for each known material and/or for a class of known material, for which image data is acquired. In step 412, threshold values may be determined for known materials and/or product material. Threshold values may be implemented in processes similar to those described herein (e.g., in method 300) and may be compared to scanned line or pixel data acquired by a real-time detection and removal system” (¶78)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the above described teachings of Sahu with the polishing monitoring system taught by Gnojewski, Singh, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination for the same reasons as described above in claim 27.

Claim(s) 29 (amended):
	Regarding claim 29, Gnojewski, Singh, and Sahu disclose all the elements of claim 27.
	Regarding claim 29, Gnojewski further discloses, “train the artificial neural network by backpropagation using training data” [See that the back-propagation technique is used to train the neural network using the training data: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)… “A cost function may be defined to evaluate a cumulative error at each output node. The cost function may be function showing a relationship between the cumulative errors at each output node as a function of the weighting functions. The minimization of the cost function through a number of iterations may form the basis of a back-propagation of parameters of the ANN 150. In the back-propagation, the ANN 150 may iterate through a process of training by changing the weighting functions according to predefined guidelines until some exit conditions are fulfilled. The exit condition may be defined based on a maximum number of iterations (e.g., 100), convergence of the cost function error below a predefined limit (e.g. 0.001), or the changes in weighing functions being less than a predefine threshold. The weighting function changes may occur at different times during the learning phase depending on the training approach selected by a user.” (¶33)].
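For illustration only, the three exit conditions quoted from Gnojewski ¶33 (a maximum number of iterations, the cost falling below a predefined limit, or the weight changes falling below a threshold) can be sketched as a small back-propagation loop. This sketch is not taken from Gnojewski or from applicant's specification. The function name `train_with_exit_conditions`, the single hidden layer, the learning rate, and the weight-change threshold of 1e-6 are all assumptions; only the example values 100 and 0.001 come from the quoted passage.

```python
import math
import random


def train_with_exit_conditions(samples, hidden=4, lr=0.05, max_iter=100,
                               cost_tol=1e-3, dw_tol=1e-6, seed=0):
    """Train a one-hidden-layer network by back-propagation.

    samples: list of (inputs, target) pairs, e.g. (moments, thickness).
    Returns (cost, iterations, reason), where reason names the exit
    condition that stopped training.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Hypothetical initial weights; the source does not specify them.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    cost = float("inf")
    for it in range(1, max_iter + 1):
        g1 = [[0.0] * n_in for _ in range(hidden)]
        g2 = [0.0] * hidden
        cost = 0.0
        for x, y in samples:
            # Forward pass: tanh hidden layer, linear output node.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
            err = sum(w * hj for w, hj in zip(w2, h)) - y
            cost += 0.5 * err * err  # cumulative squared error at the output
            # Backward pass: accumulate gradients for both layers.
            for j in range(hidden):
                g2[j] += err * h[j]
                back = err * w2[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    g1[j][i] += back * x[i]
        # Exit condition: cost below the predefined limit.
        if cost < cost_tol:
            return cost, it, "cost_converged"
        # Gradient-descent update, tracking the largest weight change.
        max_dw = 0.0
        for j in range(hidden):
            step = lr * g2[j]
            w2[j] -= step
            max_dw = max(max_dw, abs(step))
            for i in range(n_in):
                step = lr * g1[j][i]
                w1[j][i] -= step
                max_dw = max(max_dw, abs(step))
        # Exit condition: weight changes below the threshold.
        if max_dw < dw_tol:
            return cost, it, "weights_stalled"
    # Exit condition: maximum number of iterations reached.
    return cost, max_iter, "max_iterations"


if __name__ == "__main__":
    # Made-up training pairs: two input values and one target value each.
    data = [([0.1, 0.9], 0.23), ([0.4, 0.2], 0.24),
            ([0.8, 0.5], 0.50), ([0.3, 0.7], 0.29)]
    print(train_with_exit_conditions(data))
```

The cost check is placed before the weight update so that the returned cost always corresponds to the weights actually evaluated in that iteration.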
	“training data generated by performing dimensional reduction on two or more of the plurality of reference spectra that have known characteristic values,” [See that dimensional reduction is performed on more than one data point (e.g., 2048 data points), where the data have known characteristics (i.e., known structures), and that the reduced data are used as inputs to train the neural network, as shown in figure 6: “The simulated SIR may relate to a number of measured dimensions associated with the known structure, including critical dimensions (e.g., dimension of smallest feature sizes in semiconductor technology). The measured dimensions of the known structure may be used as the input parameters of the RCWA.” (¶26)… “As a method of compressing the data set, the processor 160 may be used to automatically determine the spectral central moments of the reflectometer SIR data to reduce the amount of data to be used.” (¶21)… “The spectral central moments may be viewed as a compressed form of the SIR data.” “converting the SIR data into 3rd order spectral central moments may compress a set of 2048 data points into eight central moments for each polarization. The spectral central moments may represent certain weighted averages of the set of 2048 data points computed by the processor 160. Considering that a reflectometer 130 may be a polarized reflectometer that may generate an SIR including a set of 4096 data points for both polarizations, the total number of spectral central moments generated by the processor 160 for the two polarizations of the reflectometer 130 may include 16 points (i.e., 2*8).” (¶22)… “the spectral central moments associated with the SIRs resulting from a number of measurements may be used as inputs to the ANN 150 to train the ANN 150.” “The simulated results obtained from a trained ANN may represent expected values resulting from the measurement of the same feature (e.g., a thickness of a layer) measured by the reflectometer, as measured by a metrology tools” (¶23)… “The spectral central moments may be applied as inputs to the ANN 150 to train the ANN 150. The ANN 150, once trained, may be used to generate output data.” (¶27)].
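For illustration only, the data compression quoted from Gnojewski ¶22 (2048 data points per polarization reduced to eight values, 16 values for two polarizations) can be sketched as follows. The quoted passages do not define how the eight moments are computed, so the definition used here is an assumption: the intensity-weighted centroid followed by central moments of orders 2 through 8. The function names `spectral_central_moments` and `compress_polarizations` are hypothetical and do not appear in the reference.

```python
def spectral_central_moments(wavelengths, intensities, n_moments=8):
    """Reduce one spectrum to n_moments weighted averages.

    Assumed definition (not from the cited reference): the first value is
    the intensity-weighted centroid, and the remaining values are central
    moments of orders 2..n_moments about that centroid.
    """
    if len(wavelengths) != len(intensities):
        raise ValueError("wavelengths and intensities must have equal length")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    # Normalize intensities so they act as weights that sum to one.
    weights = [v / total for v in intensities]
    centroid = sum(w * x for w, x in zip(weights, wavelengths))
    moments = [centroid]
    for order in range(2, n_moments + 1):
        moments.append(
            sum(w * (x - centroid) ** order for w, x in zip(weights, wavelengths))
        )
    return moments


def compress_polarizations(wavelengths, spectrum_a, spectrum_b, n_moments=8):
    """Concatenate the moments of two polarizations (2 * n_moments values)."""
    return (spectral_central_moments(wavelengths, spectrum_a, n_moments)
            + spectral_central_moments(wavelengths, spectrum_b, n_moments))


if __name__ == "__main__":
    # Made-up spectra with 2048 points per polarization.
    wl = [i / 2047 for i in range(2048)]
    pol_a = [1.0 + 0.5 * x for x in wl]
    pol_b = [2.0 - x * x for x in wl]
    features = compress_polarizations(wl, pol_a, pol_b)
    print(len(features))  # number of values that would feed the input nodes
```

Under these assumptions, the 16 resulting values would play the role of the reduced-dimension training inputs, in place of the 4096 raw data points.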

Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski, Singh, and Sahu, and further in view of Yoshida (US20180264619A1) [hereinafter Yoshida].
Claim(s) 30:
	Regarding claim 30, Gnojewski, Singh, and Sahu disclose all the elements of claim 27, but they do not explicitly disclose, “modify a pressure in the carrier head based on the characterizing value.”
	Regarding claim 30, Yoshida discloses, “modify a pressure in the carrier head based on the characterizing value.” [Examiner notes that paragraph 25 of applicant’s specification describes the characterizing value as “a thickness of the layer”; accordingly, the “film-thickness” described by Yoshida corresponds to the characterizing value. See that the pressure of the head is modified/adjusted (e.g., controlled/adjusted to meet target pressure values) based on the characterizing value (e.g., the target pressure values are determined based on film thickness): “The operation controller 7 is configured to set target pressure values for the pressure chambers P1 to P4, respectively, based on the film-thickness profile and the film-thickness distribution that have been produced from the film-thickness data, and operate the pressure regulators R1 to R4 so that the pressures in the pressure chambers P1 to P4 are maintained at the corresponding target pressure values.” (¶98)… “the polishing head including: a head body;” “and at least three pressure regulators configured to control pressures in the at least three actuating chambers” (¶23)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of adjusting pressure based on characterizing values by Yoshida with the polishing monitoring system taught by Gnojewski, Singh, and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to eliminate variation in film thickness along a circumferential direction of a substrate [Yoshida: “can eliminate the variation in film thickness along a circumferential direction of a substrate” (¶26)].

Claims 31 and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Gnojewski and Sahu, and further in view of David (US20180358271A1) [hereinafter David].
Claim(s) 31 (amended):
	Regarding claim 31, Gnojewski and Sahu disclose all the elements of claims 11 and 18, but they do not explicitly disclose, “wherein the measurement from the another sensor in the processing system in which the substrate undergoes the processing comprises a measurement of a temperature of the substrate, and wherein the measurement is obtained from a temperature sensor.”
However, David discloses, “wherein the measurement from the another sensor in the processing system in which the substrate undergoes the processing comprises a measurement of a temperature of the substrate, and wherein the measurement is obtained from a temperature sensor.” [See that a temperature measurement is obtained (e.g., of the CMP process) and that the measured temperature is used as an input: “The input data may be collected during wafer fabrication,” “input data can be collected from the process equipment 720 during steps for etch, CMP, gap fill, blanket, RTP, etc., and may include process variables such as” “temperature,” (¶85)… “Process equipment measurements or metrics can also be used as inputs to the algorithm, such as” “temperature sensors,” “This data can be collected in process steps” “Examples of these include” “CMP polish times,” (¶75)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of measuring temperature of the substrate with a temperature sensor and inputting the measured temperature data taught by David with the polishing monitoring system taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to increase yield and improve device performance [David: “this new access to deeper understanding and insight can then be leveraged to increase yield, improve device performance,” (¶41)].



Claim(s) 33:
	Regarding claim 33, Gnojewski and Sahu disclose all the elements of claims 11 and 18, but they do not explicitly disclose, “wherein the value from the processing recipe stored by the controller of the processing system comprises a polishing parameter used for polishing the substrate, and wherein the value is obtained from the controller.”
However, David discloses, “wherein the value from the processing recipe stored by the controller of the processing system comprises a polishing parameter used for polishing the substrate, and wherein the value is obtained from the controller.” [See that a value of a polishing parameter, such as pressure, is obtained from a controller: “input data can be collected from the process equipment 720 during steps for etch, CMP, gap fill, blanket, RTP, etc., and may include process variables such as” “pressure,” (¶85)].
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to have combined the capability of obtaining a value of a parameter such as pressure taught by David with the polishing monitoring system taught by Gnojewski and Sahu as discussed above. A person of ordinary skill in the spectrographic monitoring system field would have been motivated to make such combination in order to increase yield and improve device performance [David: “this new access to deeper understanding and insight can then be leveraged to increase yield, improve device performance,” (¶41)].


Response to Arguments
Applicant's arguments filed 07/07/2022 have been fully considered but they are not persuasive.
Applicant responds
(a)	Section 103 Rejections 
	Applicant respectfully submits that the cited portions of Gnojewski and Sahu, either alone or in combination, do not disclose or suggest these features of amended claim 11.
	In particular, the cited portions of Gnojewski do not disclose or suggest a trained artificial neural network that includes one or more input nodes that are configured to receive the values of the processing or environmental variables after training of the artificial neural network. Instead, 
	Accordingly, Applicant respectfully submits that claim 11 and its dependent claims are in condition for allowance. Independent claim 21 and its dependent claims are allowable for corresponding reasons.
(Page: 11-)

With respect to (a) above, Examiner appreciates the interpretative description given by Applicant in response. 
	As described in the 35 U.S.C. 112(a) and 112(b) section, there are the following deficiencies, such that there is no clear description in the specification for the following limitations:
i) “one or more values” of one or more processing or environmental variables,
ii) “receiving” one or more values of one or more processing or environmental variables,
iii) the received one or more values “do not depend on a thickness of a layer in the substrate,”
iv) generate a characterizing value using a trained artificial neural network,
v) trained artificial neural network (e.g., fully trained, and not being trained when the generating operation is performed),
vi) one or more input nodes of the plurality of input nodes to receive the values of the one or more processing or environmental variables (see i) and ii) above),
vii) wherein the one or more input nodes of the plurality of input nodes are configured to receive the values of the processing or environmental variables after training of the artificial neural network (see i), ii), and iv) above).

	Further regarding claim 27, as described in the 35 U.S.C. 112(a) and 112(b) section, there are the following deficiencies, such that there is no clear description in the specification for the following limitations:
“generate a characterizing value using an artificial neural network in an inference mode,” such that the characterizing value is generated when the artificial neural network is in an inference mode.
The plurality of input nodes receiving the plurality of components when the artificial neural network is in an inference mode.
Input nodes receiving the one or more values of the one or more processing or environmental variables when the artificial neural network is in an inference mode. Further, as described above, the specification does not provide any description of input nodes receiving the one or more values of the one or more processing or environmental variables.
Output node outputting the characterizing value when the neural network operates in the inference mode.

	Under the broadest reasonable interpretation, the limitations that include the above-described deficiencies are construed as follows:
The value can be any value of any processing or environmental parameter, irrespective of the thickness of a layer in the substrate.
The trained neural network can be any neural network that has been trained, has had some training, or is being trained.
The values of the one or more processing or environmental variables can be any values related to any processing or environmental variable, and a processing or environmental variable can be any variable related to the process or the environment of the process.
The one or more input nodes of the plurality of input nodes receive any values related to the environment or the process at any time, including after the neural network is trained, after the neural network has had some training, or while the neural network is being trained.
	Regarding claim 27,
The other processing or environmental variables are taken into account for the purpose of calculating the characterizing value, such that the value can be any value of any processing or environmental parameter, irrespective of the thickness of a layer in the substrate.
The values of the one or more processing or environmental variables can be any values related to any processing or environmental variable, and a processing or environmental variable can be any variable related to the process or the environment of the process.
The one or more input nodes of the plurality of input nodes receive any values related to the environment or the process at any time, including after the neural network is trained, after the neural network has had some training, or while the neural network is being trained.
A characterizing value is generated using an artificial neural network when the artificial neural network is in an operation mode (e.g., after training or during training).
	As described in the current office action, Gnojewski and Sahu disclose all the elements of claims 11 and 21, and Gnojewski, Singh, and Sahu disclose all the elements of claim 27. 
Applicant’s arguments are fully considered, but for the above described reasons, they are not persuasive; therefore, claims 11-31 and 33 are rejected under 35 USC 103 in view of the references as presented in the current office action.

(b)	Section 103 Rejections 
	Applicant respectfully submits that the cited portions of Gnojewski and Sahu, either alone or in combination, do not disclose or suggest these features of amended claim 11.
	In particular, the cited portions of Gnojewski do not disclose or suggest a trained artificial neural network that includes one or more input nodes that are configured to receive the values of the processing or environmental variables after training of the artificial neural network. Instead, 
	Accordingly, Applicant respectfully submits that claim 11 and its dependent claims are in condition for allowance. Independent claim 21 and its dependent claims are allowable for corresponding reasons.
(Page: 11-)

With respect to (b) above, Examiner appreciates the interpretative description given by Applicant in response.
Applicant’s arguments are fully considered, but for the same reasons as described above with respect to (a), they are not persuasive; therefore, claims 11-31 and 33 are rejected under 35 USC 103 in view of the references as presented in the current office action.

Conclusion
	The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed in the PTO-892 Notice of References Cited document.
	
US20140032463A1 - Accurate and fast neural network training for library-based critical dimension (cd) metrology:
	A method of accurate neural network training for library-based CD metrology includes optimizing a threshold for a principal component analysis (PCA) of a spectrum data set to provide a principal component (PC) value. A training target for one or more neural networks is estimated. The one or more neural networks are trained based both on the training target and on the PC value provided from optimizing the threshold for the PCA. A spectral library is provided based on the one or more trained neural networks (¶7).

	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED SHAFAYET whose telephone number is (571)272-8239. The examiner can normally be reached M-F 8:30 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth M Lo can be reached on (571)272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.S./
Examiner
Art Unit 2116



/KENNETH M LO/
Supervisory Patent Examiner, Art Unit 2116