DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application was filed on 09/24/2019. Claims 1-20 are pending and have been examined. Claims 1, 7, and 15 are independent claims.
The present application claims the benefit of provisional application 62/738,060 (filed on 09/28/2018) and PCT/US19/53304 (filed on 09/26/2019).

Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 02/14/2020. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 7-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claim 7:
Step 1:  Claim 7 recites a method, and thus a process, one of the four statutory categories of patentable subject matter.
Step 2A Prong 1:  The claim recites the limitations
using the one or more outputs for semiconductor processing fault detection,
which is a mental process capable of being performed in the human mind (for example, observation and evaluation).
wherein the input is based on runs of manufacturing processes of semiconductor processing equipment; obtaining one or more outputs [from the trained LSTM RNN model], the one or more outputs comprising reconstruction data - "obtaining outputs comprising reconstruction data from inputs based on runs of manufacturing processes of semiconductor processing equipment" is a mental process.
Step 2A Prong 2:  The claim recites the additional element of
a trained long short-term memory (LSTM) recurrent neural network (RNN) model. The claim merely uses the trained long short-term memory (LSTM) recurrent neural network (RNN) model as a tool to perform the mental process (see MPEP 2106.05(f)).
Thus, none of the additional elements integrates the abstract idea into a practical application, and the claim is directed towards an abstract idea.
Step 2B:  The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instruction to apply a judicial exception does not amount to significantly more. See MPEP 2106.05(f). Therefore, the claim does not include additional elements which provide an inventive concept or represent significantly more than the abstract idea, and the claim is not patent eligible.
Claim 8: 
Claim 8, dependent on claim 7, recites the additional mental process steps of abstract ideas (time windowing the trace data to generate a plurality of sequenced data sets, wherein each of the plurality of sequenced data sets corresponds to a respective time window, wherein the input comprises the plurality of sequenced data sets). Claim 8 further recites the additional element (receiving, from a plurality of sensors, trace data corresponding to the manufacturing processes of the semiconductor processing equipment), which is insignificant extra-solution activity of mere data gathering (MPEP 2106.05(g)). The insignificant extra-solution activity of data gathering is well-understood, routine, and conventional (see MPEP 2106.05(d)(II), “Receiving or transmitting data over a network”). Claim 8 further recites the additional element (semiconductor processing fault detection is associated with one or more of semiconductor manufacturing for wafers or display manufacturing), which merely indicates a field of use and technological environment (MPEP 2106.05(h), “Limiting the abstract idea of collecting information, analyzing it, and displaying certain results of the collection and analysis to data related to the electric power grid, because limiting application of the abstract idea to power-grid monitoring is simply an attempt to limit the use of the abstract idea to a particular technological environment, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016)”).
Claims 11 and 12:
Claims 11 and 12, dependent on claim 7, recite the additional mental process steps of abstract ideas (wherein the input comprises a current plurality of sequenced data sets, wherein the encoder determines a compressed representation of the input, wherein the decoder uses the compressed representation to predict a future plurality of sequenced data sets; wherein using the one or more outputs for…. fault detection; comparing the input to the reconstruction data to generate model reconstruction error; and identifying an anomaly responsive to determining that the model reconstruction error is greater than a threshold error). Claims 11 and 12 further recite the additional element (the LSTM RNN model comprises an encoder and a decoder); the limitation does not make use of nor apply any abstract idea in order to provide an improvement in a computer or any other technology. The claims merely use the trained long short-term memory (LSTM) recurrent neural network (RNN) model as a tool to perform the mental process (see MPEP 2106.05(f)).
Claim 9: 
Claim 9, dependent on claim 8, recites the additional mental process steps of abstract ideas (“wherein the input comprises the plurality of sequenced data sets at a first set of windows of time, wherein the reconstruction data comprises predicted sequenced data sets at a second set of windows of time, wherein each window of time of the second set of windows of time is offset from a corresponding window of time of the first set of windows of time by one or more windows of time”) but no additional elements that could provide a practical application or provide significantly more than the abstract idea.
Claim 10: 
Claim 10, dependent on claim 7, further recites the additional element (wherein the LSTM RNN model comprises a plurality of layers of LSTM cells, wherein output of a first layer of the plurality of layers is input to a second layer of the plurality of layers); the limitation does not make use of nor apply any abstract idea in order to provide an improvement in a computer or any other technology. The claim merely uses the trained long short-term memory (LSTM) recurrent neural network (RNN) model as a tool to perform the mental process (see MPEP 2106.05(f)).
Claim 13:
Claim 13, dependent on claim 12, further recites the additional mental process steps of abstract ideas (“generating a plurality of anomaly scores from the one or more outputs, wherein each of the plurality of anomaly scores corresponds to a respective sensor of a plurality of sensors; and ranking contribution to the model reconstruction error by each of the plurality of sensors based on the plurality of anomaly scores”; “causing an anomaly response action to occur in response to detecting the anomaly”) but no additional elements that could provide a practical application or provide significantly more than the abstract idea.
Claims 15-20:
Claims 15-20 recite a non-transitory computer readable storage medium having instructions stored thereon, which, when executed by a processing device, perform the method of claims 7-12, and thus recite generic computer components upon which to execute the recited abstract idea (MPEP 2106.05(f)). Therefore, claims 15-20 are rejected for the reasons set forth in the rejection of claims 7-12, respectively.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention. 

Claims 1-12 and 14-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kim (“DeepNAP: Deep neural anomaly pre-detection in a semiconductor fab”).

Claim 1. 
Kim teaches A method comprising: training a long short-term memory (LSTM) recurrent neural network (RNN) model for semiconductor processing fault detection, the training of the LSTM RNN model comprising (1. Introduction & Page 2 “We propose DeepNAP, an RNN based model that pre-detects anomalies….. We demonstrate the efficacy of DeepNAP on the real multivariate semiconductor manufacturing process dataset” and 4.2. Prediction module & Page 4 “We use a recurrent neural network with long short-term memory (LSTM)”):
generating training data for the LSTM RNN model, wherein the generating of the training data comprises generating first training input and first target output based on normal runs of manufacturing processes of semiconductor processing equipment (5.1. Dataset & Page 7, 1st paragraph “Each point in the dataset was obtained every 30 s to consider communication loads between fab sensors and the manufacturing process” and 5.2. Baselines & proposed models & Page 7, 2nd paragraph “we preprocessed the training set to make it completely normal … using a non-anomalous training dataset is crucial” and Table 1 and Table 2 teach generating training data for the model);
and providing the training data to train the LSTM RNN model on the first training input and the first target output to generate a trained LSTM RNN model for the semiconductor processing fault detection (5.2. Baselines & proposed models & Page 7, Paragraphs 4-5 “we trained DeepNAP on raw training data which contains non-annotated anomalous signals….. we trained it on raw training data to show the effectiveness of partial reconstruction” teaches using training data to train the model (LSTM) to provide output).

Claim 2. 
Kim teaches The method of claim 1 further comprising:
Kim further teaches receiving, from a plurality of sensors, trace data corresponding to the normal runs of the manufacturing processes of the semiconductor processing equipment (5.1. Dataset & Page 7 “Each point in the dataset was obtained every 30 s to consider communication loads between fab sensors and the manufacturing process” teaches sensor data received in the system and 5.1. Dataset & Page 6 “We tested our pre-detection model on the real multivariate time series Semiconductor Manufacturing Process (SMP) dataset” teaches runs of the Semiconductor Manufacturing Process);
and time windowing the trace data to generate a plurality of sequenced data sets, wherein each of the plurality of sequenced data sets corresponds to a respective time window (4.2. Prediction module & Page 4, 3rd paragraph “Given an input sequence s0:t = [x0, x1, . . . , xt ], where x is a single multivariate data point and t is a current time step, we want to maximize the probability of observing the next k inputs, st+1:t+k = [xt+1, xt+2, . . . , xt+k]” teaches time windowing for each of the sequenced data sets),
wherein the first training input and the first target output are based on at least a subset of the plurality of sequenced data sets (Figure 2 teaches output based on a subset of the input),
wherein semiconductor processing fault detection is associated with one or more of semiconductor manufacturing for wafers or display manufacturing (5.1. Dataset & Page 6, Paragraph 6 “We tested our pre-detection model on the real multivariate time series Semiconductor Manufacturing Process (SMP) dataset” and Figure 1 teach fault detection for semiconductor processing).
Claim 3. 
Kim teaches The method of claim 2, 
Kim further teaches wherein the first training input comprises a first subset of the plurality of sequenced data sets at a first set of windows of time and a second subset of the plurality of sequenced data sets at a second set of windows of time, wherein each window of time of the second set of windows of time is offset from a corresponding window of time of the first set of windows of time by one or more windows of time (4.2. Prediction module & Page 4, 3rd paragraph “Given an input sequence s0:t = [x0, x1, . . . , xt ], where x is a single multivariate data point and t is a current time step, we want to maximize the probability of observing the next k inputs, st+1:t+k = [xt+1, xt+2, . . . , xt+k]” and Figure 2 teach first and second sets of windows of time).
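For illustration only, the time windowing and window offset recited in claims 2 and 3 can be sketched as follows. This sketch is not taken from Kim or from the application; the function names `time_windows` and `offset_pairs`, the window width, and the stride are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def time_windows(trace: Sequence[float], width: int, stride: int = 1) -> List[List[float]]:
    """Split a trace into fixed-width windows (one sequenced data set per window)."""
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    return [list(trace[i:i + width]) for i in range(0, len(trace) - width + 1, stride)]


def offset_pairs(windows: List[List[float]], offset: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Pair each window with the window `offset` positions later (hypothetical helper)."""
    if offset < 1:
        raise ValueError("offset must be at least one window")
    return list(zip(windows[:-offset], windows[offset:]))


# Example: six samples of a single sensor trace, windows of three samples.
trace = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
windows = time_windows(trace, width=3)
pairs = offset_pairs(windows, offset=1)
```

Each pair holds a window from the first set and the corresponding window from the second set, offset by one window of time.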
Claim 4. 
Kim teaches The method of claim 2, 
Kim further teaches wherein the first target output is same as the first training input, wherein the first training input comprises the plurality of sequenced data sets (4.1. Overview of our approach & Page 4, 2nd paragraph “we feed input signals into our model for training” and 4.2. Prediction module & Page 4, Paragraph 4 “We use a recurrent neural network with long short-term memory (LSTM) as our encoder to capture the sequential patterns of inputs” and Figure 2 teach that outputs x1 and x2 are the same as inputs x1 and x2).
Claim 5. 
Kim teaches The method of claim 1, 
Kim further teaches wherein the LSTM RNN model comprises a plurality of layers of LSTM cells, wherein output of a first layer of the plurality of layers is input to a second layer of the plurality of layers (3.1. Long short-term memory & Page 3, 3rd paragraph “A multi-layered LSTM can be constructed using ht as new inputs” teaches that the output of the first layer is the input of the second layer).
Claim 6. 
Kim teaches The method of claim 1, 
Kim further teaches wherein the LSTM RNN model comprises an encoder and a decoder, wherein the encoder determines a compressed representation of the first training input, wherein the decoder uses the compressed representation to predict the first target output (Page 3 “Given an input sequence s0:t = [x0, x1, . . . , xt ], LSTM encodes the input sequence into hidden states, ht = LSTM(xt , ht−1, θenc) where θenc denotes the encoder’s trainable parameters. The decoder decodes the last hidden state of the encoder into an output sequence [ ˆy0, ˆy1, . . . , ˆyk]” teaches an LSTM comprising a decoder and an encoder, wherein the encoder compresses the input sequence into hidden states and the decoder uses that compressed data).

Claim 7:
Kim teaches A method comprising: providing input to a trained long short-term memory (LSTM) recurrent neural network (RNN) model (1. Introduction & Page 2 “We propose DeepNAP, an RNN based model that pre-detects anomalies….. We demonstrate the efficacy of DeepNAP on the real multivariate semiconductor manufacturing process dataset” and 4.2. Prediction module & Page 4 “We use a recurrent neural network with long short-term memory (LSTM)”),
wherein the input is based on runs of manufacturing processes of semiconductor processing equipment (4.4. Deep neural anomaly pre-detection model & Page 6, Paragraph 3 “Given an input sequence s0:t, the prediction module outputs the predicted sequence ˆst+1:t+k.” teaches input received and 5.1. Dataset & Page 6 “We tested our pre-detection model on the real multivariate time series Semiconductor Manufacturing Process (SMP) dataset” teaches input based on runs of the Semiconductor Manufacturing Process);
obtaining one or more outputs from the trained LSTM RNN model, the one or more outputs comprising reconstruction data; and using the one or more outputs for semiconductor processing fault detection (5.2. Baselines & proposed models & Page 7 “we trained DeepNAP on raw training data which contains non-annotated anomalous signals” teaches training the DeepNAP model and 4.4. Deep neural anomaly pre-detection model & Page 4 “Given an input sequence s0:t, the prediction module outputs the predicted sequence ˆst+1:t+k…..The objective function for the detection module with partial reconstruction is as follows: [Equation (15), image not reproduced] where [symbol, image not reproduced] corresponds to the reconstructed version of [symbol, image not reproduced] and [expression, image not reproduced] We jointly minimize Eqs. (10) and (15) to train DeepNAP in an end-to-end fashion… using predicted sequences can be useful in anomaly pre-detection, existing signals s0:t can also be helpful in detecting anomalies” teaches obtaining, from the trained model, output which comprises reconstruction data, the output sequence being useful in anomaly detection (fault detection)).

Claim 8: 
Kim teaches The method of claim 7 further comprising: 
Kim further teaches receiving, from a plurality of sensors, trace data corresponding to the manufacturing processes of the semiconductor processing equipment (5.1. Dataset & Page 7 “Each point in the dataset was obtained every 30 s to consider communication loads between fab sensors and the manufacturing process” teaches sensor data received in the system);
and time windowing the trace data to generate a plurality of sequenced data sets, wherein each of the plurality of sequenced data sets corresponds to a respective time window (4.2. Prediction module & Page 4, 3rd paragraph “Given an input sequence s0:t = [x0, x1, . . . , xt ], where x is a single multivariate data point and t is a current time step, we want to maximize the probability of observing the next k inputs, st+1:t+k = [xt+1, xt+2, . . . , xt+k]” teaches time windowing for each of the sequenced data sets),
wherein the input comprises the plurality of sequenced data sets (Figure 2 teaches sequenced data sets),
wherein semiconductor processing fault detection is associated with one or more of semiconductor manufacturing for wafers or display manufacturing (5.1. Dataset & Page 6, Paragraph 6 “We tested our pre-detection model on the real multivariate time series Semiconductor Manufacturing Process (SMP) dataset” and Figure 1 teach fault detection for semiconductor processing).

Claim 9. 
Kim teaches The method of claim 8, 
Kim further teaches wherein the input comprises the plurality of sequenced data sets at a first set of windows of time, wherein the reconstruction data comprises predicted sequenced data sets at a second set of windows of time, wherein each window of time of the second set of windows of time is offset from a corresponding window of time of the first set of windows of time by one or more windows of time (4.2. Prediction module & Page 4, 3rd paragraph “Given an input sequence s0:t = [x0, x1, . . . , xt ], where x is a single multivariate data point and t is a current time step, we want to maximize the probability of observing the next k inputs, st+1:t+k = [xt+1, xt+2, . . . , xt+k]” and Figure 2 teach first and second sets of windows of time).

Claim 10. 
Kim teaches The method of claim 7, 
Kim further teaches wherein the LSTM RNN model comprises a plurality of layers of LSTM cells, wherein output of a first layer of the plurality of layers is input to a second layer of the plurality of layers (3.1. Long short-term memory & Page 3, 3rd paragraph “A multi-layered LSTM can be constructed using ht as new inputs” teaches that the output of the first layer is the input of the second layer).

Claim 11. 
Kim teaches The method of claim 7, 
Kim further teaches wherein the LSTM RNN model comprises an encoder and a decoder, wherein the input comprises a current plurality of sequenced data sets, wherein the encoder determines a compressed representation of the input, wherein the decoder uses the compressed representation to predict a future plurality of sequenced data sets (Page 3 “Given an input sequence s0:t = [x0, x1, . . . , xt ], LSTM encodes the input sequence into hidden states, ht = LSTM(xt , ht−1, θenc) where θenc denotes the encoder’s trainable parameters. The decoder decodes the last hidden state of the encoder into an output sequence [ ˆy0, ˆy1, . . . , ˆyk]” teaches an LSTM comprising a decoder and an encoder, wherein the encoder compresses the input sequence into hidden states and the decoder uses that compressed data).

Claim 12:
Kim teaches The method of claim 7, 
Kim further teaches wherein using the one or more outputs for semiconductor processing fault detection comprises: comparing the input to the reconstruction data to generate model reconstruction error (4.3. Detection module & Page 5 “whether the signals are anomalous. Existing neural network based approaches mostly use reconstruction loss to detect anomalies. The outputs of the detection module used in the study of Malhotra [23] are the reconstructed inputs, ˆs0:t…..In our experiments, we observed that an LSTM function approximates an identity function to some degrees, which inhibits effective anomaly detection” teaches anomaly detection by the model, which compares data);
and identifying an anomaly responsive to determining that the model reconstruction error is greater than a threshold error (5.2. Baselines & proposed models & Page 7, Paragraph 5 “As each model’s performance can vary depending on the threshold values, we controlled the thresholds of each model to achieve the best performances on the validation set…..we used the validation set to find hyper parameter sets that yield the highest AUC. The resulting hyper parameters of DeepNAP are listed in Table 2. Time and space complexities of each model are shown in Table 3” teaches threshold values tuned against the validation set, the validation set being used to obtain the highest AUC (reconstruction error)).
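For illustration only, the comparison and thresholding recited in claim 12 can be sketched as follows. This sketch is not taken from Kim or from the application; the use of mean squared error as the reconstruction error, the function names, and the threshold values are assumptions chosen for the example.

```python
from typing import Sequence


def reconstruction_error(inputs: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Mean squared error between the input and its reconstruction (assumed metric)."""
    if len(inputs) != len(reconstruction) or len(inputs) == 0:
        raise ValueError("inputs and reconstruction must be non-empty and equal in length")
    return sum((x - r) ** 2 for x, r in zip(inputs, reconstruction)) / len(inputs)


def is_anomaly(inputs: Sequence[float], reconstruction: Sequence[float], threshold: float) -> bool:
    """Flag an anomaly when the reconstruction error is greater than the threshold error."""
    return reconstruction_error(inputs, reconstruction) > threshold


# Example: the third sample is reconstructed poorly.
observed = [1.0, 2.0, 3.0]
reconstructed = [1.0, 2.0, 5.0]
error = reconstruction_error(observed, reconstructed)
flagged = is_anomaly(observed, reconstructed, threshold=1.0)
```

In practice the threshold would be selected on held-out data, as in the validation-set tuning quoted above from Kim.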

Claim 14:
Kim teaches The method of claim 12, further comprising: causing an anomaly response action to occur in response to detecting the anomaly (“If a model accurately predicts unseen signals, an anomaly detection model can incorporate the predicted future information to make early alarms”).

Claims 15-20:
Claims 15-20 recite a non-transitory computer readable storage medium having instructions stored thereon and executed by a processing device, the processing device performing precisely the method of claims 7-12. As Kim performs their method on a computer (Kim, 1. Introduction & Page 2 “We propose DeepNAP, an RNN based model that pre-detects anomalies….. We demonstrate the efficacy of DeepNAP on the real multivariate semiconductor manufacturing process dataset”), in which a non-transitory computer readable storage medium is inherent, claims 15-20 are rejected for the reasons set forth in the rejections of claims 7-12, respectively.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Malhotra (“LSTM-based Encoder-Decoder for Multi-sensor Anomaly Detection”). 
Claim 13:
Kim teaches The method of claim 12 further comprising:
Kim further teaches generating a plurality of anomaly scores from the one or more outputs, wherein each of the plurality of anomaly scores corresponds to a respective sensor of a plurality of sensors (5.1. Dataset & Page 7, Paragraph 1 “Each point in the dataset was obtained every 30 s to consider communication loads between fab sensors and the manufacturing process” and 5.3. Results & Page 8 “Table 4 summarizes the performance of each model on the SMP dataset….The F1 and AUC scores of our proposed models are significantly higher than those of baseline models. DeepNAP (-Prediction), which is a single partial reconstruction model, achieved an F1 score of 0.829 without using any recurrent neural networks” teach anomaly scores from the output, the anomaly scores corresponding to a sensor);
and ranking contribution to the model reconstruction error by each of the plurality of sensors (5.1. Dataset & Page 7, Paragraph 1 “Each point in the dataset was obtained every 30 s to consider communication loads between fab sensors and the manufacturing process” teaches order of the error in model contribution).
While Kim teaches reconstruction error by each of the plurality of sensors, Kim does not teach doing so based on the plurality of anomaly scores.
Malhotra, however, teaches reconstruction error by each of the plurality of sensors based on the plurality of anomaly scores (3.1. Datasets & Page 3 “contains readings for 12 sensors such as coolant temperature, torque, accelerator (control variable), etc” and 2. EncDec-AD & Page 2 “The reconstruction errors are then used to obtain the likelihood of a point in a test time-series being anomalous s.t. for each point x(i), an anomaly score a(i) of the point being anomalous is obtained. A higher anomaly score indicates a higher likelihood of the point being anomalous” and Figure 1 teach reconstruction error by each of the anomaly scores, the anomaly scores covering predictable and unpredictable signals).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Kim by using the plurality of sensors based on the plurality of anomaly scores, as does Malhotra, for the reconstruction error by each of the plurality of sensors. The motivation to do so is that “EncDec-AD gives better results for Engine-NP” and Engine-NP uses multi-sensor data, which is similar to that of Kim (Malhotra, 3.2. Observations & pg. 4, 2nd column, 1st paragraph).
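For illustration only, the per-sensor anomaly scores and the ranking recited in claim 13 can be sketched as follows. This sketch is not taken from Kim, Malhotra, or the application; it substitutes a per-sensor mean squared reconstruction error for the anomaly score, and the sensor names and function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def sensor_scores(inputs: Dict[str, Sequence[float]],
                  reconstruction: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """One score per sensor: mean squared reconstruction error on that sensor's channel."""
    scores: Dict[str, float] = {}
    for name, values in inputs.items():
        rec = reconstruction[name]
        if len(values) != len(rec) or len(values) == 0:
            raise ValueError(f"channel {name!r} is empty or mismatched in length")
        scores[name] = sum((x - r) ** 2 for x, r in zip(values, rec)) / len(values)
    return scores


def rank_sensors(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order sensors from largest to smallest contribution to the reconstruction error."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Example with three hypothetical sensor channels.
observed = {"pressure": [1.0, 1.0], "temperature": [2.0, 2.0], "flow": [0.0, 0.0]}
reconstructed = {"pressure": [1.0, 1.0], "temperature": [2.0, 5.0], "flow": [1.0, 0.0]}
ranking = rank_sensors(sensor_scores(observed, reconstructed))
```

The sensor listed first in `ranking` contributes most to the reconstruction error under this assumed scoring.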

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOKESHA G PATEL whose telephone number is (571)272-6267. The examiner can normally be reached Monday-Friday 8am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LOKESHA G PATEL/Examiner, Art Unit 2125     
/BRIAN M SMITH/Primary Examiner, Art Unit 2122