DETAILED ACTION
This is the first office action regarding application number 16/361,915, filed March 22, 2019.


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Specification
The disclosure is objected to because of the following informalities:
Paragraph [0032]: The subscript abbreviations (PS, V0) are not defined (e.g., $\mathrm{Open}_{PS}$, $\mathrm{Open}_{V0}$, $\mathrm{Bridge}_{PS}$). Appropriate correction is required.

Claim Objections
Claim 11 is objected to because of the following informality: A missing semi-colon mark [;] at the end of the following claim limitation: “computing, through the local phase of a volume diagnosis procedure, the training probability distributions from the diagnosis reports, each of the training probability distributions respectively corresponding to one of the training dies[;]”. Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because the system recited in independent Claim 8 (and inherited in the associated dependent Claims 9-14) is directed to software per se, which is not one of the four categories of patent eligible subject matter recited in 35 U.S.C. 101 (process, machine, manufacture, or composition of matter). The terms “model training engine” and “volume diagnosis adjustment engine” recited in Claim 8 do not invoke 112(f) claim interpretation since the term “engine” is defined by the Merriam-Webster dictionary (merriam-webster.com/dictionary/engine, retrieved on 2/7/2022) as “computer software that performs a fundamental function especially of a larger program”, and hence the term “engine” is not considered as a nonce term/generic placeholder (as it fails to meet the 112(f) three-prong test at Step A). Therefore, according to the provided Merriam-Webster dictionary definition, both terms “model training engine” and “volume diagnosis adjustment engine” are forms of computer software programs, which results in independent Claim 8 only reciting software elements, and hence directing independent Claim 8 to a software per se implementation. Applicant is advised to positively recite hardware as part of this system identified in independent Claim 8 (i.e., a computer processor and memory/non-transitory machine-readable medium) in order to resolve the 101 rejection to allow eligibility of independent Claim 8 and its associated dependent Claims 9-14.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 8-12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Cheng et al., Volume Diagnosis Data Mining, 2017 22nd IEEE European Test Symposium (ETS), 10 pages [hereafter referred to as Cheng], in view of Benware et al., Determining a Failure Root Cause Distribution From a Population of Layout-Aware Scan Diagnosis Results, IEEE Design & Test of Computers, 2012, pp.8-18 [hereafter referred to as Benware].
Regarding Claim 1, Cheng teaches
A method comprising: by a computing system (Examiner’s note: Cheng teaches accessing diagnosis reports containing multiple fault, defect, physical feature information, and creating and training a volume diagnosis Bayesian network model to determine probability distributions of faults and associated probable root causes, and using cross validation techniques to produce training and test data based on the domain knowledge information provided in the volume diagnosis reports, all of which require use of a computing system containing a computer processor executing instructions, where the instructions are stored on a computer-readable medium (Cheng p.2 Section II. Volume Diagnosis Model; pp.6-8 Section V. Volume Diagnosis Practical Usages, Figures 5(a),(b), and Figures 6(a)-(g)).): 
accessing a diagnosis report for a given circuit die that has failed scan testing (Examiner’s note: Cheng teaches a diagnosis driven yield analysis (DDYA) method where diagnosis reports are inspected for defect, physical feature, and fault information, where the physical feature represents probable root causes, and where the diagnosis reports are produced as a result of scan diagnosis used for determining defect locations on semiconductor devices (Cheng p.1 Abstract; p.1 col.1 Section I. Introduction 1st-3rd paragraphs: “Scan diagnosis … is used to determine the defect locations and defect mechanism for a given failing device and the scan test patterns used. … With physical defect features reported by diagnosis tools, several papers have proposed to use volume (large amount of) diagnosis reports with appropriate statistical analysis to automatically identify a common physical defect feature …”; p.1 col.2 Section I. Introduction 2nd-6th paragraphs; and p.2 col.1 Section I. Introduction 4th-5th paragraphs: “… With volume diagnosis reports, DDYA uses statistical method to identify the correct distribution of these diagnosis reports … DDYA in this paper is based on MLE. So DDYA problem has two parts: how to get correct volume diagnosis likelihood model and how to ensure the distribution identified by MLE is correct with limited diagnosis reports. … In this paper, a Bayesian network [26] is used to model volume diagnosis reports …”).); 
computing, through a local phase of a volume diagnosis procedure, a probability distribution for the given circuit die from the diagnosis report, wherein the probability distribution specifies probabilities for different root causes as having caused the given circuit die to fail (Examiner’s note: In light of applicant’s specification paragraph [0010], a “local phase of a volume diagnosis procedure” is defined as a phase involving individual failed ICs (i.e., semiconductor devices). Cheng teaches creating a Bayesian network model by first determining a probability P(r) (i.e., the probability of one diagnosis report) based on a distribution of probabilities including identifying all mutually exclusive and independent root causes P(c) that are associated with specific defects and various identified faults for an individual failed IC associated with the single diagnosis report (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model)).); 
adjusting the probability distribution into an adjusted probability distribution using a supervised learning model, the supervised learning model trained with a training set comprising training probability distributions computed from training dies through the local phase of the volume diagnosis procedure (Examiner’s note: Cheng teaches using a most-likelihood estimation (MLE) method to compute and … (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model): “… Based on Bayesian network we assume that $P(v) = \prod P(r)$. That is the probability of all sampled volume diagnosis reports is equal to the product of the probability of each diagnosis report if all diagnosis reports are independent. … Most-Likelihood Estimation (MLE) finds the distribution of P(c) of all roots causes such at P(v) has maximum value. … MLE is accurate if the model used is correct and the sampled data is infinite. The accuracy of this Bayesian network depends on the accuracy of P(r|f), P(f|d), and P(d|c).”; Cheng p.6 col.1 2nd paragraph-col.2 2nd paragraph (Section V. Volume Diagnosis Practical Uses): “… The accuracy of P(v) depends on the accuracies of P(r|f), P(f|d) and P(d|c). … Without good domain knowledge, an alternative is to use supervised machine learning techniques to derive these parameters based on good training data. With more aggressive deep learning techniques it is possible that a new model can be created to replace the Bayesian network. Some domain knowledge is still needed to get proper training data.”; and Cheng p.6 col.2 4th paragraph-p.7 col.1 4th paragraph: “… As in volume diagnosis, P(r|f), P(f|d) and P(d|c) can be correct based on unlimited diagnosis reports … Over-fitting can be alleviated by increasing sample data size … There are several popular machine earning techniques to deal with the over-fitting problem … cross validation was used in DDYA … to divide the total sampled data set into N parts randomly. N-1 parts are used as training data and the remaining 1 part is used as test data. MLE finds the most likely distribution of training data and applied this distribution on testing data to measure its fitness.”).) …
providing the adjusted probability distribution for the given circuit die as an input … to determine a … distribution for multiple circuit dies that have failed the scan testing (Examiner’s note: As indicated earlier, Cheng further teaches applying cross-validation techniques to a total sampled data set representing a set of limited diagnosis reports, separating the sampled data set into training data and test data sets, where the test data set was used to evaluate the Bayesian network model and measure the fitness of the model by using the most likely distribution for the probability parameters that were identified and computed during the training phase (Cheng p.6 col.2 4th paragraph-p.7 col.2 2nd paragraph: “… As in volume diagnosis, P(r|f), P(f|d) and P(d|c) can be correct based on unlimited diagnosis reports … Over-fitting can be alleviated by increasing sample data size … There are several popular machine earning techniques to deal with the over-fitting problem … cross validation was used in DDYA … to divide the total sampled data set into N parts randomly. N-1 parts are used as training data and the remaining 1 part is used as test data. MLE finds the most likely distribution of training data and applied this distribution on testing data to measure its fitness. … The fitting distribution which is most similar to underlying distribution fits best in test data as shown in Figure 6(c) and 6(f).”).).  
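For illustration only, the cross-validation procedure quoted from Cheng above (divide the sampled reports into N parts, fit on N-1 parts, measure fitness on the remaining part) can be sketched as follows. This is a minimal sketch under stated assumptions, not Cheng's implementation: the two root causes, the two report types "A" and "B", their likelihood values, the function names, and the use of a grid search in place of a full MLE solver are all hypothetical choices made for the example.

```python
import math
import random

# Hypothetical likelihood table: P(report type | root cause) for two root causes.
# Each tuple is (P(report | cause 0), P(report | cause 1)); all entries are > 0.
LIKELIHOOD = {"A": (0.9, 0.2), "B": (0.1, 0.8)}


def simulate_reports(n, true_theta=(0.7, 0.3), seed=1):
    """Draw n toy diagnosis reports; each is stored as its likelihood row."""
    rng = random.Random(seed)
    reports = []
    for _ in range(n):
        cause = 0 if rng.random() < true_theta[0] else 1
        kind = "A" if rng.random() < LIKELIHOOD["A"][cause] else "B"
        reports.append(LIKELIHOOD[kind])
    return reports


def log_likelihood(theta, reports):
    """log P(v) = sum of log P(r), with P(r) = sum_k theta[k] * P(r | c_k)."""
    return sum(math.log(sum(t * p for t, p in zip(theta, row))) for row in reports)


def fit_mle_grid(reports, steps=100):
    """Approximate MLE by grid search over theta = (w, 1 - w)."""
    candidates = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
    return max(candidates, key=lambda theta: log_likelihood(theta, reports))


def cross_validate(reports, n_folds=5, seed=0):
    """Fit on N-1 folds and return the mean held-out log-likelihood per fold."""
    shuffled = reports[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        theta = fit_mle_grid(train)
        scores.append(log_likelihood(theta, test) / len(test))
    return scores


if __name__ == "__main__":
    data = simulate_reports(500)
    print("fitted distribution:", fit_mle_grid(data))
    print("held-out fitness per fold:", cross_validate(data))
```

A higher (less negative) held-out score indicates a fitted distribution that generalizes better to reports not used for fitting, which is the fitness measure the quoted passage describes.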
While Cheng teaches building a supervised learning model with proper training data to further improve the probability parameters represented in the volume diagnosis Bayesian network, Cheng does not explicitly teach
… each training probability distribution labeled with an actual root cause that caused a given training die to fail …
… providing the adjusted probability distribution … as an input to a global phase of the volume diagnosis procedure to determine a global root cause distribution …
Benware teaches 
… each training probability distribution labeled with an actual root cause that caused a given training die to fail (Examiner’s note: Benware teaches performing experiments based on simulated defect responses using the RCD method (Benware p.10 Figure 1), where the experiments involve creating a population of diagnosis reports and analyzing them using the RCD method for a root cause distribution with only a single root cause specified, where this specifying of a single root cause for a root cause distribution associated with a diagnosis report (containing defect information associated with possible root causes) is a form of labeling (Benware p.14 col.1-p.15 col.1 (Results from simulated defect experiments): “Experiments based on simulated defect responses in an IC have been carried out to evaluate the accuracy of RCD. In each experiment, the following steps are followed. 1) Specify a root cause distribution. … In each experiment, a population of diagnosis reports is created and analyzed with RCD for a root cause distribution with only a single root cause specified. Only a subset of possible root cause was used as the injected root cause, however, each root cause model type (e.g., critical area shorts) is represented in the results.”).) …
… providing the adjusted probability distribution … as an input to a global phase of the volume diagnosis procedure to determine a global root cause distribution (Examiner’s note: In light of applicant’s specification paragraph [0010], a “global phase of a volume diagnosis procedure” is defined as a phase involving a population of failed ICs (i.e., semiconductor devices). As indicated earlier, Benware teaches performing experiments based on simulated defect responses using a root cause deconvolution (RCD) method involving the creation of a Bayesian network model (Benware p.10 Figure 1; p.11 Figure 2; pp.11-p.12 Creating the diagnosis Bayes net; and p.14 col.1-p.15 col.1 (Results from simulated defect experiments).), where this RCD analysis involves performing an expectation-maximization (EM) algorithm to learn and adjust the probability parameters identified in the Bayesian network equations $c_{i,l}^{(t)}$, $\theta_{l}^{(t+1)}$, and $P(RC_{i} = rc_{l} \mid \theta^{(t+1)})$ (Benware pp.12-14 Learning the parameter values, in particular p.12 col.2 3rd paragraph: “The primary objective of the learning phase is to determine the unknown parameter values of the model, in this case 𝛉, which leads to an understanding of the root cause distribution. … if one could accurately compute the likelihood of each root cause for each symptom, one could again determine the parameter 𝛉 by summing the likelihood values for each root cause and thus determining P(RC). … using the well-known expectation-maximization (EM) algorithm.”). Benware further teaches applying the RCD analysis (and hence the Bayesian network model and its adjusted probability distribution) on all cores for four lots manufactured on a 28-nm bulk process, where separate populations of failing devices were created for each layout configuration and each manufactured lot to produce a plurality of populations for RCD analysis (Benware pp.15-16 col.2 1st paragraph-p.16 1st paragraph (Results from 28-nm yield ramp): “… this section presents the results from applying the methodology to the early stages of a 28-nm yield ramp. … RCD was performed on all cores for four lots manufactured on a 28-nm bulk process … the data were processed with RCD as independent populations. A separate population of failing devices was created for each layout configuration and each manufactured lot, making for a total of 24 populations for RCD analysis. After all the analyses were completed, total root cause estimates per lot were obtained by summing the results obtained from each layout configuration.”).) …
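For illustration only, the learning step quoted from Benware above (compute the likelihood of each root cause for each symptom, then sum those values per root cause to obtain P(RC)) has the shape of a standard EM iteration for mixture weights. The sketch below assumes that the per-report likelihoods P(report i | root cause l) are already available as a table; the function name, the table layout, and the toy numbers are hypothetical, and the code is not taken from Benware.

```python
def em_root_cause_distribution(likelihoods, n_iter=200):
    """Estimate theta[l] = P(RC = rc_l) from per-report likelihood rows.

    likelihoods[i][l] is an assumed, strictly positive value of
    P(report i | root cause l). Returns the estimated distribution theta.
    """
    n_causes = len(likelihoods[0])
    theta = [1.0 / n_causes] * n_causes  # start from a uniform distribution
    for _ in range(n_iter):
        totals = [0.0] * n_causes
        for row in likelihoods:
            # E-step: posterior weight of each root cause for this report.
            joint = [t * p for t, p in zip(theta, row)]
            norm = sum(joint)
            for l, value in enumerate(joint):
                totals[l] += value / norm
        # M-step: sum the weights per root cause and normalize by report count.
        theta = [total / len(likelihoods) for total in totals]
    return theta


if __name__ == "__main__":
    # Toy population: 69 reports of one type and 31 of another (hypothetical).
    population = [(0.9, 0.2)] * 69 + [(0.1, 0.8)] * 31
    print(em_root_cause_distribution(population))
```

Each pass reweights every report by the current estimate and then re-estimates the distribution from the summed weights, so the estimate is driven by the whole population of reports rather than by any single die.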
Both Cheng and Benware are analogous art since they both teach using scan diagnosis reports to create a Bayesian network model to estimate and adjust root cause distributions.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the scan diagnosis reports taught in Cheng and label them with a specified root cause taught in Benware as a way to improve the estimations of root cause distributions produced by the Bayesian network model. The motivation to combine is taught in Benware, as limiting the number of identified root causes to the most relevant limits the complexity of a model/algorithm performing the root cause analysis, as well as reducing the occurrence of the training data inadvertently overfitting the model, effectively improving the memory storage required for the model as well as the accuracy of the predictions generated by the model/algorithm (Benware p.9 col.1 1st-3rd paragraph; p.13 col.2 3rd paragraph).
Regarding Claim 2, Cheng in view of Benware teaches
The method of claim 1, wherein the volume diagnosis procedure utilizes an unsupervised learning model to 
compute the probability distribution for the given circuit die (Examiner’s note: Under its broadest reasonable interpretation, an unsupervised learning model is a model that learns and groups patterns from received data. As indicated earlier, Cheng teaches creating a Bayesian network model by first determining a probability P(r) (i.e., the probability of one diagnosis report) based on a distribution of probabilities including identifying all mutually exclusive and independent root causes P(c) that are associated with specific defects and various identified faults for an individual failed IC associated with the single diagnosis report (Cheng p.1 1st paragraph-p.2 1st paragraph and p.2 Figure 1 (Section I. Introduction); and Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model)).), 
determine the global root cause distribution for multiple circuit dies, or both.  
Regarding Claim 3, Cheng in view of Benware teaches
The method of claim 1, wherein the global phase of the volume diagnosis procedure is performed using a root cause deconvolution (RCD) model (Examiner’s note: As indicated earlier, Benware teaches simulating defect responses for a population of created diagnosis reports by injecting a root cause (such as a critical area short), and analyzing these reports using a root cause deconvolution method, including applying an EM algorithm to learn and adjust parameter values in a Bayesian network (Benware p.10 Figure 1; p.11 Figure 2; pp.11-p.12 Creating the diagnosis Bayes net; pp.12-13 Learning the parameter values; and p.14 col.1-p.15 col.1 (Results from simulated defect experiments).). As indicated earlier, Benware further teaches applying the RCD analysis (and hence the Bayesian network model and its adjusted probability distribution) on all cores for four lots manufactured on a 28-nm bulk process, where separate populations of failing devices were created for each layout configuration and each manufactured lot to produce a plurality of populations for RCD analysis (Benware pp.15-16 col.2 1st paragraph-p.16 1st paragraph (Results from 28-nm yield ramp)).).  
Regarding Claim 4, Cheng in view of Benware teaches
The method of claim 1, further comprising generating the supervised learning model, including by: 
accessing the training dies, wherein each training die has been injected with a given root cause to actually cause a scan test failure (Examiner’s note: As indicated earlier, Benware teaches simulating defect responses for a population of created diagnosis reports by injecting a root cause (such as a critical area short), and analyzing these reports using a root cause deconvolution (RCD) method, where each created diagnosis report contains failure information detected by scan testing on a tester for a defective die (Benware p.9 col.2 5th paragraph-p.10 col.1 1st paragraph (Layout-aware diagnosis) and p.14 col.1-p.15 col.1 (Results from simulated defect experiments)).); 
generating diagnosis reports for each of the training dies (Examiner’s note: As indicated earlier, Benware teaches simulating defect responses for a population of created diagnosis reports, where each created diagnosis report contains failure information detected by scan testing on a tester for a defective die (Benware p.9 col.2 5th paragraph-p.10 col.1 1st paragraph (Layout-aware diagnosis) and p.14 col.1-p.15 col.1 (Results from simulated defect experiments)).); 
computing, through the local phase of a volume diagnosis procedure, the training probability distributions from the diagnosis reports, each of the training probability distributions respectively corresponding to one of the training dies (Examiner’s note: As indicated earlier, Cheng teaches creating a Bayesian network model by first determining a probability P(r) (i.e., the probability of one diagnosis report) based on a distribution of probabilities including identifying all mutually exclusive and independent root causes P(c) that are associated with specific defects and various identified faults for an individual failed IC associated with the single diagnosis report (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model)). This computing and creation of a Bayesian network model taught in Cheng is functionally analogous to the computing and creation of a Bayesian network model taught in Benware, where Benware also teaches modelling the Bayesian network through determination of probabilities for the root causes and associated suspect defects as part of the root cause deconvolution method involving an EM algorithm performing a most-likelihood estimation, where the defect information for determining these probabilities is also retrieved from the created diagnosis reports (Benware p.10 Figure 1; p.11 Figure 2; and p.11 col.1-p.12 col.2 (Creating the diagnosis Bayes net)).); 
labeling each of the training probability distributions with the given root cause for the training die corresponding to the training probability distribution, the given root cause indicative of the actual root cause for the training probability distribution (Examiner’s note: As indicated earlier, Benware teaches performing experiments based on simulated defect responses using the RCD method (Benware p.10 Figure 1), where the experiments involve creating a population of diagnosis reports and analyzing them using the RCD method for a root cause distribution with only a single root cause specified, where this specifying of a single root cause for a root cause distribution associated with a diagnosis report (containing defect information associated with possible root causes) is a form of labeling (Benware p.14 col.1-p.15 col.1 (Results from simulated defect experiments)).); and 
providing, as the training set, the labeled training probability distributions to train the supervised learning model (Examiner’s note: As indicated earlier, Benware teaches simulating defect responses for a population of created diagnosis reports by injecting a root cause, and analyzing these reports using the RCD method, where using the created Bayesian network model to analyze these reports to learn the probability parameters in the Bayesian network model is a part of the RCD analysis (Benware p.9 col.2 5th paragraph-p.10 col.1 1st paragraph (Layout-aware diagnosis) and Benware p.14 col.1-p.15 col.1 (Results from simulated defect experiments)).).  
Regarding Claim 5, Cheng in view of Benware teaches
The method of claim 4, wherein the training dies are generated via 
simulation (Examiner’s note: Cheng teaches using simulated defects to get failure files for the experiments (Cheng p.2 col.1 2nd paragraph: “In [16-24] to verify the effectiveness of these techniques, handful of silicon defects and lots of simulated defects are used to get failure files for the experiments.”), where one of the cited references, [19], is the Benware reference, where Benware teaches that the defect responses are simulated for a population of diagnosis reports, where each diagnosis report is based on scan diagnosis results reporting failure information for an individual IC (Benware p.8 col.2 2nd paragraph and p.14 col.1-p.15 col.1 (Results from simulated defect experiments): “Experiments based on simulated defect responses in an IC have been carried out to evaluate the accuracy of RCD. … In each experiment, a population of diagnosis reports is created and analyzed with RCD for a root cause distribution with only a single root cause specified. Only a subset of possible root cause was used as the injected root cause, however, each root cause model type (e.g., critical area shorts) is represented in the results.”).), emulation, or a combination of both.  
Regarding Claim 8, Cheng teaches
A system comprising: 
a model training engine configured to train a supervised learning model with a training set comprising training probability distributions computed for training dies through a local phase of a volume diagnosis procedure (Examiner’s note: In light of applicant’s specification paragraph [0010], a “local phase of a volume diagnosis procedure” is defined as a phase involving individual failed ICs (i.e., semiconductor devices). Cheng teaches applying cross-validation techniques to a total sampled data set, separating the sampled data set into training data and test data sets, where the training data set was used in a training phase to perform the MLE method to find the most likely distribution for the probability parameters identified in the volume diagnosis Bayesian network model. Cheng further teaches that the MLE method is used for improving the accuracy of these probability parameters, and that a combination of supervised machine learning, deep learning techniques, and domain knowledge is used to generate proper training data to build a supervised learning model, where the process of improving the accuracy of these probability parameters represents a form of adjustment of the probability distribution, and where this generation of proper training data includes using the domain knowledge information related to each of the probability parameters (where this domain knowledge is found in the scan diagnosis reports associated with individual failed ICs, thus forming a set of training data for each defective IC) (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model): “… Based on Bayesian network we assume that $P(v) = \prod P(r)$. That is the probability of all sampled volume diagnosis reports is equal to the product of the probability of each diagnosis report if all diagnosis reports are independent. … Most-Likelihood Estimation (MLE) finds the distribution of P(c) of all roots causes such at P(v) has maximum value. … MLE is accurate if the model used is correct and the sampled data is infinite. The accuracy of this Bayesian network depends on the accuracy of P(r|f), P(f|d), and P(d|c).”; Cheng p.6 col.1 2nd paragraph-col.2 2nd paragraph (Section V. Volume Diagnosis Practical Uses): “… The accuracy of P(v) depends on the accuracies of P(r|f), P(f|d) and P(d|c). … Without good domain knowledge, an alternative is to use supervised machine learning techniques to derive these parameters based on good training data. With more aggressive deep learning techniques it is possible that a new model can be created to replace the Bayesian network. Some domain knowledge is still needed to get proper training data.”; and Cheng p.6 col.2 4th paragraph-p.7 col.1 4th paragraph: “… As in volume diagnosis, P(r|f), P(f|d) and P(d|c) can be correct based on unlimited diagnosis reports … Over-fitting can be alleviated by increasing sample data size … There are several popular machine earning techniques to deal with the over-fitting problem … cross validation was used in DDYA … to divide the total sampled data set into N parts randomly. N-1 parts are used as training data and the remaining 1 part is used as test data. MLE finds the most likely distribution of training data and applied this distribution on testing data to measure its fitness.”).), wherein: 
each given training probability distribution specifies probabilities for different root causes as having caused a given training die to fail as computed by the volume diagnosis procedure (Examiner’s note: Cheng teaches creating a Bayesian network model by first determining a probability P(r) (i.e., the probability of one diagnosis report) based on a distribution of probabilities including identifying all mutually exclusive and independent root causes P(c) that are associated with specific defects and various identified faults for an individual failed IC associated with the single diagnosis report (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model)).) …
a volume diagnosis adjustment engine configured to:   
access a diagnosis report for a given circuit die that has failed scan testing (Examiner’s note: Cheng teaches a diagnosis driven yield analysis (DDYA) method where diagnosis reports are inspected for defect, physical feature, and fault information, where the physical feature represents probable root causes, and where the diagnosis reports are produced as a result of scan diagnosis used for determining defect locations on semiconductor devices (Cheng p.1 Abstract; p.1 col.1 Section I. Introduction 1st-3rd paragraphs: “Scan diagnosis … is used to determine the defect locations and defect mechanism for a given failing device and the scan test patterns used. … With physical defect features reported by diagnosis tools, several papers have proposed to use volume (large amount of) diagnosis reports with appropriate statistical analysis to automatically identify a common physical defect feature …”; p.1 col.2 Section I. Introduction 2nd-6th paragraphs; and p.2 col.1 Section I. Introduction 4th-5th paragraphs: “… With volume diagnosis reports, DDYA uses statistical method to identify the correct distribution of these diagnosis reports … DDYA in this paper is based on MLE. So DDYA problem has two parts: how to get correct volume diagnosis likelihood model and how to ensure the distribution identified by MLE is correct with limited diagnosis reports. … In this paper, a Bayesian network [26] is used to model volume diagnosis reports …”).); 
compute, through the local phase of the volume diagnosis procedure, a probability distribution for the given circuit die from the diagnosis report, wherein the probability distribution specifies probabilities for different root causes as having caused the given circuit die to fail (Examiner’s note: As indicated earlier, Cheng teaches creating a Bayesian network model by first determining a probability P(r) (i.e., the probability of one diagnosis report) based on a distribution of probabilities including identifying all mutually exclusive and independent root causes P(c) that are associated with specific defects and various identified faults for an individual failed IC associated with the single diagnosis report (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model)).);
adjust the probability distribution into an adjusted probability distribution using the supervised learning model (Examiner’s note: As indicated earlier, Cheng teaches using a most-likelihood estimation (MLE) method to compute and solve the probability parameters represented in the volume diagnosis Bayesian network model. Cheng further teaches that the MLE method is used for improving the accuracy of these probability parameters, and that a combination of supervised machine learning, deep learning techniques, and domain knowledge is used to generate proper training data to build a supervised learning model, where the process of improving the accuracy of these probability parameters represents a form of adjustment of the probability distribution (Cheng p.2 col.1 last paragraph-col.2 4th paragraph (Section II. Volume Diagnosis Model): “… Based on Bayesian network we assume that
$P(v) = \prod P(r)$
. That is the probability of all sampled volume diagnosis reports is equal to the product of the probability of each diagnosis report if all diagnosis reports are independent. … Most-Likelihood Estimation (MLE) finds the distribution of P(c) of all roots causes such at P(v) has maximum value. … MLE is accurate if the model used is correct and the sampled data is infinite. The accuracy of this Bayesian network depends on the accuracy of P(r|f), P(f|d), and P(d|c).”; Cheng p.6 col.1 2nd paragraph-col.2 2nd paragraph (Section V. Volume Diagnosis Practical Uses): “… The accuracy of P(v) depends on the accuracies of P(r|f), P(f|d) and P(d|c). … Without good domain knowledge, an alternative is to use supervised machine learning techniques to derive these parameters based on good training data. With more aggressive deep learning techniques it is possible that a new model can be created to replace the Bayesian network. Some domain knowledge is still needed to get proper training data.”).); and
provide the adjusted probability distribution for the given circuit die as an input … to determine a … distribution for multiple circuit dies that have failed the scan testing (Examiner’s note: As indicated earlier, Cheng further teaches applying cross-validation techniques to a total sampled data set representing a set of limited diagnosis reports, separating the sampled data set into training data and test data sets, where the test data set was used to evaluate the Bayesian network model and measure the fitness of the model by using the most likely distribution for the probability parameters that were identified and computed during the training phase (Cheng p.6 col.2 4th paragraph-p.7 col.2 2nd paragraph: “… As in volume diagnosis, P(r|f), P(f|d) and P(d|c) can be correct based on unlimited diagnosis reports … Over-fitting can be alleviated by increasing sample data size … There are several popular machine earning techniques to deal with the over-fitting problem … cross validation was used in DDYA … to divide the total sampled data set into N parts randomly. N-1 parts are used as training data and the remaining 1 part is used as test data. MLE finds the most likely distribution of training data and applied this distribution on testing data to measure its fitness. … The fitting distribution which is most similar to underlying distribution fits best in test data as shown in Figure 6(c) and 6(f).”).).  
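For illustration only, the independence assumption and the N-part cross validation quoted from Cheng above can be sketched as follows. This is a minimal sketch and not Cheng's implementation: the function names volume_log_likelihood and n_fold_splits are hypothetical, the per-report probabilities P(r) are assumed to be already computed, and the logarithm of the product is used in place of the product itself to avoid numerical underflow.

```python
import math
import random


def volume_log_likelihood(report_probs):
    """Return log P(v), where P(v) is the product of P(r) over all reports.

    Assumes the diagnosis reports are independent and every P(r) > 0.
    """
    return sum(math.log(p) for p in report_probs)


def n_fold_splits(reports, n, seed=0):
    """Randomly divide the reports into n parts and yield (training, test) pairs.

    For each pair, n-1 parts form the training data and the remaining
    part forms the test data.
    """
    shuffled = list(reports)
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::n] for i in range(n)]
    for k in range(n):
        test = parts[k]
        train = [r for j, part in enumerate(parts) if j != k for r in part]
        yield train, test
```

In such a scheme, a distribution fitted on each training set would be scored by evaluating volume_log_likelihood on the corresponding test set, which corresponds to the fitness measurement described in the quoted passage.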
While Cheng teaches building a supervised learning model with proper training data to further improve the probability parameters represented in the volume diagnosis Bayesian network, Cheng does not explicitly teach
… each given training probability distribution is labeled with an actual root cause that caused the given training die to fail …
… provide the adjusted probability distribution … as an input to a global phase of the volume diagnosis procedure to determine a global root cause distribution …
Benware teaches
… each given training probability distribution is labeled with an actual root cause that caused the given training die to fail (Examiner’s note: Benware teaches performing experiments based on simulated defect responses using a root cause deconvolution (RCD) method involving the creation of a Bayesian network model (Benware p.10 Figure 1), where the experiments involve creating a population of diagnosis reports and analyzing them using the RCD method for a root cause distribution with only a single root cause specified, where this specifying of a single root cause for a root cause distribution associated with a diagnosis report (containing defect information associated with possible root causes) is a form of labeling (Benware p.14 col.2 4th paragraph: “Table 1 summarizes the design attributes and results of many simulation experiments involving only a single root cause for two different designs. In each experiment, a population of diagnosis reports is created and analyzed with RCD for a root cause distribution with only a single root cause specified. Only a subset of possible root causes was used as the injected root cause, however, each root cause model type (e.g., critical area shorts) is represented in the results.”).) …
… provide the adjusted probability distribution … as an input to a global phase of the volume diagnosis procedure to determine a global root cause distribution (Examiner’s note: In light of applicant’s specification paragraph [0010], a “global phase of a volume diagnosis procedure” is defined as a phase involving a population of failed ICs (i.e., semiconductor devices). As indicated earlier, Benware teaches  performing experiments based on simulated defect responses using a RCD method involving the creation of a Bayesian network model (Benware p.10 Figure 1; p.11 Figure 2; pp.11-p.12 Creating the diagnosis Bayes net; and p.14 col.1-p.15 col.1 (Results from simulated defect experiments).), where this RCD analysis involves performing an expectation-maximization (EM) algorithm to learn and adjust the probability parameters identified in the Bayesian network equations                         
$c_{i,l}^{(t)}$, $\theta_l^{(t+1)}$, and P($RC_i = rc_l \mid \theta^{(t+1)}$
) (Benware pp.12-14 Learning the parameter values, in particular p.12 col.2 3rd paragraph: “The primary objective of the learning phase is to determine the unknown parameter values of the model, in this case 𝛉, which leads to an understanding of the root cause distribution. … if one could accurately compute the likelihood of each root cause for each symptom, one could again determine the parameter 𝛉 by summing the likelihood values for each root cause and thus determining P(RC). … using the well-known expectation-maximization (EM) algorithm.”). Benware further teaches applying the RCD analysis (and hence the Bayesian network model and its adjusted probability distribution) on all cores for four lots manufactured on a 28-nm bulk process, where separate populations of failing devices were created for each layout configuration and each manufactured lot (Benware pp.15-16 col.2 1st paragraph-p.16 1st paragraph (Results from 28-nm yield ramp): “… this section presents the results from applying the methodology to the early stages of a 28-nm yield ramp. … RCD was performed on all cores for four lots manufactured on a 28-nm bulk process … the data were processed with RCD as independent populations. A separate population of failing devices was created for each layout configuration and each manufactured lot, making for a total of 24 populations for RCD analysis. After all the analyses were completed, total root cause estimates per lot were obtained by summing the results obtained from each layout configuration.”).) …
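For illustration only, the learning step quoted from Benware above (computing the likelihood of each root cause for each symptom, then summing those values per root cause to update the parameter) can be sketched as a generic expectation-maximization loop over mixture proportions. This is a sketch under stated assumptions and not Benware's published equations: the function name em_root_cause_distribution is hypothetical, and the input is assumed to be a table of per-device likelihoods P(symptom_i | rc_l) that has already been computed.

```python
def em_root_cause_distribution(likelihoods, iterations=50):
    """Estimate root cause proportions theta with a generic EM loop.

    likelihoods[i][l] is an assumed, precomputed value of
    P(symptom_i | rc_l); every row must contain at least one
    positive entry so that the normalization below is defined.
    """
    n_causes = len(likelihoods[0])
    theta = [1.0 / n_causes] * n_causes  # start from a uniform distribution
    for _ in range(iterations):
        totals = [0.0] * n_causes
        for row in likelihoods:
            # E-step: posterior weight of each root cause for this device.
            weighted = [t * p for t, p in zip(theta, row)]
            norm = sum(weighted)
            for l, w in enumerate(weighted):
                totals[l] += w / norm
        # M-step: sum the per-device weights and normalize by device count.
        theta = [t / len(likelihoods) for t in totals]
    return theta
```

Each pass keeps theta a valid probability distribution, since the per-device weights sum to one before they are averaged over the population.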
Both Cheng and Benware are analogous art since they both teach using scan diagnosis reports to create a Bayesian network model to estimate and adjust root cause distributions.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the scan diagnosis reports taught in Cheng and label them with a specified root cause taught in Benware as a way to improve the estimations of root cause distributions produced by the Bayesian network model. The motivation to combine is taught in Benware, as provided in the prior art claim mapping of Claim 1 provided above.
Regarding Claim 9, 
Claim 9 recites the system of claim 8, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 2, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 2, in view of the rejections applied to Claim 8.
Regarding Claim 10, 
Claim 10 recites the system of claim 8, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 3, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 3, in view of the rejections applied to Claim 8.
Regarding Claim 11, 
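Claim 11 recites the system of claim 8, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 4, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 4, in view of the rejections applied to Claim 8.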
Regarding Claim 12, 
Claim 12 recites the system of claim 11, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 5, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 5, in view of the rejections applied to Claim 11.
Regarding Claim 15, 
Claim 15 recites a non-transitory machine readable medium comprising processor executable instructions on a computing system, where those instructions comprise claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 1, and hence is rejected under similar rationale and motivations provided by Cheng in view of Benware as indicated in Claim 1. In addition, as indicated earlier, Cheng teaches accessing diagnosis reports containing fault, defect, and physical feature information, and creating and training a volume diagnosis Bayesian network model to determine probability distributions of faults and associated probable root causes, and using cross validation techniques to produce training and test data based on the domain knowledge information provided in the volume diagnosis reports, all of which require use of a computing system containing a computer processor executing instructions, where the instructions are stored on a computer-readable medium (Cheng p.2 Section II. Volume Diagnosis Model; pp.6-8 Section V. Volume Diagnosis Practical Usages, Figures 5(a),(b), and Figures 6(a)-(g)).
Regarding Claim 16, 
Claim 16 recites the non-transitory machine-readable medium of claim 15, where the non-transitory machine-readable medium further comprises instructions that include claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 2, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 2, in view of the rejections applied to Claim 15.
Regarding Claim 17, 
Claim 17 recites the non-transitory machine-readable medium of claim 15, where the non-transitory machine-readable medium further comprises instructions that include claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 4, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 4, in view of the rejections applied to Claim 15.
Regarding Claim 18, 
Claim 18 recites the non-transitory machine-readable medium of claim 17, where the non-transitory machine-readable medium further comprises instructions that include claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 5, and hence is rejected under similar rationale provided by Cheng in view of Benware as indicated in Claim 5, in view of the rejections applied to Claim 17.
Claims 6-7, 13-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Cheng et al., Volume Diagnosis Data Mining, 2017 22nd IEEE European Test Symposium (ETS), 10 pages [hereafter referred to as Cheng], in view of Benware et al., Determining a Failure Root Cause Distribution From a Population of Layout-Aware Scan Diagnosis Results, IEEE Design & Test of Computers, 2012, pp.8-18 [hereafter referred to as Benware] as applied to Claims 1, 8, and 15; in even further view of Rajski et al., U.S. PGPUB 2006/0066339, Determining and Analyzing Integrated Circuit Yield and Quality, published 3/30/2006 [hereafter referred to as Rajski].
Regarding Claim 6, Cheng in view of Benware as applied to Claim 1 teaches
The method of claim 1.
While Cheng in view of Benware teaches a supervised learning model, Cheng in view of Benware does not explicitly teach
wherein the supervised learning model comprises a linear function that linearly adjusts the probability distribution computed for the given circuit die.  
Rajski teaches
wherein the supervised learning model comprises a linear function that linearly adjusts the probability distribution computed for the given circuit die (Examiner’s note: Rajski Figure 32 teaches a Rajski Equation 12, where the estimates of                         
$p_{fail}(f_i)$
can be generated using well-known regression techniques (Rajski [0277], [0284]-[0291]). Rajski further teaches a data calibration step that iteratively reduces the estimation error caused by equivalent classes (represented by the failure probability parameters shown in Rajski Equation 12) (Rajski [0303]-[0309], in particular [0303]: “The predicted distribution of yield loss mechanisms are desirably calibrated such that, in the statistical sense, the estimation error caused by equivalent classes can be reduced. As shown in Fig. 32, data calibration (22.2) can be performed in an iterative fashion with diagnostic results computation (22.1).”).).
Both Cheng in view of Benware and Rajski are analogous art since they both teach analysis of fail test result data using probabilistic analysis methods to identify failure probabilities associated with a subset of physical feature defects.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the supervised learning model taught in Cheng in view of Benware and use the linear regression analysis technique taught in Rajski as a way to improve the analysis and computation of the feature fail probabilities. The motivation to combine is taught in Rajski, by showing that the estimation of the feature fail probabilities can be approximated through a linear equation, where the linear equation is easily solvable through performing calculations on a generic computer, hence improving the computational efficiency of the system (Rajski [0290]-[0291]).
Regarding Claim 7, Cheng in view of Benware, in even further view of Rajski teaches
The method of claim 6, 
wherein the linear function comprises an adjustment matrix that linearly adjusts an input probability distribution (Examiner’s note: As indicated earlier, Rajski teaches a data calibration step for iteratively reducing the estimation error computed in the data defect computation step, where the data calibration step can be summarized in the form of a linear equation shown in Rajski Equation 16, which Rajski Equation 15 that defines the probability distribution                         
$P(O_i)$
                     that a defect is predicted by diagnosis as class i, where the matrix 𝚪 shown in Rajski Equation 16 represents the conditional probability                         
$P(O_i \mid D_j)$
                     that is adjusted according to the data calibration step (Rajski [0305]).); and 
wherein the adjustment matrix has dimensions of 'N' x 'N', wherein 'N' is a number of different root causes in probability distributions computed by the local phase of a volume diagnosis procedure (Examiner’s note: As indicated earlier, Rajski teaches a data calibration step using Rajski Equation 16 (which contains a matrix 𝚪). Rajski further teaches performing data calibration based on assumptions of no ambiguity between different classes and ambiguity between different classes, where for the case of ambiguity between different classes, Rajski Equation 16 will be calibrated based on the identified equation shown in Rajski Equation 17, where matrix                         
$\Gamma^{-1}$
                     represents the inverse matrix of 𝚪, indicating that the matrix is a square matrix with equal dimensions i=j, where both i and j represent the number of defect classes, where these defect classes represent identified root causes (Rajski [0306]; [0154]: “… Process (21.1) is performed to try to identify the defect, respectively the class or subclass of the defect, which can best explain the failing behavior of the integrated circuit.”; and [0216]: “… each defect has an ID indicating which class it belongs to, in the event that all candidates fall into the same class …”).).  
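For illustration only, the role of a square 'N' x 'N' matrix of conditional probabilities P(O_i | D_j) and its inverse can be sketched as follows. This is a sketch under stated assumptions and does not reproduce Rajski Equations 15-17: it assumes, from the law of total probability, that the predicted-class distribution equals the matrix applied to the actual-class distribution, so that the actual-class distribution is recovered by solving the corresponding linear system. The function names mat_vec and solve are hypothetical.

```python
def mat_vec(matrix, vector):
    """Apply an N x N matrix to a length-N probability vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting.

    Equivalent to applying the inverse matrix to rhs; raises ValueError
    if the matrix is singular (no inverse exists).
    """
    n = len(matrix)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Choose the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical 2 x 2 example: column j holds P(O_i | D_j) and sums to 1.
gamma = [[0.9, 0.2],
         [0.1, 0.8]]
actual = [0.6, 0.4]                    # assumed actual-class distribution
predicted = mat_vec(gamma, actual)     # distribution as seen by diagnosis
recovered = solve(gamma, predicted)    # calibrated estimate of actual
```

The inverse exists only when the matrix is square and non-singular, which is why the number of predicted classes must equal the number of actual classes in this sketch.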
Regarding Claim 13, 
Claim 13 recites the system of claim 8, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 6, and hence is rejected under similar rationale and motivations provided by Cheng in view of Benware and Rajski as indicated in Claim 6, in view of the rejections applied to Claim 8.
Regarding Claim 14, 
Claim 14 recites the system of claim 13, where the system further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 7, and hence is rejected under similar rationale provided by Cheng in view of Benware, in further view of Rajski as indicated in Claim 7, in view of the rejections applied to Claim 13.
Regarding Claim 19, 
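Claim 19 recites the non-transitory machine-readable medium of claim 15, where the non-transitory machine-readable medium further comprises instructions that include claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 6, and hence is rejected under similar rationale and motivations provided by Cheng in view of Benware and Rajski as indicated in Claim 6, in view of the rejections applied to Claim 15.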
Regarding Claim 20, 
Claim 20 recites the non-transitory machine-readable medium of claim 19, where the non-transitory machine-readable medium further comprises instructions that include claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 7, and hence is rejected under similar rationale provided by Cheng in view of Benware, in further view of Rajski as indicated in Claim 7, in view of the rejections applied to Claim 19.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Benware, Robert Brady, U.S. PGPUB 2012/0297264, Root Cause Distribution Determination Based on Layout Aware Scan Diagnosis Results, published 11/22/2012.
Rajski et al., Analyzing Volume Diagnosis Results with Statistical Learning for Yield Improvement, 12th IEEE European Test Symposium (ETS ’07), IEEE 2007, 6 pages.
Benware et al., U.S. PGPUB 2014/0059511, Generating Root Cause Candidates for Yield Analysis, published 2/27/2014.
Schuermyer et al., U.S. PGPUB 2017/0052861, Identifying Failure Mechanisms Based on a Population of Scan Diagnostic Reports, published 2/23/2017.
Cheng et al., Automatic Identification of Yield Limiting Layout Patterns Using Root Cause Deconvolution on Volume Scan Diagnosis Data, 2017 IEEE 26th Asian Test Symposium, IEEE 2017, pp.214-219.
Wang et al., Machine Learning-Based Volume Diagnosis, 2009 Design, Automation Test in Europe Conference Exhibition (2009 EDAA), pp.902-905.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached on Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM WAI YIN KWAN/
Examiner, Art Unit 2121

/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121