DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 17 is objected to because of the following informalities:  the claim recites, “comprising program code executable by the processor to receive as input information generated by a previous dynamic analysis iteration perform dynamic analysis on the input information” which should be “comprising program code executable by the processor to receive as input, information generated by a previous dynamic analysis iteration, and perform dynamic analysis on the input information”.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 8-9, 13-15 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Miller et al. (US PGPUB 2021/0271587; hereinafter “Miller”) in view of Chari and Bhatt.
Claim 1:	
Miller teaches a system, comprising a processor (Fig. 9: Processor(s) 906) to:
receive a source code sample to be classified ([0064] “A source code program or code snippet is selected for analysis by one or more of the trained random forest models (block 802).”);
execute a code analysis to generate an internal analysis state ([0064] “The syntax-type tree generator parses the source code program to generate a syntax-type tree for each method in the program (block 804).”);
extract features from the internal analysis state ([0064] “The feature extraction component extracts the appropriate features from each syntax-type tree to form a feature vector for a respective model (block 806).”); and
generate a label based on the extracted features via a machine learning classifier model trained on internal analysis states of hybrid code analyses ([0065] “The feature vector is applied to each tree in the random forest for classification. A trained decision tree from the random forest is selected (block 808) and is tested against the trained and optimized parameters in each binary test in each node (block 810),” wherein the “hybrid code analyses” is taught below by Chari. [0080] “obtaining a label from the random forest classifier model that indicates whether or not the extracted features indicate use of the uninitialized variable represents a runtime error.”).

With further regard to Claim 1, Miller does not teach the following, however, Chari teaches:
wherein the executed code analysis is a hybrid code analysis to generate the internal analysis state ([0021] “features extracted from static analysis, dynamic analysis, or both static and dynamic analysis of malware samples,” wherein “hybrid code analysis” is described in the Applicant’s specification as a combination of static and dynamic analysis. [0028] “The features 130 are typical features used for malware determined using static analysis, dynamic analysis, or both static and dynamic analysis.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Miller with the hybrid code analysis as taught by Chari in order “to improve the accuracy of the malware clustering” (Chari [0020]).

With further regard to Claim 1, Miller in view of Chari does not teach the following, however, Bhatt teaches:
wherein the features are extracted via a trained machine learning model modified using transfer learning ([0002] “The idea of using ML[Machine Learning]-based automation systems has led to significant contributions to domain adaptation and transfer learning (DA/TL) techniques. The DA/TL techniques leverage knowledge, such as labeled data, from one or multiple source domains.” [0061] “FIG. 3 depicts a flowchart that illustrates a domain adaptation method of text classification based on learning of transferable feature representations from a source domain for a target domain”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Miller in view of Chari with the transfer learning as taught by Bhatt in order “to learn an accurate model for unlabeled data in a target domain” (Bhatt [0002]).

Claim 5:	
Miller in view of Chari and Bhatt teaches the system of claim 1 and Miller further teaches wherein the internal analysis state comprises an internal representation of code analysis results or generated alerts ([0064] “The syntax-type tree generator parses the source code program to generate a syntax-type tree for each method in the program (block 804),” wherein a “syntax-type tree” is “an internal representation of code analysis results”.).

Claim 8:	
Miller in view of Chari and Bhatt teaches the system of claim 1 and Miller further teaches wherein the machine learning classifier model comprises a support-vector machine, a neural network, a decision tree, or a random forest model ([0030] “In one aspect, the machine learning model is a random forest classifier. A random forest is an ensemble-based machine learning technique for classification.”).

Claim 9:	
Miller teaches a computer-implemented method, comprising:
receiving, via a processor, a source code sample to be classified (Fig. 9: Processor(s) 906. [0064] “A source code program or code snippet is selected for analysis by one or more of the trained random forest models (block 802).”);
executing, via the processor, a code analysis to generate an internal analysis state ([0064] “The syntax-type tree generator parses the source code program to generate a syntax-type tree for each method in the program (block 804).”);
extracting, via the processor, features from the internal analysis state ([0064] “The feature extraction component extracts the appropriate features from each syntax-type tree to form a feature vector for a respective model (block 806).”); and
generating, via the processor, a label based on the extracted features via a machine learning classifier model trained on internal analysis states of hybrid code analyses ([0065] “The feature vector is applied to each tree in the random forest for classification. A trained decision tree from the random forest is selected (block 808) and is tested against the trained and optimized parameters in each binary test in each node (block 810),” wherein the “hybrid code analyses” is taught below by Chari. [0080] “obtaining a label from the random forest classifier model that indicates whether or not the extracted features indicate use of the uninitialized variable represents a runtime error.”).

With further regard to Claim 9, Miller does not teach the following, however, Chari teaches:
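wherein the executed code analysis is a hybrid code analysis to generate the internal analysis state ([0021] “features extracted from static analysis, dynamic analysis, or both static and dynamic analysis of malware samples,” wherein “hybrid code analysis” is described in the Applicant’s specification as a combination of static and dynamic analysis. [0028] “The features 130 are typical features used for malware determined using static analysis, dynamic analysis, or both static and dynamic analysis.”).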
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Miller with the hybrid code analysis as taught by Chari in order “to improve the accuracy of the malware clustering” (Chari [0020]).

With further regard to Claim 9, Miller in view of Chari does not teach the following, however, Bhatt teaches:
wherein the features are extracted via a trained machine learning model modified using transfer learning ([0002] “The idea of using ML[Machine Learning]-based automation systems has led to significant contributions to domain adaptation and transfer learning (DA/TL) techniques. The DA/TL techniques leverage knowledge, such as labeled data, from one or multiple source domains.” [0061] “FIG. 3 depicts a flowchart that illustrates a domain adaptation method of text classification based on learning of transferable feature representations from a source domain for a target domain”).
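Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Miller in view of Chari with the transfer learning as taught by Bhatt in order “to learn an accurate model for unlabeled data in a target domain” (Bhatt [0002]).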

Claim 13:	
Miller in view of Chari and Bhatt teaches the computer-implemented method of claim 9 and Bhatt further teaches comprising training the machine learning classifier model, wherein training the machine learning classifier model comprises:
receiving, via a processor, a labeled source code sample ([0005] “wherein the received input data comprises labeled instances of the source domain”); and
wherein the features are extracted via a trained machine learning model modified using transfer learning ([0002] “The idea of using ML[Machine Learning]-based automation systems has led to significant contributions to domain adaptation and transfer learning (DA/TL) techniques. The DA/TL techniques leverage knowledge, such as labeled data, from one or multiple source domains.” [0061] “FIG. 3 depicts a flowchart that illustrates a domain adaptation method of text classification based on learning of transferable feature representations from a source domain for a target domain”).

With further regard to Claim 13, Miller further teaches wherein training the machine learning classifier model comprises:

extracting, via the processor, features from the one or more internal analysis states ([0064] “The feature extraction component extracts the appropriate features from each syntax-type tree to form a feature vector for a respective model (block 806).”); and
training, via the processor, the machine learning classifier model to generate a label based on the extracted features ([0023] “In the training phase 100, the source code extraction component 104 extracts source code programs 106 from a source code repository 102 to find suitable code snippets to train the machine learning model.” [0065] “The feature vector is applied to each tree in the random forest for classification. A trained decision tree from the random forest is selected (block 808) and is tested against the trained and optimized parameters in each binary test in each node (block 810).” [0080] “obtaining a label from the random forest classifier model that indicates whether or not the extracted features indicate use of the uninitialized variable represents a runtime error.”).

With further regard to Claim 13, Chari further teaches:
wherein the executed code analysis is a hybrid code analysis to generate the internal analysis state ([0021] “features extracted from static analysis, dynamic analysis, or both static and dynamic analysis of malware samples,” wherein “hybrid code analysis” is described in the Applicant’s specification as a combination of static and dynamic analysis. [0028] “The features 130 are typical features used for malware determined using static analysis, dynamic analysis, or both static and dynamic analysis.”).

Claim 14:	
Miller in view of Chari and Bhatt teaches the computer-implemented method of claim 9 and Miller further teaches wherein executing the hybrid code analysis comprises generating a graph dump per function of the source code sample and combining the graph dumps into a single binary file ([0030] “A random forest is an ensemble-based machine learning technique for classification. This technique is constructed using multiple decision trees that are trained to produce a probability representing a classification or label identifying the class that represents the mode of the classes of the decision trees,” wherein a “tree” is a type of “graph”. [0033] “This method of combining trees is an ensemble method. The individual decision trees are weak learners and the ensemble produces a strong learner. Decision trees can suffer from over-fitting which leads to poor generalization and a higher error rate. An ensemble of decision trees, such as a random forest, improves generalization.”).

Claim 15:	
Miller teaches a computer program product for classifying source code, the computer program product comprising a computer-readable storage medium having program code embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program code executable by a processor to cause the processor to ([0070] “A computing device 902 may include one or more processors 
receive a source code sample to be classified ([0064] “A source code program or code snippet is selected for analysis by one or more of the trained random forest models (block 802).”);
execute a code analysis to generate an internal analysis state ([0064] “The syntax-type tree generator parses the source code program to generate a syntax-type tree for each method in the program (block 804).”);
extract features from the internal analysis state ([0064] “The feature extraction component extracts the appropriate features from each syntax-type tree to form a feature vector for a respective model (block 806).”); and
generate a label based on the extracted features via a machine learning classifier model trained on internal analysis states of hybrid code analyses ([0065] “The feature vector is applied to each tree in the random forest for classification. A trained decision tree from the random forest is selected (block 808) and is tested against the trained and optimized parameters in each binary test in each node (block 810),” wherein the “hybrid code analyses” is taught below by Chari. [0080] “obtaining a label from the random forest classifier model that indicates whether or not the extracted features indicate use of the uninitialized variable represents a runtime error.”).

With further regard to Claim 15, Miller does not teach the following, however, Chari teaches:
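wherein the executed code analysis is a hybrid code analysis to generate the internal analysis state ([0021] “features extracted from static analysis, dynamic analysis, or both static and dynamic analysis of malware samples,” wherein “hybrid code analysis” is described in the Applicant’s specification as a combination of static and dynamic analysis. [0028] “The features 130 are typical features used for malware determined using static analysis, dynamic analysis, or both static and dynamic analysis.”).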
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product as disclosed by Miller with the hybrid code analysis as taught by Chari in order “to improve the accuracy of the malware clustering” (Chari [0020]).

With further regard to Claim 15, Miller in view of Chari does not teach the following, however, Bhatt teaches:
wherein the features are extracted via a trained machine learning model modified using transfer learning ([0002] “The idea of using ML[Machine Learning]-based automation systems has led to significant contributions to domain adaptation and transfer learning (DA/TL) techniques. The DA/TL techniques leverage knowledge, such as labeled data, from one or multiple source domains.” [0061] “FIG. 3 depicts a flowchart that illustrates a domain adaptation method of text classification based on learning of transferable feature representations from a source domain for a target domain”).
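Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product as disclosed by Miller in view of Chari with the transfer learning as taught by Bhatt in order “to learn an accurate model for unlabeled data in a target domain” (Bhatt [0002]).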

Claim 19:	
Miller in view of Chari and Bhatt teaches the computer program product of claim 15 and Bhatt further teaches comprising program code executable by the processor to:
receive a labeled source code sample ([0005] “wherein the received input data comprises labeled instances of the source domain”); and
wherein the features are extracted via a trained machine learning model modified using transfer learning ([0002] “The idea of using ML[Machine Learning]-based automation systems has led to significant contributions to domain adaptation and transfer learning (DA/TL) techniques. The DA/TL techniques leverage knowledge, such as labeled data, from one or multiple source domains.” [0061] “FIG. 3 depicts a flowchart that illustrates a domain adaptation method of text classification based on learning of transferable feature representations from a source domain for a target domain”).

With further regard to Claim 19, Miller further teaches comprising program code executable by the processor to:

extract features from the one or more internal analysis states ([0064] “The feature extraction component extracts the appropriate features from each syntax-type tree to form a feature vector for a respective model (block 806).”); and
train the machine learning classifier model to generate a label based on the extracted features ([0023] “In the training phase 100, the source code extraction component 104 extracts source code programs 106 from a source code repository 102 to find suitable code snippets to train the machine learning model.” [0065] “The feature vector is applied to each tree in the random forest for classification. A trained decision tree from the random forest is selected (block 808) and is tested against the trained and optimized parameters in each binary test in each node (block 810).” [0080] “obtaining a label from the random forest classifier model that indicates whether or not the extracted features indicate use of the uninitialized variable represents a runtime error.”).

With further regard to Claim 19, Chari further teaches:
wherein the executed code analysis is a hybrid code analysis to generate the internal analysis state ([0021] “features extracted from static analysis, dynamic analysis, or both static and dynamic analysis of malware samples,” wherein “hybrid code analysis” is described in the Applicant’s specification as a combination of static and dynamic analysis. [0028] “The features 130 are typical features used for malware determined using static analysis, dynamic analysis, or both static and dynamic analysis.”).

Claim 20:
Miller in view of Chari and Bhatt teaches the computer program product of claim 15 and Miller further teaches comprising program code executable by the processor to generate a graph dump per function of the source code sample and combine the graph dumps into a single binary file ([0030] “A random forest is an ensemble-based machine learning technique for classification. This technique is constructed using multiple decision trees that are trained to produce a probability representing a classification or label identifying the class that represents the mode of the classes of the decision trees,” wherein a “tree” is a type of “graph”. [0033] “This method of combining trees is an ensemble method. The individual decision trees are weak learners and the ensemble produces a strong learner. Decision trees can suffer from over-fitting which leads to poor generalization and a higher error rate. An ensemble of decision trees, such as a random forest, improves generalization.”).

Claims 2-3, 10-11 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Miller in view of Chari and Bhatt as applied to Claims 1, 9 and 15 above, and further in view of Apte et al. (US PGPUB 2017/0192758; hereinafter “Apte”).
Claim 2:	
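Miller in view of Chari and Bhatt teaches all the limitations of claim 1 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Apte teaches: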
wherein the processor is to execute a plurality of dynamic analysis iterations to construct a plurality of internal analysis states from which the features are extracted ([0037] “In TAT 20, dynamic code analysis is used to interpret (e.g., understand) the static and dynamic program calls to gather (e.g., collect) the application program flow and/or generate a call tree report. The call tree report gives the user detailed information about all the calls and called program(s) with screens, databases, and files used.” [0064] “Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated)”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Miller in view of Chari and Bhatt with the dynamic analysis iterations as taught by Apte in order “to confirm that all of the gaps [to be tested] are covered” (Apte [0064]).

Claim 3:	
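Miller in view of Chari, Bhatt and Apte teaches the system of claim 2 and Apte further teaches wherein each of the dynamic analysis iterations receives as input information generated by a previous dynamic analysis iteration ([0064] “BRE 22 ensures that all of the business rules are extracted and 100% of the source code coverage is achieved. Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated) to confirm that all of the gaps are covered.” [0124] “In BRE dynamic code analysis extracts all the branch path of the programs based on the test data provided to the simulated program execution. The objective is to make sure all the rules are extracted and 100% code coverage is achieved. This is an iterative process, when gaps are identified new test cases will be created for the identified gap and re-executed to confirm all the gaps are covered.”).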

Claim 10:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 9 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Apte teaches:
comprising executing a plurality of dynamic analysis iterations to construct a plurality of internal analysis states from which the features are extracted ([0037] “In TAT 20, dynamic code analysis is used to interpret (e.g., understand) the static and dynamic program calls to gather (e.g., collect) the application program flow and/or generate a call tree report. The call tree report gives the user detailed information about all the calls and called program(s) with screens, databases, and files used.” [0064] “Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated)”).
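Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Miller in view of Chari and Bhatt with the dynamic analysis iterations as taught by Apte in order “to confirm that all of the gaps [to be tested] are covered” (Apte [0064]).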

Claim 11:	
Miller in view of Chari, Bhatt and Apte teaches the method of claim 10 and Apte further teaches wherein executing the plurality of dynamic analysis iterations comprises receiving as input information generated by a previous dynamic analysis iteration ([0064] “BRE 22 ensures that all of the business rules are extracted and 100% of the source code coverage is achieved. Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated) to confirm that all of the gaps are covered.” [0124] “In BRE dynamic code analysis extracts all the branch path of the programs based on the test data provided to the simulated program execution. The objective is to make sure all the rules are extracted and 100% code coverage is achieved. This is an iterative process, when gaps are identified new test cases will be created for the identified gap and re-executed to confirm all the gaps are covered.”).

Claim 16:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 15 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Apte teaches:
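comprising program code executable by the processor to execute a plurality of dynamic analysis iterations to construct a plurality of internal analysis states from which the features are extracted ([0037] “In TAT 20, dynamic code analysis is used to interpret (e.g., understand) the static and dynamic program calls to gather (e.g., collect) the application program flow and/or generate a call tree report. The call tree report gives the user detailed information about all the calls and called program(s) with screens, databases, and files used.” [0064] “Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated)”).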
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product as disclosed by Miller in view of Chari and Bhatt with the dynamic analysis iterations as taught by Apte in order “to confirm that all of the gaps [to be tested] are covered” (Apte [0064]).

Claim 17:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 15 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Apte teaches:
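comprising program code executable by the processor to receive as input information generated by a previous dynamic analysis iteration perform dynamic analysis on the input information ([0064] “BRE 22 ensures that all of the business rules are extracted and 100% of the source code coverage is achieved. Dynamic code analysis of BRE 22 can be an iterative process, when gaps in the source code are identified, new test cases are created for the identified gap and re-executed (e.g., re-simulated) to confirm that all of the gaps are covered.” [0124] “In BRE dynamic code analysis extracts all the branch path of the programs based on the test data provided to the simulated program execution. The objective is to make sure all the rules are extracted and 100% code coverage is achieved. This is an iterative process, when gaps are identified new test cases will be created for the identified gap and re-executed to confirm all the gaps are covered.”).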
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product as disclosed by Miller in view of Chari and Bhatt with the dynamic analysis iterations as taught by Apte in order “to confirm that all of the gaps [to be tested] are covered” (Apte [0064]).

Claims 4, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Miller in view of Chari and Bhatt as applied to Claims 1, 9 and 15 above, and further in view of King et al. (US PGPUB 2016/0085524; hereinafter “King”).
Claim 4:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 1 as described above. Miller in view of Chari and Bhatt does not teach the following, however, King teaches:
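wherein the processor is to execute the hybrid code analysis with various time limits to generate a plurality of internal analysis states ([0028] “FIG. 3 is a flow diagram that illustrates the processing of the dynamic analysis component to execute and discover code.” [0030] “Continuing in decision block 350, if the component has reached a limit while executing the file, then the component continues at block 360, else the component loops to block 310 to continue executing the file. The system may define several types of limits that place an upper bound on execution time of a particular file. These limits may include time-based limits (e.g., a threshold execution time).” [0031] “in block 360, the component marks the analysis of the file as incomplete.”).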
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Miller in view of Chari and Bhatt with the analysis time limit as taught by King since “This allows the system to provide as much information as possible within a bounded execution time” (King [0007]).

Claim 12:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 9 as described above. Miller in view of Chari and Bhatt does not teach the following, however, King teaches:
comprising executing the hybrid code analysis with various time limits to generate a plurality of internal analysis states ([0028] “FIG. 3 is a flow diagram that illustrates the processing of the dynamic analysis component to execute and discover code.” [0030] “Continuing in decision block 350, if the component has reached a limit while executing the file, then the component continues at block 360, else the component loops to block 310 to continue executing the file. The system may define several types of limits that place an upper bound on execution time of a particular file. These limits may include time-based limits (e.g., a threshold execution time).” [0031] “in block 360, the component marks the analysis of the file as incomplete.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as disclosed by Miller in view of Chari and Bhatt with the analysis time limit as taught by King since “This allows the system to provide as much information as possible within a bounded execution time” (King [0007]).

Claim 18:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 15 as described above. Miller in view of Chari and Bhatt does not teach the following, however, King teaches:
comprising program code executable by the processor to execute the hybrid code analysis with various time limits to generate a plurality of internal analysis states ([0028] “FIG. 3 is a flow diagram that illustrates the processing of the dynamic analysis component to execute and discover code.” [0030] “Continuing in decision block 350, if the component has reached a limit while executing the file, then the component continues at block 360, else the component loops to block 310 to continue executing the file. The system may define several types of limits that place an upper bound on execution time of a particular file. These limits may include time-based limits (e.g., a threshold execution time).” [0031] “in block 360, the component marks the analysis of the file as incomplete.”).
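Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product as disclosed by Miller in view of Chari and Bhatt with the analysis time limit as taught by King since “This allows the system to provide as much information as possible within a bounded execution time” (King [0007]).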

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Miller in view of Chari and Bhatt as applied to Claim 1 above, and further in view of Mcallister et al. (US PGPUB 2020/0082094; hereinafter “Mcallister”).
Claim 6:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 1 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Mcallister teaches:
wherein the machine learning classifier model is trained to predict a vulnerability score per line of code ([0064] “classification models may be trained on labeled layers in a training set, and the pattern may be deemed matched upon a designated classification being indicated after inputting the layer at issue into the trained classification model.” [0077] “Such risk scores may be based, for example, on a count of the number of potential vulnerabilities detected.” [0110] “the annotation includes information pertaining to several security vulnerabilities, examples including classifications, rankings, scores, or other metrics based on attributes of the vulnerabilities, such as classifying the line of code as unsecure based upon a number of security vulnerabilities having risk scores above some value”).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Miller in view of Chari and Bhatt as applied to Claim 1 above, and further in view of Kerley et al. (US PGPUB 2010/0100693; hereinafter “Kerley”).
Claim 7:	
Miller in view of Chari and Bhatt teaches all the limitations of claim 1 as described above. Miller in view of Chari and Bhatt does not teach the following, however, Kerley teaches:
wherein the machine learning classifier model is trained to receive input alerts and output new alerts ([0050] “A monitoring system 1 of the invention processes the alerts received from the external detection systems. It comprises an adapter 2 which receives the alerts. It in turn feeds an alert classifier 3.” [0051] “The output of the system 1 is an output alert comprising a linking of a source alert with another source alert or with an historical event.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system as disclosed by Miller in view of Chari and Bhatt with the input and output alerts as taught by Kerley in order “to ensure that [the alerts] meet specific data quality tests” (Kerley [0054]).

With further regard to Claim 7, Miller further teaches wherein the model utilizes data in the vector format ([0028] “There are several features extracted for each type of runtime error and these features are combined into a feature vector. A portion of the feature vectors 114 generated for a type of runtime error is used as training data for the model generation component 116 to train a model 118 and another portion of the feature vectors 114 can be used by the model generation component 116 to test the model 118.”).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOANNE GONZALES MACASIANO whose telephone number is (571)270-7749.  The examiner can normally be reached on Monday to Thursday, 10:30 AM to 6:00 PM Eastern Standard Time.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dennis Chow, can be reached on (571)272-7767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 

/J.G.M/
Examiner, Art Unit 2194      
	
/DOON Y CHOW/
Supervisory Patent Examiner, Art Unit 2194