Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action is responsive to the (preliminary) application filed on 8/23/2019. Claims 1 and 16 are independent. Claim 13 is canceled. Claims 1-12 and 14-27 are currently pending.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/14/2021 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

	Claims 1-12 and 14-27 are rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. (US 20190138731 A1), hereinafter Tan, in view of Klaiman et al. (US 20210049443 A1), hereinafter Klaiman, further in view of Shen et al. (US 20210035556 A1), hereinafter Shen.

	Regarding claims 1, 16 and 27, Tan teaches [a] method (para. 0002) for (of) automatically detecting a security vulnerability in a source code using a machine learning model (para. 0006, use deep learning to generate new semantic features to help build more accuracy security vulnerability prediction models), wherein the method comprises:
obtaining the source code from a client codebase, wherein the client codebase is a complete or an incomplete body of the source code for a given software program or 
parsing the source code into an abstract syntax tree (AST) (para. 0012, Abstract Syntax Tree (AST) nodes from the set of training code);
	using a machine learning (ML) model to perform a ML based analysis on abstract syntax tree (AST) for detecting a first security vulnerability over a static source code (para. 0006, 0012, 0066, determining defects and security vulnerabilities in software code; extracting Abstract Syntax Tree (AST) nodes from the set of training code as tokens; defect prediction models using different machine learning classifiers were used including, but not limited to, ADTree, Naive Bayes, and Logistic Regression; obtaining tokens includes extracting syntactic information from the set of training code. In yet another aspect, extracting syntactic information includes extracting Abstract Syntax Tree (AST) nodes from the set of training code as tokens. In yet a further aspect, generating a DBN includes training the DBN), the machine learning based analysis comprising:
	flattening the abstract syntax tree (AST) into a sequence of structured tokens, wherein the sequence of structured tokens comprises a semantic structure and a syntactic structure of the source code (para. 0012, obtaining tokens includes extracting syntactic information from the set of training code. In yet another aspect, extracting syntactic information includes extracting Abstract Syntax Tree (AST) nodes from the set of training code as tokens. In yet a further aspect, generating a DBN includes training 
implementing a natural language processing technique on the sequence of structured tokens for mapping the sequence of structured tokens to one or more integers (para. 0012 and 0013, mapping between integer vectors and the tokens),
Tan does not explicitly disclose wherein the natural language processing comprises a Byte Pair Encoding (BPE). However, in an analogous art, Klaiman teaches wherein the natural language processing comprises a Byte Pair Encoding (BPE) (para. 0007, 0012, 0033 and 0048, applying byte-pair-encoding to segment the text 155 into tokens that corresponds to partial words as well as full words. Moreover, the classification engine 110 may preprocess the text 155 by at least embedding the tokens to form a matrix representation of the text 155. The matrix representation of the text 155 may include multiple vectors, each of which corresponding to one of the tokens from the text 155. Embedding a token may therefore include transforming each token included in the text 155 to form a corresponding vector representation.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tan and Klaiman because BPE segments the text into both partial words and full words to produce a better training set of token data (Klaiman, para. 0033).
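For illustration only, the sequence of steps discussed above (parsing source code into an AST, flattening the AST into tokens, segmenting the tokens with Byte Pair Encoding, and mapping the resulting sub-tokens to integers) can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of the implementation of the applicant, Tan, or Klaiman: it uses Python's standard `ast` module as a stand-in parser, and the function names `flatten_ast`, `merge_pair`, `learn_bpe`, `apply_bpe` and `encode` are hypothetical names chosen for this example.

```python
import ast
from collections import Counter


def flatten_ast(source):
    """Parse Python source and flatten its AST into a pre-order token list."""
    tokens = []

    def visit(node):
        tokens.append(type(node).__name__)      # syntactic token: node type
        if isinstance(node, ast.Name):
            tokens.append(node.id)              # identifier used in an expression
        elif isinstance(node, ast.FunctionDef):
            tokens.append(node.name)            # name of the defined function
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(words, num_merges):
    """Learn BPE merges by repeatedly fusing the most frequent adjacent pair."""
    corpus = [list(w) for w in words]           # start from single characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break
        # Ties are broken by the pair itself so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        corpus = [merge_pair(symbols, best) for symbols in corpus]
    return merges


def apply_bpe(word, merges):
    """Segment one token into sub-tokens by replaying the learned merges."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


def encode(tokens, merges):
    """Map each BPE sub-token to an integer id, assigned in order of first use."""
    vocab, ids = {}, []
    for tok in tokens:
        for sub in apply_bpe(tok, merges):
            ids.append(vocab.setdefault(sub, len(vocab)))
    return ids, vocab
```

In this sketch, calling `flatten_ast` on a source string, `learn_bpe` on the resulting tokens, and `encode` on the tokens and merges yields an integer sequence in which frequently co-occurring character pairs share a single id, so that both partial words and full words can appear as sub-tokens.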
The combination of Tan and Klaiman does not explicitly disclose pre-training the machine learning model using an unlabeled source code as an input to predict a subsequent sub-token in the sequence of structured tokens, and training the machine 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tan, Klaiman 

	Regarding claims 2 and 17, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 1 and 16, as described above. Tan further teaches wherein the method comprises detecting a second security vulnerability before compilation of the source code by performing a static analysis on a vectorized call graph (para. 0061-0063, tuning of parameters in order to improve the detection of bugs [finding more bugs]).

	Regarding claims 3 and 18, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 2 and 17, as described above. Tan further teaches wherein the method comprises detecting a third security vulnerability during the compilation of the source code by performing a library analysis on the vectorized call graph (para. 0061-0063, tuning of parameters in order to improve the detection of bugs [finding more bugs]).

	Regarding claims 4 and 19, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 3 and 18, as described above. Tan further teaches wherein the method comprises performing, using the machine learning model, a post-analysis on the first security vulnerability, the second security vulnerability, and the third 

	Regarding claims 5 and 20, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 1 and 16, as described above. Tan further teaches wherein the method comprises generating a database with the source code and its associated metadata, wherein the source code comprises the unlabeled source code and the labeled source code (para. 0093, For labelling security vulnerability data, vulnerabilities which were recorded in National Vulnerability Database (NVD) are collected. Specifically, all the source of vulnerability reports of a project recorded in NVD are collected. Usually, a vulnerability report contains a bug report recorded in BTS. After a CVE is linked to a bug report, the security vulnerability data can be labelled).

	Regarding claims 6 and 21, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 1 and 16, as described above. Tan further teaches wherein the abstract syntax tree (AST) is a tree representation of an abstract syntactic structure of the source code written in a programming language (para. 0085, an ADTree based explanation generator for general defect prediction models with traditional source code metrics. More specifically, a decision tree (ADTree) classifier model is generated or built using history data with general traditional source code metrics. The ADTree classifier assigns each metric a weight and adds up the weights of all metrics of a change. For example, if a change contains a function call sequence, i.e. A->B->C, then it may receive a weight of 0.1 according to the ADTree model. If this sum of weights is 

	Regarding claims 7 and 22, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 3 and 18, as described above. Tan further teaches wherein the method comprises generating a call graph by integrating the abstract syntax tree (AST) with a control and a dataflow of the source code, wherein the call graph represents calling relationships between subroutines in a computer program (para. 0085, an ADTree based explanation generator for general defect prediction models with traditional source code metrics. More specifically, a decision tree (ADTree) classifier model is generated or built using history data with general traditional source code metrics. The ADTree classifier assigns each metric a weight and adds up the weights of all metrics of a change. For example, if a change contains a function call sequence, i.e. A->B->C, then it may receive a weight of 0.1 according to the ADTree model. If this sum of weights is over a threshold, the input data (i.e. a source code file, a commit, or a change) is predicted buggy. The disclosure may interprets the predicted buggy instance with metrics that have high weights. In addition, for better presenting the confidence of the generated explanations, the method also shows the X-out-of-Y 

	Regarding claims 8 and 23, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 7 and 22, as described above. Tan further teaches wherein the method comprises implementing an embedded technique on the call graph to generate the vectorized call graph (FIG. 9 and para. 0057, source files (or a set of training code) are parsed to obtain tokens. Using these tokens, vectors of AST nodes are then encoded. Semantic features are then generated based on the tokens and then defect prediction can be performed).
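For illustration only, the notions of a call graph (calling relationships between subroutines, as recited in claims 7 and 22) and of a vectorized call graph (claims 8 and 23) can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of the implementation of the applicant or of Tan: it uses Python's standard `ast` module, it records only direct calls by plain name between top-level functions, it omits the control flow and dataflow integration recited in claims 7 and 22, and it uses simple adjacency rows as a stand-in for an embedding technique. The function names `build_call_graph` and `vectorize` are hypothetical.

```python
import ast


def build_call_graph(source):
    """Map each top-level function name to the sorted names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for sub in ast.walk(node):
                # Only calls written as a plain name, e.g. b(), are recorded.
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    callees.add(sub.func.id)
            graph[node.name] = sorted(callees)
    return graph


def vectorize(graph):
    """Encode the graph as adjacency rows: rows[i][j] is 1 if i calls j."""
    names = sorted(graph)
    index = {name: i for i, name in enumerate(names)}
    rows = []
    for caller in names:
        row = [0] * len(names)
        for callee in graph[caller]:
            if callee in index:                 # ignore calls to unknown names
                row[index[callee]] = 1
        rows.append(row)
    return names, rows
```

Each row produced by `vectorize` is a fixed-length numeric vector for one function, which is the property that makes a graph usable as input to a machine learning model; a learned embedding would replace the adjacency rows in practice.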

	Regarding claims 9 and 24, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 4 and 19, as described above. Tan further teaches wherein the method comprises providing the final security vulnerability on an expert device for receiving a first input from a security expert (para. 0072, given input data such as a source code file, a commit, or a change, if the input data is declared buggy (i.e. contains software bugs or security vulnerabilities), the method of the disclosure may further scan the source code of this predicted buggy instance for common software bug and vulnerability patterns. In its declaration, a check is performed to determine the location of the predicted bugs within the code and the reason why they are considered bugs).

	Regarding claims 10 and 25, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 9 and 24, as described above. Tan further teaches wherein the method comprises processing the first input on the final security vulnerability, wherein the first input comprises a feedback associated with the final security vulnerability (para. 0033, the DBN enables the network to reconstruct the input data using generated features by adjusting weights between nodes in different layers).

	Regarding claims 11 and 26, the combination of Tan, Klaiman and Shen teaches all of the limitations of claims 9 and 24, as described above. Tan further teaches wherein the method comprises providing the first input on the final security vulnerability as training data to train the machine learning model and to improve an accuracy of the prediction of a presence of security vulnerabilities within the source code (para. 0063, the tuning of parameters in order to improve the detection of bugs).

	Regarding claim 12, the combination of Tan, Klaiman and Shen teaches all of the limitations of claim 4, as described above. Tan further teaches wherein the method comprises providing the final security vulnerability to a user on a user device (para. 0013, report of the software defects and vulnerabilities is displayed).

	Regarding claim 14, the combination of Tan, Klaiman and Shen teaches all of the limitations of claim 1, as described above. Tan further teaches wherein the source code comprises at least one of a method, a class, a package or variable names along with comments and string literals (para. 0055, defects may be collected from a bug 

	Regarding claim 15, the combination of Tan, Klaiman and Shen teaches all of the limitations of claim 3, as described above. Tan further teaches the library analysis is performed using a software component analysis tool (para. 0083 and 0084, the system uses these checkers to scan the predicted buggy code snippets. It is determined that there is a match between a buggy code snippet and a checker [software component analysis tool], if any violations to the checker is reported on the buggy code snippet).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHU CHUN GAO whose telephone number is (571)270-5999. The examiner can normally be reached on Monday - Thursday 6:00-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KRISTINE KINCAID, can be reached on 571-272-4063. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/SHU CHUN GAO/ 	Examiner, Art Unit 2437 


/MATTHEW SMITHERS/           Primary Examiner, Art Unit 2437