Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to an application filed 2/26/20.
Claims 1-20 are pending.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 8-13 and 15-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite performing an analysis of source code, which constitutes a mental process.
Claim 1 recites:
determining, by a processor, a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts comprising source code corresponding to a software project; 
determining a plurality of context/keyword pair sets based on the plurality of keywords and the corpus of programming artifacts, wherein each context/keyword pair set of the plurality of context/keyword pair sets comprises a first keyword, a second keyword, and a context type corresponding to a co-occurrence of the first keyword and the second keyword in the corpus of programming artifacts; and 
constructing a word embedding matrix based on the plurality of context/keyword pair sets.
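
For illustration only, the three recited steps can be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of the applicant's implementation: the toy corpus, the function names, the use of the statement type as the context type, and the use of raw co-occurrence counts as a stand-in for a trained word embedding matrix are all hypothetical.

```python
import keyword
import re
from itertools import combinations

# Hypothetical toy corpus: (statement type, source line) pairs.
CORPUS = [
    ("assignment", "order_total = item_price * item_count"),
    ("if", "if order_total > credit_limit:"),
    ("call", "send_invoice(order_total, customer_id)"),
]


def tokenize(line):
    """Split each identifier on underscores, lower-case it, and drop Python reserved words."""
    tokens = []
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", line):
        if keyword.iskeyword(identifier):
            continue
        tokens.extend(part.lower() for part in identifier.split("_") if part)
    return tokens


def determine_keywords(corpus):
    """Step 1 (assumed form): every distinct token in the corpus is treated as a keyword."""
    return sorted({tok for _, line in corpus for tok in tokenize(line)})


def build_pair_sets(corpus, keywords):
    """Step 2: (first keyword, second keyword, context type) for each co-occurrence in a statement."""
    vocab = set(keywords)
    pair_sets = []
    for context_type, line in corpus:
        present = sorted({tok for tok in tokenize(line) if tok in vocab})
        for first, second in combinations(present, 2):
            pair_sets.append((first, second, context_type))
    return pair_sets


def build_matrix(keywords, pair_sets):
    """Step 3 (stand-in): one row per keyword, one column per (context type, co-occurring keyword)."""
    columns = sorted({(ctx, kw) for a, b, ctx in pair_sets for kw in (a, b)})
    row_of = {kw: i for i, kw in enumerate(keywords)}
    col_of = {col: j for j, col in enumerate(columns)}
    matrix = [[0] * len(columns) for _ in keywords]
    for first, second, ctx in pair_sets:
        matrix[row_of[first]][col_of[(ctx, second)]] += 1
        matrix[row_of[second]][col_of[(ctx, first)]] += 1
    return columns, matrix


keywords = determine_keywords(CORPUS)
pair_sets = build_pair_sets(CORPUS, keywords)
columns, matrix = build_matrix(keywords, pair_sets)
```

In practice a count matrix of this kind would typically be factorized or used to train dense vectors; the sketch stops at the counts so that each recited step stays visible.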


This judicial exception is not integrated into a practical application because the “processor” is only recited at a high level of generality (i.e., as a generic processor performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application. Thus, the claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above, the use of a generic “processor” to perform the determining and constructing steps amounts to no more than an instruction to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Thus, the claim is not patent eligible.
Claims 2-6 recite details regarding how the terms (i.e. tokens) and portion (i.e. context) are identified and introduce “ranking” and “selecting” steps. These additional steps also cover performance of the limitations in the mind and thus are directed to an abstract mental process.
Claim 8 recites a system comprising “a memory” and “one or more processors” for performing the method of claim 1 and thus, but for the recitation of “a memory” and “one or more processors”, is also directed to an abstract idea.
The additional elements of “a memory” and “one or more processors” are only recited at a high level of generality (i.e., as generic computer components performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, for the reasons discussed above, these additional elements do not integrate the abstract idea into a practical application, or recite an inventive concept. Thus, the claim is not patent eligible.
Claims 9-13 recite language similar to that of claims 2-6 and are thus similarly directed to an abstract idea.
Claim 15 recites “a computer readable storage medium having program instructions” for performing the steps of claim 1 and thus, but for the recitation of “a computer readable storage medium”, is also directed to an abstract idea.
The additional element of “a computer readable storage medium” is only recited at a high level of generality (i.e., as a generic computer component performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, for the reasons discussed above, this additional element does not integrate the abstract idea into a practical application, or recite an inventive concept. Thus, the claim is not patent eligible.
Claims 16-20 recite language similar to that of claims 2-6 and are thus similarly directed to an abstract idea.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6, 8-11, 13, 15-18 and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US 10,901,708 to Reas et al. (Reas).

Claims 1, 8 and 15: Reas discloses a computer-implemented method comprising: 
determining, by a processor, a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts comprising source code corresponding to a software project (col. 8, lines 42-43 “obtaining a plurality of code files … generating an abstract syntax tree”); 

constructing a word embedding matrix based on the plurality of context/keyword pair sets (col. 9, lines 1-3 “generating one or more word embeddings using the co-occurrence”).

Claims 2, 9 and 16: Reas discloses claims 1, 8 and 15, wherein determining the plurality of keywords comprises:
determining a naming convention of the corpus of programming artifacts (col. 5, lines 54-58 “tokens following code conventions (e.g., camelCase can be divided into sub tokens: camel and case)”, note that determining the particular convention is necessary for this step); 
determining a plurality of tokens based on the determined naming convention (col. 5, lines 53-54 “embeddings can be trained at the sub-token level … (e.g., camelCase can be divided into sub tokens: camel and case)”);
constructing a manifest feature vector based on the plurality of tokens and the corpus of programming artifacts (col. 5, lines 57-58 “The resulting subtokens can be linked to the parent token through a node type or path”, col. 4, lines 43-46 “the pair of tokens … of each path”); 

selecting a subset of the plurality of tokens as keywords based on the manifest feature vector (e.g. col. 8, lines 55-64 “generating the co-occurrence matrix using the number of times each pair of terminal node values co-occur”, also see e.g. col. 9, lines 4-9 “filtering the plurality of paths”).

Claims 3, 10 and 17: Reas discloses claims 2, 9 and 16, wherein the naming convention comprises one of camel case, kebab case, and snake case (col. 5, lines 54-58 “camelCase”, note that kebab and snake case were well known conventions and thus would at least have been obvious over Reas’ discussion of naming conventions in general).
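
For illustration only, the three named conventions can be detected and split into sub-tokens with a few lines of code. This is a minimal sketch under stated assumptions: the function names `detect_convention` and `split_identifier` and the detection heuristics are hypothetical, and the sketch is not taken from Reas or from the application.

```python
import re


def detect_convention(identifier):
    """Guess the naming convention of a single identifier (simple heuristic)."""
    if "_" in identifier:
        return "snake_case"
    if "-" in identifier:
        return "kebab-case"
    if re.search(r"[a-z0-9][A-Z]", identifier):
        return "camelCase"
    return "unknown"


def split_identifier(identifier):
    """Split an identifier into lower-case sub-tokens according to its detected convention."""
    convention = detect_convention(identifier)
    if convention == "snake_case":
        parts = identifier.split("_")
    elif convention == "kebab-case":
        parts = identifier.split("-")
    elif convention == "camelCase":
        # A capitalized or lower-case word, or a run of capitals such as an acronym.
        parts = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", identifier)
    else:
        parts = [identifier]
    return [part.lower() for part in parts if part]
```

For example, `split_identifier("camelCase")` yields the sub-tokens `camel` and `case`, matching the example quoted from Reas above.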

Claims 4, 11 and 18: Reas discloses claims 1, 8 and 15, wherein the context type corresponds to a type of a statement in the source code of the corpus of programming artifacts, wherein the first keyword and the second keyword co-occur in the statement (col. 7, line 67-col. 8, line 3 “the link form “id2” in the for loop to “true” in the if statement … tracing a path form the terminal node “id2” in the “foreach” node to the terminal node in the “if” node”, also see Figs. 4-5).

Claims 6, 13 and 20: Reas discloses claims 1, 8 and 15, wherein the context type corresponds to a common prefix or suffix of the first keyword and the second keyword in the corpus of programming artifacts.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over US 10,901,708 to Reas et al. (Reas) in view of US 2014/0282373 to Garza (Garza).

Claims 5, 12 and 19: Reas discloses claims 1, 8 and 15, wherein the context type corresponds to a business rule corresponding to the corpus of programming artifacts, wherein the first keyword and the second keyword co-occur in the business rule.



Garza teaches discovering a business rule from an AST (e.g. par. [0038] “business rules may be extracted from the abstract syntax trees (AST)”).

It would have been obvious at the time of filing to determine context/keyword pair sets wherein the first and second keywords co-occur in a business rule (Garza par. [0038] “business rules may be extracted from the abstract syntax trees (AST)”, Reas col. 8, line 53 “identifying pairs of terminal node values … of each of the plurality of paths”). Those of ordinary skill in the art would have been motivated to do so to further understanding of the code (see e.g. Garza par. [0021] “speed up development, reduce costs, and decrease human error … allow business analysts to verify and update business rules”, Reas col. 6, lines 40-46 “analyze the code, e.g., to identify potential errors”).

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over US 10,901,708 to Reas et al. (Reas) in view of “Enriching Word Vectors with Subword Information” by Bojanowski et al. (Bojanowski).

Claims 7 and 14: Reas discloses claims 1 and 8, wherein constructing the word embedding matrix based on the plurality of context/keyword pair sets comprises:

stacking the latent embedding matrix with a manifest feature vector corresponding to the corpus of programming artifacts to construct the word embedding matrix (col. 4, lines 63-66 “the co-occurrence matrix 118 can be input to embeddings generator 120”).

Reas does not explicitly disclose:
wherein the word embedding matrix is used to train a recurrent neural network (RNN) to process source code.

Bojanowski teaches a word embedding matrix used to train a recurrent neural network (pg. 8, col. 1, last partial par. “Our model is a recurrent neural network”).

It would have been obvious at the time of filing to use the word embedding matrix (Reas col. 4, lines 62-66 “training word embeddings”) to train a recurrent neural network (Bojanowski pg. 8, col. 1, last partial par. “Our model is a recurrent neural network”). Those of ordinary skill in the art would have been motivated to do so as a known means of training a model which would have produced only the expected results (see e.g. Reas col. 4, lines 62-66 “implement one or more unsupervised NLP techniques”). 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
“Semantic Source Code Models Using Identifier Embeddings” by Efstathiou et al. discloses alternate methods of constructing a word embedding matrix based on a corpus of programming artifacts.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D MITCHELL whose telephone number is (571)272-3728. The examiner can normally be reached Monday through Thursday 7:00am - 4:30pm and alternate Fridays 7:00am - 3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached at (571)272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for 





/JASON D MITCHELL/
Primary Examiner, Art Unit 2199