Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office action is in response to the amendment filed on June 10, 2022.
Claims 1-20 are pending. 
Response to Arguments
Applicant’s arguments filed on June 10, 2022 have been fully considered, but they are not persuasive.
In the Remarks, Applicant argues:
Smith is not directed to tokenizing, by the one or more processors, each of the identified function names and identified class names. It is beyond the broadest reasonable interpretation to place Smith in the same space as this limitation. Further, a person having skill in the art would not have viewed Smith and Gupta at the time of filing and combined them. See Remarks at pg. 2.
Examiner’s response:
Examiner respectfully disagrees. First, the tokenizing action is not defined in Applicant’s specification. Therefore, it is interpreted as a common and routine task in the computer field: essentially, splitting a phrase, sentence, paragraph, or an entire text document into smaller units, such as individual words or terms. Each of these smaller units is called a token.
Second, Smith clearly discloses tokenizing, by the one or more processors, each of the identified function names and identified class names. More specifically, at Fig. 3 and ¶ 61, Smith discloses that Scanner 324 may accept as input the source code 310 and split expressions and language statements into tokens (i.e. by the scanner’s tokenizing action) that can be processed by the parser 326 to determine the grammatical structure of a program. A token may be a single element of a programming language such as a constant, identifier (e.g., function names and class names), operator, separator, reserved word, or other elements. And in Fig. 5A and ¶ 58, Smith uses an example to illustrate the tokenizing action: if the preceding tokens are a variable name and a “.” then in some languages the following token must be a method or variable of the class of the preceding variable. The tokenizing action disclosed by Smith is commensurate with Applicant’s scope of the claim as shown in Spec. ¶ 21: For example, a function named: read_file(file_name) may be tokenized to the following [Read, File]. 
Third, Applicant is required to provide specific counterarguments as to why Smith is not directed to tokenizing, by the one or more processors, each of the identified function names and identified class names. In this regard, Applicant's argument fails to comply with 37 CFR 1.111(b) because it amounts to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Therefore, for at least the reasons set forth above, the rejections made under 35 U.S.C. § 103 with respect to claims 1, 8, and 15 are proper and are therefore maintained.
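For illustration only, the ordinary meaning of tokenizing relied upon above can be sketched in a few lines of Python. This is a minimal sketch and does not reproduce Smith's scanner 324 or Applicant's implementation; the function names tokenize_text and tokenize_identifier, the regular expressions, and the capitalization step are assumptions chosen so that the output matches the example given in Spec. ¶ 21.

```python
import re


def tokenize_text(text: str) -> list[str]:
    """Split free text into word tokens (hypothetical helper)."""
    return re.findall(r"\w+", text)


def tokenize_identifier(name: str) -> list[str]:
    """Split a function or class name into capitalized word tokens.

    Hypothetical helper: splits on underscores (snake_case) and on
    case transitions (camelCase / PascalCase).
    """
    tokens = []
    for part in name.split("_"):
        # Acronym runs, ordinary words, and digit runs, in that order.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.capitalize() for t in tokens]


print(tokenize_text("split a phrase into words"))
print(tokenize_identifier("read_file"))  # -> ['Read', 'File']
```

Under these assumptions, the function name read_file yields the tokens [Read, File], which is the result described in Spec. ¶ 21.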

With respect to the remaining dependent claims, Applicant merely reiterates the argument made regarding claims 1, 8, and 15 and asserts that any additional references cited by Examiner fail to resolve the alleged deficiencies in the rejections of the independent claims (see Remarks at pg. 2). Applicant’s arguments are unpersuasive for the same reasons articulated above with respect to claims 1, 8, and 15.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 5, 7-8, 12, 14-15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over US 2020/0097261 (hereinafter “Smith”) in view of US 2019/0228319 (hereinafter “Gupta”).
In the following claim analysis, Applicant’s claim limitations are shown boldfaced and Examiner’s explanations/notes/remarks are in square brackets and emphases are underlined.

Referring to Claim 1, Smith discloses:
a computer-implemented method for code refactor renaming (Smith, Abstract), the method comprising:
receiving, by one or more processors, a source code dataset, wherein the dataset is comprised of a plurality of functions and a plurality of classes (Smith, Fig. 3, ¶ 61, Scanner 324 may accept as input the source code 310 and split expressions and language statements [including classes] into tokens); 
identifying, by the one or more processors, function names and class names from the plurality of functions and the plurality of classes of the source code dataset (Smith, Fig. 5A, ¶ 58, if the preceding tokens are a variable name and a “.” then in some languages the following token must be a method or variable of the class of the preceding variable … The beam search may open up a look up table to identify all the methods and variables of the class; ¶ 97, the keywords may comprise the text of a function name); 
tokenizing, by the one or more processors, each of the identified function names and identified class names (Smith, Fig. 3, ¶ 61, Scanner 324 may accept as input the source code 310 and split expressions and language statements into tokens that can be processed by the parser 326 to determine the grammatical structure of a program. A token may be a single element of a programming language such as a constant, identifier [e.g., function names and class names], operator, separator, reserved word, or other elements; Fig. 5A, ¶ 58, if the preceding tokens are a variable name and a “.” then in some languages the following token must be a method or variable of the class of the preceding variable … The beam search may open up a look up table to identify all the methods and variables of the class; ¶ 97, the keywords may comprise the text of a function name); 
generating, by the one or more processors, features for the source code of the plurality of functions and the plurality of classes (Smith, Fig. 5C, ¶ 108, analyze or identify a set of features to determine code snippets …  Features that may be used include …  an identifier of the current function … and other features; ¶ 121,  the automatic refactoring that is detected is the renaming of a symbol, such as an identifier of a variable, function, class, or other entity);  
training, by the one or more processors, a machine learning model through regression to map the combined features to the corresponding tokenized function names (Smith, ¶ 57, machine learning model 20 … The probability of the completion may be computed based on inputting the existing code before the completion plus the proposed completion into the language model and receiving as output a probability value … to search across multiple candidate next tokens by trying a plurality of tokens and computing their probability … based on inputting the tokens into the language model; ¶ 44, the machine learning model 200 output may be, for example, a numerical value in the case of regression or an identifier of a category in the case of classifier. … the input object that may be considered by the machine learning model 200 in making its decision may be referred to as features);
receiving, by the one or more processors, a programming code with the same naming convention as the determined naming convention at the trained machine learning model (Smith, Fig. 5, ¶ 78, relevant code snippets are determined by machine learning model 200 … typing a particular token for a class automatically selects as potential code completions the sub-methods or variables of the class. In another example, typing the name of a function automatically selects as a potential code completion a code snippet representing a pattern for making a call to the class); and 
generating, by the one or more processors, a name recommendation for one or more functions with the trained machine learning model (Smith, Fig. 5, ¶ 108, the features are used as inputs to a machine learning model 200 in later steps in order to perform prediction; Fig. 6B, ¶ 123, Upon user selection of one of the choices for refactoring, the correct refactoring [a name recommendation] is determined in step 604 by machine learning model 200). 
Smith does not appear to explicitly disclose feature vectors and generating, by the one or more processors, feature vectors for the docstrings associated with the plurality of functions and the plurality of classes; combining, by the one or more processors, each feature vector of the docstrings with the corresponding feature vector of the source code in the at least one of the functions and the class; and training, by the one or more processors, a machine learning model to map the combined feature vectors. However, in an analogous art to the claimed invention in the field of processing software, Gupta teaches:
generating, by the one or more processors, feature vectors for encoded representations (Gupta, Fig. 3, ¶ 33, feature vector generation engine 308; ¶ 81, generating encoded representations for each feature using a plurality of neural networks, combining the encoded representations of each feature into a respective feature vector);
generating, by the one or more processors, feature vectors for the docstrings associated with the plurality of functions and the plurality of classes (Gupta, ¶ 82, generating a relevant training dataset and a non-relevant training dataset, the relevant training dataset including feature vectors); 
combining, by the one or more processors, each feature vector of the docstrings with the corresponding feature vector of the source code in the at least one of the functions and the class (Gupta, Fig. 5, ¶ 82, training the deep learning model with feature vectors from the relevant training dataset and feature vectors from the non-relevant training dataset); 
training, by the one or more processors, a machine learning model to map the combined feature vectors to code snippet (Gupta, Fig. 5, ¶ 60,  The relevant and non-relevant training datasets are used to train and test the deep learning model; Fig. 6, The deep learning model generates a probability for each review indicating the likelihood of the relevance of the review to the code snippet (block 612)).
Therefore, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention having the teaching of Smith and Gupta before him/her to modify Smith’s system to include generating, by the one or more processors, feature vectors for the source code of the plurality of functions and the plurality of classes; generating, by the one or more processors, feature vectors for the docstrings associated with the plurality of functions and the plurality of classes; combining, by the one or more processors, each feature vector of the docstrings with the corresponding feature vector of the source code in the at least one of the functions and the class; and training, by the one or more processors, a machine learning model through regression to map the combined feature vectors to the corresponding tokenized function names, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to automatically review source code from source code files by generating a corresponding encoded representation of each feature extracted from the source code files and the features are then assembled into feature vectors, which are then input into the deep learning model to generate a desired result (Gupta, ¶ ¶ 59-63).

Referring to Claim 5, the rejection of Claim 1 is incorporated. Smith as modified further discloses wherein generating feature vectors for the source code of the plurality of functions and the plurality of classes is performed by a code encoder (Gupta, Fig. 6, ¶ 62, The code snippet is paired with each of the reviews closely matching the features of the code snippet into respective LSTMs to generate encoded representation of the features (block 608). The encoded representations of the code snippet, upper context, lower context, and review are then combined into a single feature vector (block 610) which is then input into the deep learning model). The motivation to combine the references is the same as set forth in the rejection of Claim 1.

Referring to Claim 7, the rejection of Claim 1 is incorporated. Smith as modified further discloses wherein combining each feature vector of the docstrings with the corresponding feature vector of the source code in the at least one of the functions and the class is taking the average of the corresponding feature vector (Gupta, Fig. 6, ¶ 62, The code snippet is paired with each of the reviews closely matching the features of the code snippet into respective LSTMs to generate encoded representation of the features (block 608). The encoded representations of the code snippet, upper context, lower context, and review are then combined into a single feature vector (block 610) which is then input into the deep learning model. The deep learning model generates a probability for each review indicating the likelihood of the relevance of the review to the code snippet (block 612). Reviews are then selected based on the probability score and a target objective). The motivation to combine the references is the same as set forth in the rejection of Claim 1.
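For illustration only, the claimed operation of combining a docstring feature vector with the corresponding source code feature vector by taking their average can be sketched as follows. This is a minimal sketch under stated assumptions: the function names average_vectors and concatenate_vectors and the sample values are hypothetical, and the sketch is not taken from Gupta, which describes combining encoded representations into a single feature vector without specifying this arithmetic.

```python
def average_vectors(doc_vec: list[float], code_vec: list[float]) -> list[float]:
    """Element-wise average of two equal-length feature vectors."""
    if len(doc_vec) != len(code_vec):
        raise ValueError("feature vectors must have the same length")
    return [(d + c) / 2.0 for d, c in zip(doc_vec, code_vec)]


def concatenate_vectors(doc_vec: list[float], code_vec: list[float]) -> list[float]:
    """Alternative combination shown for contrast: simple concatenation."""
    return list(doc_vec) + list(code_vec)


docstring_features = [1.0, 3.0]  # hypothetical docstring feature vector
code_features = [3.0, 5.0]       # hypothetical source code feature vector

print(average_vectors(docstring_features, code_features))      # -> [2.0, 4.0]
print(concatenate_vectors(docstring_features, code_features))  # -> [1.0, 3.0, 3.0, 5.0]
```

Averaging keeps the combined vector at the same dimension as its inputs, whereas concatenation doubles it; either yields a single feature vector suitable as model input.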

Referring to claims 8, 12, and 14, the claims are system claims corresponding to the method claims 1, 5, and 7. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Referring to claims 15 and 19, the claims are product claims corresponding to the method claims 1 and 5. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Claims 2-3, 9-10, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over US 2020/0097261 (hereinafter “Smith”) in view of US 2019/0228319 (hereinafter “Gupta”) and further in view of US 2021/0240825 (hereinafter “Kutt”).

Referring to Claim 2, the rejection of Claim 1 is incorporated. Smith as modified does not appear to explicitly disclose wherein identifying function names and class names is based on an abstract syntax tree. However, in an analogous art to the claimed invention in the field of electric digital data processing, Kutt teaches wherein identifying function names and class names is based on an abstract syntax tree (Kutt, Fig. 6A, ¶ 93, different source code representations are collected (e.g., the different representations of the JS source code file includes an Abstract Syntax Tree (AST), characters (Char), tokens (Token), and (optionally) hand-crafted features)).
Therefore, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention having the teaching of Smith as modified and Kutt before him/her to modify Smith’s modified system to include identifying function names and class names based on an abstract syntax tree, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to collect different source code representations including an Abstract Syntax Tree (AST), characters (Char), tokens (Token) for performing multi-representational learning applied in the process of static analysis of source code (Kutt, ¶ 93).
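For illustration only, identifying function names and class names from an abstract syntax tree can be sketched with the ast module of the Python standard library. This is a minimal sketch under stated assumptions: Kutt discusses an AST of a JS source code file, so Python here is a stand-in language, and the helper collect_names and the sample source are hypothetical.

```python
import ast


def collect_names(source: str) -> tuple[list[str], list[str]]:
    """Return (function names, class names) found in Python source text."""
    tree = ast.parse(source)
    functions, classes = [], []
    # ast.walk visits every node in the tree, including nested definitions.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            classes.append(node.name)
    # Sort so the result does not depend on traversal order.
    return sorted(functions), sorted(classes)


# Hypothetical sample source; it is parsed, never executed.
SAMPLE = '''
class FileReader:
    def read_file(self, file_name):
        return open(file_name).read()

def main():
    pass
'''

function_names, class_names = collect_names(SAMPLE)
print(function_names)  # -> ['main', 'read_file']
print(class_names)     # -> ['FileReader']
```

Because the names are read from definition nodes of the tree rather than from raw text, comments and string literals that happen to contain the word "def" or "class" are not mistaken for definitions.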

Referring to Claim 3, the rejection of Claim 1 is incorporated. Smith as modified further discloses wherein tokenizing function names and class names is performed by an n-gram generator model (Kutt, claim 10, transmit a copy of the received file to a security platform and perform the n-gram analysis). The motivation to combine the references is the same as set forth in the rejection of Claim 2.
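For illustration only, the generation of n-grams from a sequence can be sketched as a sliding window. This is a minimal sketch under stated assumptions: the helper ngrams and the sample inputs are hypothetical, and the sketch reproduces neither the n-gram analysis recited in Kutt's claim 10 nor any n-gram generator model of Applicant's disclosure.

```python
def ngrams(items, n: int) -> list[tuple]:
    """Return all contiguous runs of n items as tuples (sliding window)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # An input shorter than n yields an empty list.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams over the word tokens of a function name.
print(ngrams(["read", "file", "name"], 2))
# -> [('read', 'file'), ('file', 'name')]

# Character-level trigrams over a single identifier.
print(ngrams("read", 3))
# -> [('r', 'e', 'a'), ('e', 'a', 'd')]
```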

Referring to Claims 9-10, the rejection of Claim 8 is incorporated, and the claims correspond to the method claims 2-3. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Referring to Claims 16 and 17, the rejection of Claim 15 is incorporated, and the claims correspond to the method claims 2-3. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Claims 4, 6, 11, 13, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2020/0097261 (hereinafter “Smith”) in view of US 2019/0228319 (hereinafter “Gupta”) and further in view of US 2020/0175478 (hereinafter “Lee”).

Referring to Claim 4, the rejection of Claim 1 is incorporated. Smith as modified does not appear to explicitly disclose wherein the machine learning model is a sequence to sequence model. However, in an analogous art to the claimed invention in the field of electric digital data processing, Lee teaches wherein the machine learning model is a sequence to sequence model (Lee, ¶ 23, one or more of the following may be applied to a received message in performing coarse and/or fine-grained processing of that message … general sequence-to-sequence models, generative models, recurrent neural networks for feature extractors (e.g., long short-term memory models, GRU models), deep neural network models for word-by-word classification, and latent variable graphical models).
Therefore, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention having the teaching of Smith as modified and Lee before him/her to modify Smith’s modified system to include that the machine learning model is a sequence to sequence model, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to identify a subset of sentences of the plurality of sentences for assisting with scheduling a meeting (Lee, Abstract).

Referring to Claim 6, the rejection of Claim 1 is incorporated. Smith as modified further discloses wherein generating feature vectors for the docstrings associated with the plurality of functions and the plurality of classes is performed by a sentence encoder (Lee, ¶ 37, each of those words, phrases, and/or sentences may have a relatively high encoding score associated with it. As such, the digital assistant service may pass each of those words, phrases, and/or sentences along for processing through application of a conditional random field model). The motivation to combine the references is the same as set forth in the rejection of Claim 4.

Referring to Claims 11 and 13, the rejection of Claim 8 is incorporated, and the claims correspond to the method claims 4 and 6. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Referring to Claims 18 and 20, the rejection of Claim 15 is incorporated, and the claims correspond to the method claims 4 and 6. Therefore, they are rejected under the same rationale set forth in the rejection of the method claims.

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a). Applicant is reminded of the extension of time policy set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
US 2021/0234930 teaches tokenized information including one or more of: keywords associated with code characteristics of the microservice; variables and/or function names used by the microservice;
US 2021/0120034 teaches parsing the JavaScript code by tokenizing words, and tokenizing delimiters, removing comments indicated by “// . . . ,” and replacing variable/function names with standardized names; and
US 2020/0183681 teaches identifying code words of the source code by performing variable name and function name tokenization.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAXIN WU whose telephone number is (571) 270-7721. The examiner can normally be reached on M-F (7 am - 11:30 am; 1:30 pm - 5 pm).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen can be reached at (571) 272-3708.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/DAXIN WU/
Primary Examiner, Art Unit 2191