Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to a request for continued examination filed 9/10/21.
Claims 1, 4-6, 8, 11-13, 15 and 18-20 are pending.

Response to Arguments
35 U.S.C. §112
The applicant’s amendments are sufficient to overcome the previous rejections, which are consequently withdrawn.

35 U.S.C. §103
Applicant’s arguments have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-6, 8, 11-13, 15 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2019/004959 to Chen (Chen) in view of US 10,503,908 to Bellis et al. (Bellis) in view of “A Feature Extraction Method Based on Stacked Denoising Autoencode for Massive High Dimensional Data” by Xu et al. (Xu) in view of US 2019/0079741 to Makkar (Makkar).

Claims 1, 8 and 15: Chen discloses a system for cross-technology code analysis, the system comprising: 
at least one memory device (Fig. 4, Main Memory 404) with computer-readable program code stored thereon (Fig. 4, Instructions 408); 
at least one communication device (Fig. 4, Network Interface Controller 418); 
at least one processing device operatively coupled to the at least one memory device and the at least one communication device (Fig. 4, Processor 124), wherein executing the computer-readable code is configured to cause the at least one processing device to: 

autoencode the program data to obtain encoded program data, wherein the encoded program data comprises a numerical representation of the program data (par. [0022] “use of a DNN, such as the application of a sparse autoencoder”);
vectorize the encoded program data, wherein vectorizing the program code comprises converting the encoded program data into a vector containing multiple vector dimensions (par. [0018] “a square similarity matrix … N rows … N columns …”); 
compare vectorized program code of the first program and vectorized program code of one or more additional programs and calculate a mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs (par. [0017] “quantitatively determines a measure of similarity between each pair of call graphs … a matching metric, or distance”); 
determine that the mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs is sufficiently small (par. [0018] “a distance of “0” refers to complete identity”, par. [0026] “k-means clustering may then be performed”); and 
cluster the vectorized program code of the first program and vectorized program code of one or more additional programs based on determining that the mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs is sufficiently small (par. [0023] 
identification of the first program and of the one or more additional programs (see e.g. par. [0024] “grouping the given program code set with other program codes [sic] sets that are recognized as being malicious”);
an executable portion configured to notify a user of issues related to a particular function or particular program feature (par. [0013] “determine … that another program code set under evaluation shares features in common with … codes sets that are recognized as being malicious … notifying a system administrator”); and
based on the one or more vectorized program clusters, calculate a similarity for the first program and the one or more additional programs (par. [0023] “groups, or clusters, the program code sets”); wherein the similarity includes identifying inter-program dependency between programs of the vectorized program cluster based on known dependency between programs (par. [0016] “extract … “call graphs” … to compactly represent the relationship among subroutines or functions”); and
identification of the first program and the one or more additional programs (par. [0013] "notifying a system administrator, ... sending electronic communications to users").
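For illustration only, the pairwise distance comparison mapped above (Chen par. [0017]-[0018]) can be sketched as follows. This sketch is not taken from Chen or from the application: the Euclidean metric, the example vectors, and the name `distance_matrix` are assumptions chosen to show an N-row by N-column matrix in which a distance of 0 indicates identical vectors.

```python
import math

def distance_matrix(vectors):
    """Return an N x N matrix of pairwise Euclidean distances.

    Assumption: each program is already represented as a numeric vector
    of equal length. A distance of 0.0 means the two vectors are identical.
    """
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical vectorized programs; the first and third are identical.
example_vectors = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
matrix = distance_matrix(example_vectors)
```

In this example `matrix[0][2]` is 0.0 because the two vectors are identical, while `matrix[0][1]` is 5.0.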

Chen does not disclose: 

retrieving the identification of the first program and the one or more additional programs based on the attached metadata.

Bellis teaches:
attaching metadata to a first program and one or more additional programs, wherein attached metadata comprises an identification of the first program and of the one or more additional programs (col. 2, lines 49-53 “the set of metadata includes a program name”); and
retrieving the identification of the first program and the one or more additional programs based on the attached metadata (col. 2, lines 49-53 “extracts at least the program name”).
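As a minimal sketch of the two limitations mapped to Bellis above, attaching metadata that includes a program name and later retrieving that name might look like the following. The class `VectorizedProgram`, the key `"program_name"`, and both helper functions are hypothetical names introduced for illustration; they do not appear in Bellis or in the application.

```python
from dataclasses import dataclass, field

@dataclass
class VectorizedProgram:
    """Hypothetical container pairing a program vector with its metadata."""
    vector: list
    metadata: dict = field(default_factory=dict)

def attach_metadata(program, program_name):
    # Store an identification of the program alongside its vector.
    program.metadata["program_name"] = program_name
    return program

def retrieve_identification(program):
    # Extract the program name from the attached metadata, if present.
    return program.metadata.get("program_name")

tagged = attach_metadata(VectorizedProgram([0.2, 0.7]), "payroll_batch")
```

Here `retrieve_identification(tagged)` returns the name that was attached, and returns `None` for a program with no metadata.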

It would have been obvious at the time of filing to attach metadata to the vectorized program code (Chen par. [0018] “a square similarity matrix”) comprising identification of the programs (Bellis col. 2, lines 49-53 “the set of metadata includes a program name”) and to retrieve these identifications (Bellis col. 2, lines 49-53 “extracts at least the program name”). Those of ordinary skill in the art would have been motivated to do so in order to properly identify the respective programs (e.g. Chen par. [0024] “grouping the given program with other program codes [sic] sets that are recognized as being malicious”, par. [0013] “notifying a system administrator … sending communications to user associated with the program code set”).

Chen and Bellis do not teach:
wherein autoencoding the program data further comprises:
manipulating the program data by adding noise data to the program data resulting in artificially corrupted data;
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data; and
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module, the trained autoencoding module comprising multiple layers of autoencoding and decoding that are executed simultaneously;

Xu teaches:
wherein autoencoding the program data further comprises:
manipulating the program data by adding noise data to the program data resulting in artificially corrupted data (pg. 208, col. 1, 1st par. “Step1: Add artificial noise”);
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data (pg. 207, col. 2, 
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module, the trained autoencoding module comprising multiple layers of autoencoding and decoding (pg. 208, col. 1, 2nd par. “Step2: If there is a next layer … used as the input of the second hidden layer”) that are executed simultaneously (pg. 207, Fig. 1 note that this shows several encoders and several decoders executing in parallel);
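The noise-adding, encoding, decoding, and repetition steps listed above can be illustrated with a small single-layer denoising autoencoder. This is a sketch under stated assumptions, not Xu's method or the claimed implementation: the tanh encoder, linear decoder, Gaussian noise, per-sample gradient descent, and all parameter values are assumptions, and the stacking of multiple layers described in Xu is not shown.

```python
import math
import random

def train_denoising_autoencoder(data, hidden=2, noise_std=0.05,
                                lr=0.1, epochs=300, seed=0):
    """Train one encoder/decoder pair to rebuild clean rows from noisy rows."""
    rng = random.Random(seed)
    d = len(data[0])
    w_enc = [[rng.gauss(0, 0.3) for _ in range(hidden)] for _ in range(d)]
    w_dec = [[rng.gauss(0, 0.3) for _ in range(d)] for _ in range(hidden)]
    losses = []
    for _ in range(epochs):  # repeat encoding and decoding
        total = 0.0
        for clean in data:
            # Add noise to obtain artificially corrupted data.
            noisy = [x + rng.gauss(0, noise_std) for x in clean]
            # Encode the corrupted row, then decode it.
            code = [math.tanh(sum(noisy[i] * w_enc[i][j] for i in range(d)))
                    for j in range(hidden)]
            recon = [sum(code[j] * w_dec[j][k] for j in range(hidden))
                     for k in range(d)]
            # Error is measured against the clean (uncorrupted) row.
            err = [recon[k] - clean[k] for k in range(d)]
            total += sum(e * e for e in err) / d
            # Backpropagate the squared error (constant factors folded into lr).
            grad_code = [sum(err[k] * w_dec[j][k] for k in range(d))
                         * (1 - code[j] ** 2) for j in range(hidden)]
            for j in range(hidden):
                for k in range(d):
                    w_dec[j][k] -= lr * err[k] * code[j]
            for i in range(d):
                for j in range(hidden):
                    w_enc[i][j] -= lr * noisy[i] * grad_code[j]
        losses.append(total / len(data))
    return w_enc, w_dec, losses
```

The returned `losses` list records the mean reconstruction error per pass, so a falling trend indicates that the decoded output is moving toward the original, uncorrupted input.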

It would have been obvious at the time of filing to add noise to the program data (Xu pg. 208, col. 1, 2nd par. “Add artificial noise”) to repeatedly encode and decode the program data (pg. 208, col. 2, 2nd par. “a next layer”). Those of ordinary skill in the art would have been motivated to do so as a known alternate feature extraction method which would have produced only the expected results (e.g. more accurate feature extraction see e.g. Xu pg. 209 last par. “this feature extraction method is more effective”).

Chen, Bellis and Xu do not explicitly teach:
determining that the mathematical distance between the programs is below a threshold value;
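based on the one or more vectorized program clusters, calculate a storage requirement for the first program and the one or more additional programs; and
generate a recommendation for reducing functional redundancy and the storage requirement for the first program and the one or more additional program within the vectorized program cluster.
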
Makkar teaches:
determining a mathematical distance between programs is below a threshold (par. [0029] “each source code vector 23 may be compared for similarity … using a similarity threshold value”);
based on the one or more vectorized program clusters, calculate a storage requirement for the first program and the one or more additional programs (par. [0070] “the recommended library function will reduce the size of the source code by four lines”, requires calculation of storage requirements for each); and
generate a recommendation for reducing functional redundancy and the storage requirement for the first program and the one or more additional program within the vectorized program cluster (par. [0033] “Once the matching library functions 26 are identified … present library function recommendations 27 … for swapping the validated code snippets 25 with the matching library functions 26”, par. [0070] “the recommended library function will reduce the size of the source code by four lines”).
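For illustration, the threshold comparison and size-reduction limitations mapped to Makkar above can be sketched as follows. The grouping rule (programs are joined whenever their distance falls below the threshold), the line counts, and the savings heuristic (one copy per group is retained) are assumptions made for this sketch; `cluster_by_threshold` and `estimate_line_savings` are hypothetical names and are not drawn from Makkar.

```python
import math
from itertools import combinations

def cluster_by_threshold(vectors, threshold):
    """Group program names whose vectors are closer than `threshold`."""
    parent = {name: name for name in vectors}

    def find(name):
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(vectors, 2):
        if math.dist(vectors[a], vectors[b]) < threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in vectors:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

def estimate_line_savings(clusters, line_counts):
    """Assumed heuristic: keep the largest program per group, count the rest."""
    savings = 0
    for members in clusters:
        sizes = [line_counts[m] for m in members]
        savings += sum(sizes) - max(sizes)
    return savings

# Hypothetical program vectors and source sizes in lines.
vectors = {"prog_a": [0.0, 0.0], "prog_b": [0.1, 0.0], "prog_c": [5.0, 5.0]}
clusters = cluster_by_threshold(vectors, threshold=0.5)
saved = estimate_line_savings(clusters, {"prog_a": 40, "prog_b": 36, "prog_c": 90})
```

With these example values, `prog_a` and `prog_b` fall in one group and `prog_c` stands alone, so the estimated saving is the 36 lines of `prog_b`.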



Further, Chen does not explicitly disclose an executable portion configured to display a query option via a graphical user interface wherein the query option allows a user to query the system for a particular function or particular program feature to view the one or more vectorized program clusters.

Makkar teaches an executable portion configured to display a query option via a graphical user interface (Fig. 4A, user interface 400) wherein the query option allows a user to query the system for a particular function or particular program feature to view the program (par. [0071] “user interface controls … the developer may cause [the display of] a code reduction opportunity for the selected input source code file”, see e.g. Fig. 4B). 



Claims 4, 11 and 18: Chen, Bellis, Xu and Makkar teach claims 1, 8 and 15, further comprising identifying redundancy and functional similarity of program attributes between the first program and the one or more additional programs based on the calculated mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs (Chen par. [0017] “a measure of similarity between each pair of call graphs”, par. [0018] “a distance of “0” refers to complete identity”, here a “0” distance indicates redundancy between the two call graphs, optimizations based on this identification are addressed below in view of Makkar).

Claims 5, 12 and 19: Chen, Bellis, Xu and Makkar teach claims 4, 11 and 18, further comprising calculating possible reduction in storage requirements based on the identified redundancy of program attributes (Chen par. [0023] “groups, or clusters, the program code sets, pursuant to 

Claims 6, 13 and 20: Chen, Bellis, Xu and Makkar teach claims 5, 12 and 19, further comprising providing recommendations for storage reduction based on the calculated possible reduction in storage requirements (Makkar par. [0070] “the recommended library function will reduce the size of the source code by four lines”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 2012/0240236 to Wyatt et al. discloses additional uses of metadata including an identification of a program.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D MITCHELL whose telephone number is (571)272-3728. The examiner can normally be reached on Monday through Thursday 7:00am - 4:30pm and alternate Fridays 7:00am - 3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached on (571)272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/JASON D MITCHELL/Primary Examiner, Art Unit 2199