Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to amendments filed 4/23/21.
Claims 1-6, 8-13 and 15-20 are pending.

Response to Arguments
A. 35 U.S.C. §101
The applicant’s arguments (e.g. pg. 13, 2nd full par.) that, e.g., the claimed “autoencoding” could not be performed in the human mind are persuasive. Accordingly, the rejection(s) are withdrawn.

B. 35 U.S.C. §103
Applicant's arguments have been fully considered but they are not persuasive.
With regard to the amended recitation of repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module, the trained autoencoding module comprising multiple layers of autoencoding and decoding that are executed simultaneously, the Office cites Xu as teaching a portion of this recitation. (Office Action, page 14). However, Xu does not appear to explicitly teach or suggest the trained autoencoding module comprising multiple layers of autoencoding and decoding that are executed simultaneously. In fact, Xu specifically states at multiple points throughout its disclosure that the SDAE used trains each layer of neural network by unsupervised “layer-by-layer.” (See Xu, Abstract, pages 208-09).

Initially it is noted that the claim is directed to “executing” the autoencoding and decoding simultaneously and not training the layers simultaneously. Further Xu’s figure 1 shows a 
Further, applicant’s specification does not appear to provide additional detail regarding this simultaneous execution. Instead all that is disclosed is:
[0007] In some embodiments, the trained autoencoding module further comprises multiple layers of autoencoding and decoding that are executed simultaneously.
[0046] … the autoencoding module 300 involves a stacked denoising autoencoding process, which is used to improve the speed of the denoising autoencoding process by stacking several layers of autoencoding that may be executed simultaneously. … 

Neither this disclosure nor, for example, the discussion of Fig. 3 appears to enable any modification to a prior art SDAE that would provide additional functionality or flexibility enabling (previously unknown) simultaneous execution. Accordingly, applicant’s arguments and amendments are not persuasive of a non-obvious distinction over the cited prior art.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 2-3, 9-10 and 16-17 are rejected under 35 U.S.C. 112(d) or pre-AIA  35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. 
Specifically, the applicant’s amendments filed 4/23/21 appear to have included the limitations of claims 2-3, 9-10 and 16-17 into independent claims 1, 8 and 15 without canceling or amending the original dependent claims. Accordingly, claims 2-3, 9-10 and 16-17 fail to further limit amended claims 1, 8 and 15. 
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-6, 8, 11-13, 15 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2019/004959 to Chen (Chen) in view of US 2019/0079741 to Makkar (Makkar).

Claims 1, 8 and 15: Chen discloses a system for cross-technology code analysis, the system comprising: 
at least one memory device (Fig. 4, Main Memory 404) with computer-readable program code stored thereon (Fig. 4, Instructions 408); 
at least one communication device (Fig. 4, Network Interface Controller 418); 
at least one processing device operatively coupled to the at least one memory device and the at least one communication device (Fig. 4, Processor 124), wherein executing the computer-readable code is configured to cause the at least one processing device to: 
receive program data of a first program for analysis (par. [0016] “code feature extraction engine 164 processes (block 204) … program code sets”); 
autoencode the program data to obtain encoded program data, wherein the encoded program data comprises a numerical representation of the program data (par. [0022] “use of a DNN, such as the application of a sparse autoencoder”);
vectorize the encoded program data, wherein vectorizing the program code comprises converting the encoded program data into a vector containing multiple vector dimensions (par. [0018] “a square similarity matrix … N rows … N columns …”); 
compare vectorized program code of the first program and vectorized program code of one or more additional programs and calculate a mathematical distance 
determine that the mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs is sufficiently small (par. [0018] “a distance of “0” refers to complete identity”, par. [0026] “k-means clustering may then be performed”); and 
cluster the vectorized program code of the first program and vectorized program code of one or more additional programs based on determining that the mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs is sufficiently small (par. [0023] “groups, or clusters, the program code sets, pursuant to block 216”, par. [0018] “a distance of “0” refers to complete identity”, par. [0026] “k-means clustering may then be performed”), producing one or more vectorized program clusters (par. [0023] “groups, or clusters, the program code sets”); and
an executable portion configured to notify a user of issues related to a particular function or particular program feature (par. [0013] “determine … that another program code set under evaluation shares features in common with … codes sets that are recognized as being malicious … notifying a system administrator”); and
based on the one or more vectorized program clusters, calculate a similarity for the first program and the one or more additional programs (par. [0023] “groups, or 

Chen does not disclose:
wherein autoencoding the program data further comprises:
manipulating the program data by adding noise data to the program data resulting in artificially corrupted data;
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data; and
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module, the trained autoencoding module comprising multiple layers of autoencoding and decoding that are executed simultaneously;

Xu teaches:
wherein autoencoding the program data further comprises:
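manipulating the program data by adding noise data to the program data resulting in artificially corrupted data (pg. 208, col. 1, 1st par. “Step1: Add artificial noise”);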
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data (pg. 207, col. 2, last par. “after it has been coded and decoded, an uncontaminated origin sample information is finally restored”); and
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module, the trained autoencoding module comprising multiple layers of autoencoding and decoding (pg. 208, col. 1, 2nd par. “Step2: If there is a next layer … used as the input of the second hidden layer”) that are executed simultaneously (pg. 207, Fig. 1 note that this shows several encoders and several decoders executing in parallel);

It would have been obvious at the time of filing to add noise to the program data (Xu pg. 208, col. 1, 2nd par. “Add artificial noise”) and to repeatedly encode and decode the program data (pg. 208, col. 2, 2nd par. “a next layer”). Those of ordinary skill in the art would have been motivated to do so as a known alternate feature extraction method which would have produced only the expected results (e.g. more accurate feature extraction; see e.g. Xu pg. 209, last par. “this feature extraction method is more effective”).
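For illustration only, the denoising steps discussed above (add noise, encode, decode, and repeat until the decoded output approaches the uncorrupted input) can be sketched as follows. This is a minimal sketch under stated assumptions and is not asserted to be the method of Xu or of the applicant: it assumes Gaussian noise, a single linear layer with tied weights and one hidden unit, a mean-squared-error convergence test, and hypothetical names (`corrupt`, `encode`, `decode`, `train_denoising_autoencoder`) and parameter values chosen only for the example.

```python
import random


def corrupt(x, sigma, rng):
    """Add Gaussian noise to each component (the artificially corrupted data)."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def encode(w, x):
    """Encode a vector as one number: the dot product with the weights."""
    return sum(wi * xi for wi, xi in zip(w, x))


def decode(w, h):
    """Decode the number back into a vector, reusing the same (tied) weights."""
    return [wi * h for wi in w]


def train_denoising_autoencoder(data, sigma=0.05, lr=0.05, tol=0.01,
                                max_epochs=500, seed=0):
    """Repeat corrupt/encode/decode until the output is close to the clean input."""
    rng = random.Random(seed)
    w = [0.5] * len(data[0])  # assumed starting weights, for illustration only
    history = []              # mean squared error per epoch
    for _ in range(max_epochs):
        total = 0.0
        for x in data:
            noisy = corrupt(x, sigma, rng)
            h = encode(w, noisy)
            out = decode(w, h)
            # The error is measured against the CLEAN input, not the noisy one.
            err = [o - xi for o, xi in zip(out, x)]
            total += sum(e * e for e in err)
            # Gradient of the squared error with respect to the tied weights.
            ew = sum(e * wi for e, wi in zip(err, w))
            grad = [2.0 * (h * e + ew * n) for e, n in zip(err, noisy)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
        history.append(total / len(data))
        if history[-1] < tol:  # decoded output has converged on the input
            break
    return w, history


# Toy "program data": points along one direction in a 2-dimensional space.
data = [[t * 0.6, t * 0.8] for t in (-1.0, -0.5, 0.5, 1.0)]
weights, history = train_denoising_autoencoder(data)
```

A stacked arrangement of the kind discussed with respect to Xu would repeat this procedure with the encoded output of one layer serving as the input of the next; the sketch deliberately shows a single layer.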

Chen does not explicitly disclose:
determining that the mathematical distance between the programs is below a threshold value;
based on the one or more vectorized program clusters, calculate a storage requirement for the first program and the one or more additional programs; and
generate a recommendation for reducing functional redundancy and the storage requirement for the first program and the one or more additional program within the vectorized program cluster, wherein the recommendation includes identifying inter-program dependency between programs of the vectorized program cluster based on known dependency between programs. 

Makkar teaches:
determining a mathematical distance between programs is below a threshold (par. [0029] “each source code vector 23 may be compared for similarity … using a similarity threshold value”);
based on the one or more vectorized program clusters, calculate a storage requirement for the first program and the one or more additional programs (par. [0070] “the recommended library function will reduce the size of the source code by four lines”, requires calculation of storage requirements for each); and
generate a recommendation for reducing functional redundancy and the storage requirement for the first program and the one or more additional program within the 

It would have been obvious at the time of filing to cluster the program code based on a determination that the mathematical distance is below a threshold value (Chen par. [0023] “groups, or clusters, the program code sets”, Makkar par. [0029] “using a similarity threshold value”) and to generate a recommendation for reducing functional redundancy and the storage requirement (Makkar par. [0070] “the recommended library function will reduce the size of the source code by four lines”) based on known dependency between programs (Chen par. [0016] “the relationship among subroutines or functions”). Those of ordinary skill in the art would have been motivated to do so as a known alternative means of performing the clustering which would have produced only the expected results (e.g. reducing complexity of the algorithm) and would have improved code reuse and maintainability (see e.g. Makkar par. [0003]). 
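For illustration only, clustering vectorized program code on the basis of a distance falling below a threshold can be sketched as follows. This is a minimal sketch under stated assumptions and is not asserted to be the procedure of Chen (which refers to k-means clustering) or of Makkar: it assumes Euclidean distance, single-linkage grouping, and hypothetical names (`distance`, `cluster_by_threshold`, `prog_a`, and so on) and vector values invented for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_threshold(vectors, threshold):
    """Group program names linked by a pairwise distance below the threshold."""
    names = list(vectors)
    parent = {name: name for name in names}  # union-find forest

    def find(name):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(vectors[a], vectors[b]) < threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical vectorized program code for three programs.
vectors = {
    "prog_a": [0.0, 0.0, 1.0],
    "prog_b": [0.1, 0.0, 1.0],
    "prog_c": [5.0, 5.0, 0.0],
}
clusters = cluster_by_threshold(vectors, threshold=0.5)
```

With these assumed values, `prog_a` and `prog_b` fall within the threshold of one another and are grouped together, while `prog_c` forms its own cluster.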

Further, Chen does not explicitly disclose an executable portion configured to display a query option via a graphical user interface wherein the query option allows a user to query the system for a particular function or particular program feature to view the one or more vectorized program clusters.



It would have been obvious to provide a query option (e.g. Makkar par. [0071] “user interface controls”) allowing the user to query the system for a program feature (Chen par. [0013] “determine … another program code set … notifying a system administrator”, Makkar par. [0070] “issue has been detected”). Those of ordinary skill in the art would have been motivated to do so as a known alternate means of “notifying an administrator” of the determined issues (Chen par. [0013] “shares features in common with … codes sets that are recognized as being malicious … notifying a system administrator”, Makkar par. [0070] “issue has been detected”) which would have provided “additional information about the … library function” (Makkar par. [0071]).

Claims 4, 11 and 18: Chen and Makkar teach claims 1, 8 and 15, further comprising identifying redundancy and functional similarity of program attributes between the first program and the one or more additional programs based on the calculated mathematical distance between the vectorized program code for the first program and the vectorized program code for the one or more additional programs (Chen par. [0017] “a measure of similarity between each pair of call graphs”, par. [0018] “a distance of “0” refers to complete identity”, here a “0” distance 

Claims 5, 12 and 19: Chen and Makkar teach claims 4, 11 and 18, further comprising calculating possible reduction in storage requirements based on the identified redundancy of program attributes (Chen par. [0023] “groups, or clusters, the program code sets, pursuant to block 216”, Makkar par. [0033] “Once the matching library functions 26 are identified … present library function recommendations 27”).

Claims 6, 13 and 20: Chen and Makkar teach claims 5, 12 and 19 further comprising providing recommendations for storage reduction based on the calculated possible reduction in storage requirements (Makkar par. [0070] “the recommended library function will reduce the size of the source code by four lines”).

Claims 2-3, 9-10 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over US 2019/004959 to Chen (Chen) in view of US 2019/0079741 to Makkar (Makkar) in view of “A Feature Extraction Method Based on Stacked Denoising Autoencode for Massive High Dimensional Data” by Xu et al. (Xu). 

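Claims 2, 9 and 16: Chen and Makkar teach claims 1, 8 and 15, but do not teach wherein the autoencoding of program data further comprises:
manipulating the program data by adding noise data to the program data resulting in artificially corrupted data;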
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data; and 
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module.

Xu teaches autoencoding of data further comprises:
manipulating the data by adding noise data to the data resulting in artificially corrupted data (pg. 208, col. 1, 1st par. “Step1: Add artificial noise”); 
encoding the artificially corrupted data and decoding the artificially corrupted data, wherein decoding the artificially corrupted data further includes removing the added noise data to obtain decoded output data (pg. 207, col. 2, last par. “after it has been coded and decoded, an uncontaminated origin sample information is finally restored”); and 
repeating the encoding and decoding of artificially corrupted data until the decoded output data converges on the value of the received program data, resulting in a trained autoencoding module (pg. 208, col. 1, 2nd par. “Step2: If there is a next layer … used as the input of the second hidden layer”).

It would have been obvious at the time of filing to add noise to the program data (Xu pg. 208, col. 1, 2nd par. “Add artificial noise”) and to repeatedly encode and decode the program data (pg. 208, col. 2, 2nd par. “a next layer”). Those of ordinary skill in the art would have been motivated to do so as a known alternate feature extraction method which would have produced only the expected results (e.g. more accurate feature extraction; see e.g. Xu pg. 209, last par. “this feature extraction method is more effective”).

Claims 3, 10 and 17: Chen, Makkar and Xu teach claims 2, 9 and 16, wherein the trained autoencoding module further comprises multiple layers of autoencoding and decoding that are executed simultaneously (see e.g. Xu, pg. 207, Fig. 1).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 2016/0093048 to Cheng et al. and US 10,664,999 to Gupta et al. disclose autoencoding/decoding layers executed simultaneously (see e.g. Fig. 1B and col. 16, line 51-col. 17, line 16, respectively). 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D MITCHELL whose telephone number is (571)272-3728.  The examiner can normally be reached on Monday through Thursday 7:00am - 4:30pm and alternate Fridays 7:00am - 3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock can be reached on (571)272-3759.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-






/JASON D MITCHELL/Primary Examiner, Art Unit 2199