Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This communication is in response to Applicant's Amendment filed 8/25/2022.  Applicant has cancelled claims 1-33, and amended claims 34-54.  Currently, claims 34-54 are pending in the application.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 6/03/2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Applicant is respectfully reminded of the duty under 37 C.F.R. 1.56 to disclose all pertinent information and material pertaining to the patentability of applicant’s claimed invention, by continuing to submit in a timely manner PTO-1449, Information Disclosure Statement (IDS), with the filing of applicant’s application or thereafter.

Response to Amendments
Applicant’s amendment to claim 41 is acknowledged.  The claim has been reviewed and entered, and is found to obviate the previously raised rejection under 35 U.S.C. 112, second paragraph.  The rejection of claim 41 is hereby withdrawn.
Applicant’s amendment to claims 34-54 is acknowledged.  The claims have been reviewed and entered, and are found to obviate the previously raised Double Patenting rejection over US patent 10,607,004.  This Double Patenting rejection is hereby withdrawn.

Response to Arguments
Applicant traverses the double patenting rejection over claims 1-20 of US patent 10,915,627 in light of the newly added amendments.  Examiner respectfully disagrees, as the claims in the instant application and the patent both relate to creating an output vector file after extracting features from a file.  The creation of multiple output vector files from multiple features found in multiple files, as amended in the claims of the instant application, is an obvious variation of the claims of the parent patent, as seen in the rejection section below.
Regarding rejection under 35 USC 101, the arguments filed on 8/25/2022 have been considered but are not persuasive.
	Applicant argues on page 9, section A, Revised Step 2A, that “humans cannot practically extract respective features from respective files of the plurality of files, the respective files in the second format, identify at least one respective group of contiguous characters in the respective features, or create a plurality of vector output files … including columns… including at least one number representative of an occurrence of the respective features without the aid of computers”.  First, Examiner reminds Applicant that although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  Applicant can give many examples, based on the specification, of what the features and formats of the files are, yet only the claimed language is subject to the broadest reasonable interpretation.  Second, a human can mentally and manually produce files of different formats that have columns that include different features (for example, files, in either printed or handwritten format, that contain images vs. files that contain math equations vs. files that contain text), count occurrences of certain features in the files (for example, how many images exist in each file), and produce a vector output file with columns according to the information obtained.
	Additionally, Applicant argues on page 13, in the section “Prong 2 of Revised Step 2A”, that “the Applicant’s specification confirms that claim 34 improves the functionality of a computer itself.  …Claim 34 is directed to a technique that improves the operation of a computer executing a feature engineering process by creating vector output files regardless of the format of the input files.”  Examiner notes that the claims recite no “practical” use of the vector output file; the claimed language states only that a vector output file is created.  As Examiner has noted above, merely creating this output file can be done mentally and manually by a human, and thus can be interpreted as an abstract idea.
	Finally, Applicant states on page 16, Section B, Step 2B, that “claim 34 of the instant application results in a technical improvement in vector output files suitable for machine learning algorithms.  Such technical improvement improves the operation of a computer by creating vector output files suitable for machine learning algorithms regardless of the type of input file”.  Examiner further notes that the claim language does not mention any machine learning algorithms, nor does it mention how these files can improve the operation of a computer (i.e. how malware detection is achieved when using the created vector output files).  Therefore, Examiner still believes that the claimed language does not amount to significantly more, since the claims only generally recite outputting files and do not actively and positively recite a significant action in which the vector output file(s) acts upon or is acted upon in order to achieve the improvements mentioned in the specification.  Therefore, the rejection under 35 USC 101 is hereby maintained.
Regarding rejection of claim 34 under 35 USC § 103, the arguments filed 8/25/2022 have been considered but are not persuasive to overcome the references on record: Dalessandro et al. (US 2015/0269495 A1, hereinafter “Dalessandro”) in view of Avasarala et al. (US 2016/0203318 A1, hereinafter “Avasarala”).
In claim 34:
Applicant states that the prior art of record does not teach or suggest “convert a plurality of files from respective first formats to a second format and extract respective features from respective files of the plurality of files, the respective files in the second format”.  Examiner notes that Dalessandro, in par 52-53, teaches a conversion of data from one type of format to another type of format.  Furthermore, in figure 3, Dalessandro suggests that the log data (i.e. files) is analyzed (i.e. parsed) and features are extracted.  These features are relevant sets of predefined features that become variables in machine learning algorithms (par 39).  According to par 52, Dalessandro teaches that the vectors of features or variables are converted to a format used by the machine learning algorithms, which is interpreted as the features being in the second format.  Therefore, Examiner still believes that the prior art of record teaches the limitations as claimed.  See rejection section below.
For claims 41 and 48: 
See response to arguments discussed for claim 34 above.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.   A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and  In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
Claims 34-54 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of US patent 10,915,627.  Although the claims at issue are not identical, they are not patentably distinct from each other because they are each drawn towards creating an output vector file after extracting features from a file.
Current Application, claim 34:
34. An apparatus comprising:
interface circuitry to receive a plurality of files from a plurality of devices different than the apparatus, the plurality of files having respective first formats;
machine readable instructions; and one or more processor circuits to execute the machine readable instructions to:
convert the plurality of files from the respective first formats to a second format;
extract respective features from respective files of the plurality of files, the respective files in the second format;
identify at least one respective group of contiguous characters in the respective features;
create a plurality of vector output files, respective vector output files including columns, respective columns including at least one number representative of an occurrence of the respective features; and
output the plurality of vector output files

Patent 10,915,627, claim 10:
10. An apparatus to improve an efficiency of a machine learning algorithm, the apparatus comprising:
a log file retriever to retrieve a log file in a first format, the log file containing behavior-related data to be analyzed by one or more machine learning algorithms;
an operation flow builder to: convert the log file from the first format to a second format; prior to machine learning algorithm application, improve machine learning modeling efficiency by distinguishing candidate malicious features from non-malicious features by extracting respective behavior-related features from the behavior-related data of the log file in the second format based on the respective ones of the behavior-related features matching one or more patterns corresponding to malware;
hash the extracted behavior-related features corresponding to malware; and
format input data for the machine learning algorithm by creating a vector output file of the hashed respective ones of the behavior-related features extracted from the behavior-related data corresponding to malware; and
a feature engineering system to improve the efficiency of the machine learning algorithm by transmitting the formatted vector output file to a system executing the machine learning algorithm

Comments:
The claim of the patent is an obvious variation of the claim of the current application because both claims create an output vector file after extracting features from a file.  Even though the current application recites a plurality of files with a plurality of features to create a plurality of vector files, these are obvious variations and obvious to implement, since it has been held that mere duplication of essential functions in a device involves only routine skill in the art.

Current Application: Claims 41 (article of manufacture) and 48 (method)
Patent 10,915,627: Claims 1 (article of manufacture) and 19 (method)
Comments: Same rationale as above.



Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 34-54 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claims 34, 41 and 48 of the instant application (Application No. 17/140,797) recite the limitations of “1) extracting, by executing string manipulation code with one or more processors, a feature from a file retrieved from a device, the file in a first format; 2) identifying, by executing one or more instructions with the one or more processors, at least one group of contiguous characters in the feature; 3) creating a vector output file including columns, respective ones of the columns including at least one indicia representative of an occurrence of the feature; and 4) outputting the vector output file”.  The limitations of (paraphrasing) 1) extracting a feature from a file, 2) identifying characters in the feature, 3) creating a file with columns that include the occurrences of the feature, and 4) creating an output file, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components.  For the limitations of claims 41 and 48, that is, other than reciting “device” and “processors”, nothing in the claim elements precludes the steps from practically being performed in the mind.  For example, in the context of these claims, a person can manually extract a feature from a file.  A person can also mentally identify characters in the feature.  A person can mentally and manually create a file with columns that categorize occurrences of such a feature, and then have a finalized product of a file.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.
In claim 34, that is, other than reciting “device” and “circuits/circuitry”, nothing in the claim elements precludes the steps from practically being performed in the mind, and thus the steps could be performed, under their broadest reasonable interpretation, in the mind, as discussed for claims 41 and 48 above.  Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application because claims 34, 41 and 48 recite only the additional elements of a device, processor(s), and circuitry/circuits to perform the functions recited in the claims, as mentioned above.  The processor performing those steps is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of extracting or receiving a feature of a file, identifying characters in the feature, and creating an output file with columns categorizing or identifying the occurrences of such feature) such that it amounts to no more than mere instructions to apply the exception using a generic computer component.  Accordingly, the additional element(s) do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.  Therefore, the claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the additional element(s) of device, processor(s), and circuitry/circuits to perform the claimed steps amount to no more than mere instructions to apply the exception using a generic computer component.  Specifically, the step of “output the vector output file” does not amount to significantly more, as it does not produce a significant outcome, specifically in the context of Security Systems.  The claims generally recite an outputting of a file, and thus do not actively and positively recite a significant action in which the particular instance acts upon or is acted upon in order for it to actively aid in early malware detection (Specification, par. [0009]); therefore, no significant action is performed.  Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.  The claims are not patent eligible.
Claims 35-40, 42-47, and 49-54 are rejected by virtue of dependency and because they do not obviate the above-recited deficiencies.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.
Claims 34-36, 38-43, 45-50, and 52-54 are rejected under 35 U.S.C. 103 as being unpatentable over Dalessandro et al. (US 2015/0269495 A1, hereinafter “Dalessandro”) in view of Avasarala et al. (US 2016/0203318 A1, hereinafter “Avasarala”).

Regarding claim 34, Dalessandro teaches:
34. (New) An apparatus comprising: 
interface circuitry to receive a plurality of files (par 36 – log data collection) from a plurality of devices different than the apparatus, the plurality of files having respective first formats;
machine readable instructions (par 8); and 
one or more processor circuits to execute the machine readable instructions (par 8) to: 
convert the plurality of files from the respective first formats to a second format (par 52 – mapping data from any unstructured or structured format to a format that machine learning algorithm can process);
extract respective features from respective files of the plurality of files, the respective files in the second format (par 36 – log data include elements, for example, events and other user data; see also par 52-53 – a vector of features is constructed for feature engineering process for machine learning; see also fig. 3, feature extractors 305, 307);
identify at least one respective group of contiguous characters in the respective features (par 40 – data is sampled by USS for features (shown in Fig. 2)); 
create a [plurality of] vector output files, respective vector output files including columns (Figs. 2-3, par 64 – final output is a structured vector of all features; Examiner notes that a “structured” vector of all features implies an organization format, such as columns.); and 
output the [plurality of] vector output files (Figs. 2-3, par 64 – final output is a structured vector of all features).
	Dalessandro discloses the claimed invention except for the creating and outputting of a plurality of vector output files.  However, it would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to have created multiple vector output files and output them, since it has been held that mere duplication of the essential working parts involves only routine skill in the art.
Also, Dalessandro teaches file formats (par 54, such as XML), yet Avasarala more explicitly suggests:
	the file in a first format (Avasarala: par 49, files in pdf format); and
	respective [columns including at least one] number representative of an occurrence of the respective features (Avasarala: par 60, features of the object are compiled in an EFV, which is a concatenation of a number of features).
	Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to have implemented a vector file output that includes a number representative of feature occurrences, as taught by Avasarala, in Dalessandro’s invention.  The motivation to do so would have been to provide practical classification of a large number of attributes for each feature (Avasarala: par 56).
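For illustration only, the sequence of steps recited in claim 34 (convert, extract, identify groups of contiguous characters, create and output vector files with occurrence counts) can be sketched as follows.  This is a minimal, hypothetical sketch and is not taken from the application, Dalessandro, or Avasarala.  The two input formats (JSON and key=value text), the window length of three characters, the CSV layout, and all function names are assumptions made for the example.

```python
import csv
import io
import json
from collections import Counter


def to_second_format(raw: str, first_format: str) -> dict:
    """Convert a file from its first format to a common dict (the assumed second format)."""
    if first_format == "json":
        return json.loads(raw)
    if first_format == "kv":  # hypothetical key=value text format
        return dict(line.split("=", 1) for line in raw.splitlines() if "=" in line)
    raise ValueError(f"unsupported format: {first_format}")


def extract_features(record: dict) -> list[str]:
    """Treat every string value of the converted record as a feature."""
    return [value for value in record.values() if isinstance(value, str)]


def contiguous_groups(feature: str, n: int = 3) -> list[str]:
    """Return every group of n contiguous characters in a feature."""
    return [feature[i:i + n] for i in range(len(feature) - n + 1)]


def vector_output(features: list[str], n: int = 3) -> str:
    """Build one vector output file as CSV text: one column per group, one row of counts."""
    counts = Counter(g for f in features for g in contiguous_groups(f, n))
    columns = sorted(counts)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerow([counts[c] for c in columns])
    return buf.getvalue()


# Two files in different first formats yield two vector output files.
files = [('{"api": "abab"}', "json"), ("api=open\ncmd=open", "kv")]
outputs = [
    vector_output(extract_features(to_second_format(raw, fmt)))
    for raw, fmt in files
]
```

In this sketch each input file produces its own output, so a plurality of input files produces a plurality of vector output files whose columns each hold a count of occurrences.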

	
Regarding claim 35, the combination of Dalessandro and Avasarala teaches:
35. (New) The apparatus of claim 34, wherein the plurality of files is representative of a plurality of potentially malicious files (Avasarala: par 38; i.e. malicious files).  

Regarding claim 36, the combination of Dalessandro and Avasarala teaches:
36. (New) The apparatus of claim 34, wherein the respective features are represented by respective portions of the respective files (Avasarala: par 60, features such as a pdf object) in the respective first formats (Avasarala: par 37, pdf file format), and the respective features include respective strings (Avasarala: par 49, i.e. file type as descriptive string).

Regarding claim 38, the combination of Dalessandro and Avasarala teaches:
38. (New) The apparatus of claim 34, wherein the one or more processor circuits are to execute the machine readable instructions to identify respective numbers of occurrences of a window of characters in the respective features (Avasarala: par 60: “EFV may be a concatenation of a number of feature vectors corresponding to different types of features”).  
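For illustration only, identifying the number of occurrences of a window of characters in each feature can be sketched as below.  The sketch is hypothetical and is not drawn from the application or the cited references.  The feature strings, the window "File", and the choice to count overlapping matches are assumptions.

```python
def window_occurrences(feature: str, window: str) -> int:
    """Count occurrences of window in feature, including overlapping ones."""
    if not window:
        return 0
    size = len(window)
    return sum(
        1
        for i in range(len(feature) - size + 1)
        if feature[i:i + size] == window
    )


# Hypothetical feature strings; one count is produced per feature.
features = ["CreateFileA", "CreateProcess", "ReadFile"]
counts = {f: window_occurrences(f, "File") for f in features}
```

Sliding the window one character at a time counts overlapping matches, which the built-in `str.count` does not: "aaaa" contains the window "aa" three times under this sketch, while `"aaaa".count("aa")` returns 2.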

Regarding claim 39, the combination of Dalessandro and Avasarala teaches:
39. (New) The apparatus of claim 34, wherein the respective columns of the respective vector output files correspond to unique features (Avasarala: par 60: “EFV may be a concatenation of a number of feature vectors corresponding to different types of features”).  

Regarding claim 40, the combination of Dalessandro and Avasarala teaches:
40. (New) The apparatus of claim 34, wherein the plurality of devices are first devices (Avasarala: par 62, fig. 4A-4B, two components), and 
the one or more processor circuits are to execute the machine readable instructions to output the plurality of vector output files to a machine learning algorithm executed by a second device (Avasarala: par 62, fig. 4A-4B, two components to the machine-learning system).  

Regarding claim 41, the claim limitations are set forth and rejected as discussed in claim 34 above.  Furthermore, the combination of Dalessandro and Avasarala teaches the additional limitations as follows:
41. (New) A non-transitory computer readable medium comprising instructions that, when executed, cause one or more processors to (Dalessandro: par 8)….  

Regarding claim 42 and claim 49, the claim limitations are set forth and rejected as discussed in claim 35 above.

Regarding claim 43 and claim 50, the claim limitations are set forth and rejected as discussed in claim 36 above.

Regarding claim 45 and claim 52, the claim limitations are set forth and rejected as discussed in claim 38 above.

Regarding claim 46 and claim 53, the claim limitations are set forth and rejected as discussed in claim 39 above.

Regarding claim 47 and claim 54, the claim limitations are set forth and rejected as discussed in claim 40 above.

Regarding claim 48, the claim limitations are set forth and rejected as discussed in claim 34 above.  Furthermore, the combination of Dalessandro and Avasarala teaches the additional limitation (emphasis added) as follows:
48. (New) A method comprising: [extracting], by executing string manipulation code with the one or more processors... (Avasarala: fig. 5, par 72, i.e. attribute-relation file format, which categorizes by attributes (numeric, string, etc.)).

Claims 37, 44, and 51 are rejected under 35 U.S.C. 103 as being unpatentable over Dalessandro et al. (US 2015/0269495 A1, hereinafter “Dalessandro”) in view of Avasarala et al. (US 2016/0203318 A1, hereinafter “Avasarala”) in further view of Chang et al. (US 8,463,591 B1, hereinafter “Chang”).


Regarding claim 37, the combination of Dalessandro and Avasarala teaches:
37. (New) The apparatus of claim 34, wherein the plurality of vector output files includes a plurality of feature vectors representative of a plurality of potentially malicious files (Avasarala: par 52, file type is categorized as malicious).
Dalessandro and Avasarala do not teach, yet Chang suggests:
the respective potentially malicious files identified by respective hash values (Chang: col. 4 lines 25-32: “The index is determined from a hash table, e.g., associative array, using the indices of the corresponding elements in the feature vector x.”).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to have implemented a mechanism to identify the malicious files with a hash value, as taught by Chang, to Dalessandro and Avasarala’s invention.  The motivation to do so would have been to track the features that are being combined in the vectors (Chang: col. 4 lines 38-54).
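For illustration only, identifying each file's feature vector by a hash value can be sketched as below.  The sketch is hypothetical: neither the claims nor Chang name a hash function, so the use of SHA-256 over the file contents is an assumption, as are the sample contents and vectors.

```python
import hashlib


def file_id(content: bytes) -> str:
    """Return a hex digest that identifies a file by its contents (SHA-256 assumed)."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical samples: (file contents, feature vector of occurrence counts).
samples = [
    (b"sample one", [1, 0, 2]),
    (b"sample two", [0, 3, 1]),
]

# Associative array mapping each file's hash value to its feature vector.
vectors_by_hash = {file_id(content): vector for content, vector in samples}
```

Keying the vectors by a content hash lets the same file be recognized wherever it is received, since identical contents always produce the same identifier.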

Regarding claim 44 and claim 51, the claim limitations are set forth and rejected as discussed in claim 37 above.

Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LIZBETH TORRES-DIAZ whose telephone number is 571-272-37391787.  The examiner can normally be reached on 9:00a-4:30p.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Farid Homayounmehr, can be reached on 571-272-3739.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LIZBETH TORRES-DIAZ/
Primary Examiner, Art Unit 2495
September 26, 2022