DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1 to 21 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.  The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventors, at the time the application was filed, had possession of the claimed invention.
Independent claims 1 and 11 to 12 set forth a limitation directed to “corresponding to a word of a natural language”, which is misdescriptive of the invention.  Applicants’ Specification, ¶[008] - ¶[009], does describe furthering the goal of anomaly detection using NeLP and tools from the natural language processing discipline, and describes a language-based approach to anomaly detection (i.e., not a computer language), but the Specification is directed to a ‘network language’ as illustrated in Figure 3.  This is a somewhat important distinction because one of Applicants’ main arguments against the current rejection is that Song et al. (U.S. Patent Publication 2009/0254501) teaches extracting information of a word in a natural language, whereas the claim language is directed to a machine-to-machine communication that is not a natural language, so that the combination is not proper because this reference involves ‘something else entirely’ and belongs to a different technical field.  Applicants’ invention may be utilizing techniques that are analogous to those used in natural language processing in an application to network language processing, but it is misdescriptive to claim that their n-grams are “corresponding to a word of a natural language”.
Independent claims 1 and 11 to 12 set forth a limitation of “a unitary element”, which is new matter.  Applicants’ Specification, as originally filed, does not provide support with any written description of this terminology of “a unitary element”.  Conceivably, there may be some significance to a limitation of “a unitary element”, but it is not clear what this may be because this is not described in the originally-filed Specification.  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 to 2, 5 to 13, and 16 to 21 are rejected under 35 U.S.C. 103 as being unpatentable over Mekky et al. (U.S. Patent No. 10,038,706) in view of Song et al. (U.S. Patent Publication 2009/0254501).
Concerning independent claims 1 and 11 to 12, Mekky et al. discloses a method, system, and computer program product for classifying malware based on traffic data, comprising:
“receiving communication data that is being communicated between two machines using a communication protocol wherein for at least one of the two machines, the communication data relates to functionality of a mission-critical environment, the communication data being comprised of at least one formatted data unit, each of the at least one formatted data unit comprising at least one information element” – classifying malware facilitates in identifying an appropriate process for containing, removing, or neutralizing the effects of the malware (column 1, lines 23 to 26); a process can begin when a computing device receives traffic data (“receiving communication data”); the traffic data can be from one or more devices that are known or suspected to have executed an unidentified malware application; this traffic data can include events corresponding to the unidentified malware application; traffic data can be received from 
 “constructing at least one N-gram from the received communication data, wherein the at least one N-gram is at least one of the at least one information elements which is extracted from at least one of the at least one formatted data unit of the received communication data, wherein each extracted at least one information elements is treated in constructing the at least one N-gram as a unitary element corresponding to a word [of a natural language]” – traffic data is converted into a feature vector by e.g., 1-gram to 5-grams (“constructing at least one N-gram from the received communication data”), and a feature vector is constructed based on the n-gram analysis (column 3, lines 33 to 49: Figure 1: Step 110); sample traffic data and sample background analysis can be performed on words to construct the sample feature vectors; then, the top discriminating features, i.e., n-gram features, between the malware family and the background traffic can be selected (column 4, lines 22 to 28); a sequence of words ‘w1w2w3w4’ can be used to generate bi-grams {(w1 w2), (w1 w3), (w2 w3), (w2 w4), (w3 w4)} (“wherein the at least one N-gram is at least one of the at least one information elements which is extracted from at least one of the at least one formatted data unit of the received communication data”) (column 4, lines 37 to 43); feature vector 320 represents mixed traffic data that includes events from malware in a malware family and background noise; feature vector 320 can be created by converting the events in received traffic data to words and performing an n-gram analysis on the words to generate feature vector 320 (column 7, lines 45 to 50); broadly, each of bi-grams (w1 w2), (w1 w3), (w2 w3), (w2 w4), (w3 w4) is “a unitary element corresponding to a word” because these bi-grams are generated from a sequence of words ‘w1w2w3w4’;

Concerning independent claims 1 and 11 to 12, Mekky et al. discloses a basic idea of using n-gram analysis to detect malware from formatted network packages in data traffic having a communication protocol by comparing feature vectors constructed from n-grams to feature vectors of malware samples of malware families.  Implicitly, detecting and removing malware for any organization is a priority that is “mission-critical”.  Mekky et al. converts network packages captured using a packet capture program to output a score based on a similarity between a malware feature vector and malware samples “to identify conditional probabilities of certain characteristics”, i.e., that traffic data contains malware, at least because these scores are produced using “a conditional probability” of a conditional entropy H(X|Y), which represents an uncertainty in X under a condition of an observation of Y.  That is, H(X|Y) is known to those skilled in the art as a ‘conditional probability’, where X is conditioned on the observation of Y.  Mekky et al. does not expressly disclose the limitations of an N-gram as a unitary element corresponding to a word “of a natural language” and “generating anomaly-detection rules with regard to the communication protocol being communicated between the two machines based on the N-gram analysis.”  Still, Mekky et al. does disclose that a signal processing algorithm for a specific malware family can be generated by mixing top discriminating features to build a model for a specific malware family.  (Column 4, Lines 29 to 36)  Here, malware in network traffic data is an ‘anomaly’.  Mekky et al., then, can be construed to disclose generating a model for anomaly detection with 
Concerning independent claims 1 and 11 to 12, Song et al. teaches whatever limitations might be omitted by Mekky et al. as directed to using n-gram analysis “corresponding to a word of a natural language” and “generating anomaly-detection rules . . . based on the N-gram analysis.”  Error correction rules are created by applying probability information to a corpus of incorrect words.  (Abstract)  Applicants’ Specification, ¶[009], describes the invention as employing network language processing (NeLP) using tools from natural language processing (NLP) to solve an anomaly detection problem.  Generally, Song et al. teaches creating error correction rules using n-grams and a conditional probability.  Error correction rule generator 124 produces an error correction rule by using errors found in a corpus of first-spaced words.  Error correction rule generator 124 creates candidate rules for an error correction rule, and selects an error correction rule from among candidate rules. (¶[0035]: Figure 1)  Implicitly, Song et al. is directed to natural language processing because error correction is applied to words in a human language (“corresponding to a word of a natural language”), e.g., ‘A father enters a room’.  (¶[0055]: Figure 4A)  A probability model calculates a conditional probability of a specified output when a variety of inputs are given (“to identify conditional probabilities of certain characteristics”), and probability information 122 learns a conditional random field (CRF) probability model.  (¶[0065]: Figure 1)  Error correction rule generator 124 creates an error correction rule by using a corpus of first-spaced words.  An error correction rule is extracted as an n-gram of 2 or more size.  (¶[0075] - ¶[0076]: Figure 3: Step S15)  After confidence scores Song et al., then, teaches whatever limitations are omitted by Mekky et al. as directed to “N-grams . . . corresponding to a word of a natural language” and “generating anomaly-detection rules . . . based on the N-gram analysis” because errors are ‘anomalies’ and error correction rules are analogous to anomaly-detection rules.  An objective is to acquire probability information of an n-gram model using learning data that can obtain more reliable probability that may be applied in a mobile device with lower computing power.  (¶[0009])  It would have been obvious to one having ordinary skill in the art to generate analogous rules obtained from natural language processing as taught by Song et al. to perform n-gram analysis on features of malware as disclosed by Mekky et al. for a purpose of acquiring more reliable probability information of an n-gram model using learning data that can be applied in a mobile device with lower computing power.
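For illustration only, and not as part of the grounds of rejection, the conditional entropy H(X|Y) cited above for Mekky et al. can be sketched as follows.  The sketch uses the standard textbook definition, H(X|Y) = -Σ p(x, y) log2 p(x|y), which is a function of the conditional probabilities p(x|y).  The function name, the labels, and the sample observations are assumptions made for this example and are not taken from either reference.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """Estimate H(X|Y) in bits from a list of observed (x, y) pairs.

    Standard definition: H(X|Y) = -sum over (x, y) of p(x, y) * log2(p(x|y)),
    where p(x|y) = p(x, y) / p(y).
    """
    joint = Counter(pairs)                     # counts of each (x, y) pair
    marginal_y = Counter(y for _, y in pairs)  # counts of each y alone
    total = len(pairs)
    entropy = 0.0
    for (x, y), count in joint.items():
        p_xy = count / total                   # joint probability p(x, y)
        p_x_given_y = count / marginal_y[y]    # conditional probability p(x|y)
        entropy -= p_xy * math.log2(p_x_given_y)
    return entropy


# Hypothetical observations: X is a traffic label, Y is an observed n-gram feature.
observations = [
    ("malware", "f1"),
    ("malware", "f1"),
    ("benign", "f2"),
    ("malware", "f2"),
]
uncertainty = conditional_entropy(observations)
```

In this hypothetical data, feature f1 always accompanies the same label and so contributes no uncertainty, while feature f2 is split evenly between two labels and accounts for all of the remaining uncertainty in X.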

Concerning claims 2 and 13, Mekky et al. discloses that a process begins when a computing device receives traffic data from one or more devices.  Traffic data can include events corresponding to an unidentified malware application.  (Column 2, Line 65 to Column 3, Line 10: Figure 1)  Here, traffic data that is received at one computing device from one or more devices represents “receiving communication data . . . and processing a communication destined for an application on at least one of the two 
Concerning claims 5 and 16, Mekky et al. discloses converting events from traffic data into words and performing n-gram analysis on the words (“constructing at least one N-gram from at least one of content information . . .”).  (Column 1, Lines 57 to 63)  Similarly, Song et al. teaches “constructing at least one N-gram from the received communication data includes constructing at least one N-gram from at least one of content information . . . .”  Broadly, text of words is “content information” and unigram or n-gram models are used to extract features.
Concerning claims 6 to 7 and 17 to 18, Mekky et al. discloses that a sequence of words can be decomposed into n-grams of one to five letters, i.e., 1-gram (“constructing a unigram”) to 5-grams.  (Column 3, Lines 44 to 49: Figure 1: Step 110)  A sequence of words ‘w1w2w3w4’ can be used to generate bi-grams (“wherein constructing at least one N-gram from the received communication data includes constructing a bigram”) (w1 w2), (w1 w3), (w2 w3), (w2 w4), (w3 w4).  (Column 4, Lines 37 to 43: Figure 1)  Similarly, Song et al. teaches that, preferably, probability information generator 122 uses a 1-gram model, or unigram model to extract features from a corpus of correct words (¶[0032]: Figure 1); features are extracted from a corpus of correct words M1, where the extraction of features may use a unigram model; probability information generator 122 extracts a unigram feature (¶[0054] - ¶[0055]: Figure 3: Step S11); instead of using a unigram model, a 2-gram model or a 3-gram model may alternatively be used for extracting features (¶[0058]: Figures 3 and 4); an error correction rule is 
Concerning claims 8 and 19, Song et al. teaches that probability information generator 122 creates probability information by applying the extracted features and a probability model to a corpus of incorrect words M2 from which all spaces between words of a corpus of correct words M1 are removed (¶[0033]: Figure 1); instead of using a unigram model, a 2-gram model or a 3-gram model may alternatively be used for extracting features (¶[0058]: Figures 3 and 4); an error correction rule is extracted as an n-gram of 2 or more size (¶[0076]); error correction rule generator 124 extracts sentences from a corpus of first-spaced words and from a corpus of correct words M1, and compares the extracted sentences to each other (¶[0078]: Figure 7: Steps S151 to S152); error correction rule generator 124 creates error correction candidate rules of n-grams of 2 or more size; error correction rule generator 124 extracts four rule patterns (“based on patterns observed in analyzing the repository bigrams”) (¶[0080]: Figure 7: Step S154).  Here, Song et al. generates error correction rules from conditional probabilities obtained by comparing features of a corpus of correct words to a corpus of incorrect words, and these features in one embodiment may be 2-grams (“comparing the constructed bigram with a repository of bigrams”). 
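For illustration only, the claim language of “comparing the constructed bigram with a repository of bigrams” can be sketched as a lookup against counted reference bigrams.  This is a simplified sketch and is not the conditional random field method of Song et al.; the function names, the threshold parameter, and the sample sentences are assumptions made for this example.

```python
from collections import Counter


def build_repository(sequences):
    """Count adjacent-element bigrams over a collection of reference sequences."""
    repository = Counter()
    for seq in sequences:
        repository.update(zip(seq, seq[1:]))
    return repository


def unseen_bigrams(seq, repository, min_count=1):
    """Return the bigrams of seq that occur fewer than min_count times in the repository."""
    return [bigram for bigram in zip(seq, seq[1:]) if repository[bigram] < min_count]


# Hypothetical reference corpus and a sequence to be checked against it.
reference = [
    ["A", "father", "enters", "a", "room"],
    ["a", "father", "enters"],
]
repository = build_repository(reference)
flagged = unseen_bigrams(["a", "room", "enters"], repository)
```

Here the bigram (a, room) is found in the repository and passes, while (room, enters) was never observed and is flagged, which is the kind of observed pattern from which a rule could then be derived.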
Concerning claims 9 and 20, Mekky et al. discloses that a sequence of words can be decomposed into n-grams of one to five letters, i.e., 1-gram to 5-grams (“a higher order N-gram”).  (Column 3, Lines 44 to 49: Figure 1: Step 110)  Similarly, Song et al. teaches an error correction rule is extracted as an n-gram of 2 or more size (¶[0076]); error correction rule generator 124 creates error correction candidate rules of n-grams of 2 or more size.  (¶[0080]: Figure 7: Step S154)  Here, an n-gram of more than 2 is “a higher-order N-gram.”
Concerning claims 10 and 21, Mekky et al. discloses that a sequence of words can be decomposed into n-grams of one to five letters, i.e., 1-gram (“at least one of unigrams”) to 5-grams (“at least one of . . . higher-order N-grams”).  (Column 3, Lines 44 to 49: Figure 1: Step 110)  A sequence of words ‘w1w2w3w4’ can be used to generate bi-grams (“at least one of . . . bigrams”) (w1 w2), (w1 w3), (w2 w3), (w2 w4), (w3 w4).  (Column 4, Lines 37 to 43: Figure 1)  Similarly, Song et al. teaches that, preferably, probability information generator 122 uses a 1-gram model, or unigram model to extract features from a corpus of correct words (¶[0032]: Figure 1); features are extracted from a corpus of correct words M1, where the extraction of features may use a unigram model; probability information generator 122 extracts a unigram feature (¶[0054] - ¶[0055]: Figure 3: Step S11); instead of using a unigram model, a 2-gram model or a 3-gram model may alternatively be used for extracting features (¶[0058]: Figures 3 and 4); a denotation Fl,m means a unigram feature at a specific point for determining word-spacing information Sl,m (¶[0066]: Equation (1)); an error correction rule is extracted as an n-gram of 2 or more size (¶[0076]); error correction rule generator 124 creates error correction candidate rules of n-grams of 2 or more size (¶[0080]: Figure 7: Step S154).  Here, a 2-gram model is “a bigram” and an n-gram of more than 2 is “a higher-order N-gram.”  Song et al., then, discloses analyzing n-grams including “unigrams”, “bigrams”, and “higher-order N-grams”.   
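For illustration only, the unigram, bigram, and higher-order n-gram constructions discussed above can be sketched as follows.  The five bigrams recited for the sequence ‘w1w2w3w4’ are reproduced here by allowing at most one intervening word between the two members of a pair; that skip allowance is an assumption inferred from the recited pairs, and the function names are hypothetical rather than taken from Mekky et al.

```python
def contiguous_ngrams(words, n):
    """Return all contiguous n-grams of a word sequence as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def skip_bigrams(words, max_skip=1):
    """Return ordered word pairs separated by at most max_skip intervening words."""
    pairs = []
    for i in range(len(words)):
        # j ranges over the positions reachable from i with 0..max_skip words skipped
        for j in range(i + 1, min(i + 2 + max_skip, len(words))):
            pairs.append((words[i], words[j]))
    return pairs


words = ["w1", "w2", "w3", "w4"]
unigrams = contiguous_ngrams(words, 1)   # 1-grams
trigrams = contiguous_ngrams(words, 3)   # one example of a higher-order n-gram
bigrams = skip_bigrams(words, max_skip=1)
```

With max_skip=1, the result is (w1 w2), (w1 w3), (w2 w3), (w2 w4), (w3 w4), matching the pairs cited from Column 4, Lines 37 to 43.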

Claims 3 to 4 and 14 to 15 are rejected under 35 U.S.C. 103 as being unpatentable over Mekky et al. (U.S. Patent No. 10,038,706) in view of Song et al. (U.S. Patent Publication 2009/0254501) as applied to claims 1 and 12 above, and further in view of Andress et al. (U.S. Patent Publication 2007/0174469).
Mekky et al. teaches “intercepting . . . a communication between two machines”, but omits “duplicating a communication”, “processing the duplicate communication”, and “returning the original communication to the data stream” of claims 3 and 14.  Similarly, Mekky et al. omits “receiving a communication data log, containing records of at least one communication, from a plug-in” of claims 4 and 15.  However, Andress et al. teaches intercepting communications between a client and a service, where a proxy invokes an interceptor plug-in that is plugged into the proxy.  (Abstract)  One prior art embodiment for intercepting IP data traffic is to log all IP datagrams of several user sessions at specific interception points (“receiving a communication data log”), and to perform filtering analysis in order to regenerate a complete user session.  (¶[0006])  An incoming or outgoing call for a certain telephone number is intercepted at a switch, and the switch duplicates the communication content (“duplicating a communication between two machines”).  The transmission between caller and callee is transferred to a law enforcement agency via a mediation device.  (¶[0007])  A request and response are stored on a message queue, where a message queue is an interceptor plug-in, or wherein the request and the response are stored on the interceptor plug-in.  The request and response are transferred from the message queue or from the interceptor plug-in to an interceptor manager (“receiving a communication log, containing records of Andress et al., then, teaches these limitations of “duplicating a communication” and “a plug-in” that stores information of “a communication data log”.  
Implicitly, if a message is duplicated, then an original message continues to a recipient, which is equivalent to “returning the original communication to the data stream.”  An objective is to enable interception of a customer’s communication for law enforcement agencies, and to provide an improved method for intercepting data traffic.  (¶[0002] and ¶[0011])  It would have been obvious to one having ordinary skill in the art to provide a plug-in for interception of logged data and to duplicate a communication as taught by Andress et al. to determine a score for malware events in traffic data according to a communication protocol of Mekky et al. for a purpose of providing an improved method for intercepting data traffic for law enforcement agencies.
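For illustration only, the relationship between “duplicating a communication”, “processing the duplicate communication”, and “returning the original communication to the data stream” can be sketched as a pass-through tap.  The function name, the queue, and the sample messages are assumptions made for this example and are not taken from Andress et al. or Mekky et al.

```python
import copy
from collections import deque


def tap(stream, analysis_queue):
    """Yield each original message unchanged while queuing a deep copy for analysis."""
    for message in stream:
        analysis_queue.append(copy.deepcopy(message))  # the duplicate goes to analysis
        yield message                                  # the original continues downstream


# Hypothetical messages exchanged between two machines.
analysis_queue = deque()
incoming = [
    {"seq": 1, "payload": "GET /status"},
    {"seq": 2, "payload": "SET valve=open"},
]
forwarded = list(tap(incoming, analysis_queue))
```

The forwarded messages are the same objects that entered the tap, while the queued duplicates are independent copies that can be processed without affecting the original data stream.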

Response to Arguments
Applicants’ arguments filed 11 February 2022 have been considered but are moot in view of new grounds of rejection as necessitated by amendment.
Applicants provide some significant amendments to independent claims 1 and 11 to 12, and include arguments directed against the prior rejection of these independent claims as being obvious under 35 U.S.C. §103 over Song et al. (U.S. Patent Publication 2009/0254501) in view of Allouche et al. (U.S. Patent Publication 2017/0200323).  These significant amendments include extensive revision of the independent claims directed to new limitations of receiving communication data “that is being communicated 
Applicants’ amendments overcome the objections to the drawings and the Specification.  Applicants’ Replacement Sheets for the drawings are being approved.
Applicants’ amendment raises new issues under 35 U.S.C. §112(a) for the limitations of constructing at least one N-gram “as a unitary element corresponding to a word of a natural language”.  Applicants’ limitation of “a word of a natural language” is misdescriptive of the invention.  Here, Applicants’ words are not words of “a natural language”.  Specifically, “a natural language” is defined as a human language as it is spoken or written and does not encompass a machine language that is understood only by computers.  Applicants’ embodiments as expressed by the claim language of “data 
New grounds of rejection are set forth as directed to independent claims 1 and 11 to 12 being obvious under 35 U.S.C. §103 over Mekky et al. (U.S. Patent No. 10,038,706) in view of Song et al. (U.S. Patent Publication 2009/0254501).  The rejection no longer relies upon Allouche et al., but Mekky et al. is maintained to disclose the new limitations directed to “receiving communication data that is being communicated between two machines using a communication protocol . . . the communication data being comprised of at least one formatted data unit”.  Here, Mekky et al. is directed to detecting malware in communications between two computing devices that are communicating according to a communication protocol using network packages.  (Column 2, Line 65 to Column 3, Line 44)  Specifically, Mekky et al. discloses a communication protocol for data traffic that is ‘formatted’ as network Mekky et al. uses an n-gram analysis to detect and classify malware in communications between two computers.  Mekky et al. compares feature vectors derived from the n-grams to a feature vector for malware samples to output a classification score for traffic data.  (Column 4, Lines 19 to 56)  Broadly, detection and classification of malware “relates to a functionality of a mission-critical environment” because it is critical to a mission of every organization to detect and remove malware.  The rejection of some dependent claims continues to rely upon Andress et al. (U.S. Patent Publication 2007/0174469).
Mainly, Applicants’ arguments are moot.  Applicants cite the new limitations, and argue that the prior references do not teach all of the new elements, and argue that there is a lack of motivation to make a combination to support a proper rejection.  Specifically, Applicants argue that Song et al. is directed to words that are entered by the user, and that this is not equivalent to communication data as set forth by the claim language to make an equivalence.  Applicants point out that the new claim language requires communication data that is being communicated between two machines using a communication protocol and relates to functionality of a mission critical environment, and that text being typed on an electronic device does not relate to a mission-critical environment.  Moreover, Applicants allege that only techniques applicable to a natural language are disclosed by Song et al., but that their claim language includes an element corresponding to a machine-to-machine communication, and not a word of a natural language.  Applicants maintain that equating a space that was not typed by a user is not Song et al. is directed to something else entirely in an entirely different technical field, and this has nothing to do with Allouche et al.
These arguments are being considered to the extent that they are relevant to the new grounds of rejection under 35 U.S.C. §103 over Mekky et al. in view of Song et al.  The examiner has already described herein how the new limitations of “communication data that is being communicated between two machines using a communication protocol . . . the communication data being comprised of at least one formatted data unit” are expressly disclosed by Mekky et al.  Similarly, “a functionality of a mission-critical environment” is implicit in malware detection and classification of Mekky et al.  The only significant remaining points raised by Applicants are whether the references are properly combinable in a rejection, and whether the references disclose and teach “generating anomaly-detection rules”.
Mekky et al. and Song et al. are analogous art references as both involve using n-gram analysis to make a determination of some undesirable condition.  Mekky et al.’s undesirable condition is a presence of malware and Song et al.’s undesirable condition is that a word is incorrect because it does not have proper spacing.  However, Applicants’ Specification, ¶[008] - ¶[009], provides a rationale for connection of natural language processing to network language processing because it states that tools from natural language processing can be used in network language processing.  Applicants appear to admit that Song et al. is directed to natural language processing.  The examiner maintains that network language processing with n-grams is disclosed by Mekky et al.  This combination is not based on hindsight.  It must be recognized that In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).  Here, it would be obvious to combine Mekky et al. and Song et al. because Applicants’ claim amendments expressly require words in a natural language, and these references both relate to n-gram processing.  Applicants’ invention can be understood under rationales of (A) Combining prior art elements according to known methods to yield predictable results; (C) Use of known technique to improve similar devices (methods, or products) in the same way; or (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results.  KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 418, 82 USPQ2d 1385, 1396 (2007).  That is, one skilled in the art could understand that the claimed invention merely represents applying a known technique of generating rules, as taught by Song et al., to a known technique of detecting anomalies of malware using n-grams, as disclosed by Mekky et al., in a predictable way.
Applicants’ amendment necessitates these new grounds of rejection under 35 U.S.C. §112(a) and under 35 U.S.C. §103 over Mekky et al. (U.S. Patent No. 10,038,706) in view of Song et al. (U.S. Patent Publication 2009/0254501).  Mainly, Applicants’ arguments are moot in light of these new grounds of rejection.  Accordingly, this rejection is properly FINAL.



Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Fan et al. discloses prior art directed to generating a rule set for anomaly detection using adaptive learning for intrusion detection.
Gomez et al. discloses detection and classification of anomalous or unwanted objects in malware detection.
Bhat et al. discloses performing analytics on data to identify a set of edge device rules.
Applicants’ amendment necessitated the new grounds of rejection presented in this Office Action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP §706.07(a).  Applicants are reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARTIN LERNER/
Primary Examiner
Art Unit 2657
February 25, 2022