Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action
This Office action is in response to application 16/596,139, filed on 10/08/2019. Claims 1-17 and 21 are pending in this communication. Claims 18-20 have been withdrawn following the restriction/election requirement of 05/17/2021.

Priority
This application claims priority from Australian application 2018247212, filed 10/09/2018. The priority date has been accepted.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/30/2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Examiner’s Note
The Examiner used figures, paragraph and line numbers from the instant application’s pre-grant publication or pdf copy of allowance. In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 
Generally, italicized text is claim language; bold text is a reference citation (with some obvious exceptions); text that is neither italicized nor bolded is the examiner's own.

Claim Rejections - 35 USC § 103
The following is a quotation of AIA  35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 13, 16 and 21 are rejected under AIA 35 U.S.C. 103 as being unpatentable over EDWARDS; Jonathan L. et al., Pub. No.: US 2016/0180087 A1 in view of MORE; Ajinkya, Pub. No.: US 2016/0379289 A1.

Regarding Claim 1, EDWARDS discloses a method, for automatically creating a honeyfile for a file system, comprising the steps of:
surveying a file set of the file system to identify tokenisable data in the file set {Fig. 1 element 120, 122 & [0030], “the memory 110 may store data files 120 (or other resources). The data files 120 may include files that can be used by one or more of the applications 118. The data files 120 may be organized in a file library (or database) 122. The file library 122 may include folders 124 including collections (or sets) of data files 120. For example, in an illustrative embodiment, the file library 122 may include a "My Documents" folder 124a … for holding general data files 120 for a user … The file library 122 may include a system level folder 124e (e.g., C:\Windows) holding system registry data files 120 (e.g., registry data (DAT) files (*.dat file), registration entry (REG) files (*.reg files), and/or the like)”};
tokenising the identified tokenisable data {[0035], “the honey tokens 138 may include pre-stored honey tokens 138 and/or may include dynamically generated honey tokens 138 (e.g., honey tokens 138 generated by the anti-malware application 118a in response to a corresponding request)”} … ; 
…
applying a substitution method to substitute the tokens of the exemplar token sequence or signature with replacement tokenisable data {Figs. 2, 4A-4C & [0053], “one or more honey tokens 138 can be interspersed (or scattered) within a sequence of the honey token data 208 provided in response to an access request 200. If, … the user's “My Pictures” file folder 124c includes ten real JPEG files 120, then the honey token data 208 provided in response to the access request 200c may include an enumerated sequence of data including a first false JPEG file (e.g., a first honey token 138) followed by three of the real JPEG files 120, a second false JPEG file (e.g., a second honey token 138) followed by four of the real JPEG files 120, and a third false JPEG file (e.g., a third honey token 138) followed by the last three of the real JPEG files 120. That is, the enumerated sequence of data provided in response to the access request 200c may include the following sequence: HT-R-R-R-HT-R-R-R-R-HT-R-R-R, where “HT” represents a honey token image file and “R” represents a real image file”. Examiner’s note: the cited positioning pattern of honey tokens (HT) and real tokens (R) is a substitution method for a surveyed/identified data set. The cited HT/R positioning sequence is an ‘exemplar token sequence’, which is a signature, in a honeyfile to be used to decoy malware};
wherein the substitution method includes comparing the attribute(s) of an exemplar token with the attribute(s) of the replacement tokenisable data {Fig. 3 & [0055], “the honey tokens 138 can have one or more characteristics (e.g., a type, a name, and/or data) that are expected to entice or bait a malicious process 202 into performing suspicious or malicious activity on the honey tokens 138”. … [0049], “a comparison of the requesting process 202 to the elements listed”. Examiner’s note: the cited ‘characteristics (e.g., a type, a name, and/or data)’ are attributes}; and packaging the replacement tokenisable data into a honeyfile {Fig. 2 element 208 – ‘honey token data’ & [0035], “dynamically generated honey tokens 138 (e.g., honey tokens 138 generated by the anti-malware application 118a in response to a corresponding request)”. Examiner’s note: the package of honey decoy data is created in the cited portion of paragraph [0035]}.
EDWARDS, however, does not explicitly disclose
… to form a plurality of token sequences;
wherein each token in a token sequence has a token tag which identifies an attribute of the token and each token includes a token string which comprises tokenisable data;
and a token sequence is represented by a sequence of token tags, called a signature; and
either:
selecting one of the plurality of token sequences or signatures; or generating a token sequence or signature to operate as an exemplar token sequence;
In an analogous reference MORE discloses
… to form a plurality of token sequences {ABS., “The product titles are separated into a sequence of tokens, the tokens being determined by the presence of a separator character. The sequences of tokens are labeled according to a specific encoding scheme to denote attributes of a title, such as brand name and other features”};
wherein each token in a token sequence has a token tag which identifies an attribute of the token and each token includes a token string which comprises tokenisable data {[0018], “a method might comprise: receiving a title for a product; dividing the title into a sequence of tokens; encoding each token of the sequence of tokens to indicate a label for each token, each token having an associated label; determining a type of each token of the sequence of tokens based on the label associated with each token of the sequence of tokens”. Examiner’s note: a sequence of tokens is a token string};
and a token sequence is represented by a sequence of token tags, called a signature {[0070], “The process of assigning a label to each token in a sequence can be termed “sequence labeling.” For example, as described above, sequence labeling can refer to assigning the labels “B-brand”, “I-brand”, and “O-labels” to each token in a sequence (such as a title). An input sequence X can comprise multiple tokens x1. . . xm. A label sequence Y can comprise multiple elements y1. . . ym. Each token xj has an associated label yj”. … ABS. “a database entry can be made to associate the attribute with the item. A training set can be used to initialize the learning mode”. Examiner’s note: labeled sequences X and Y are signatures}; and either:
selecting one of the plurality of token sequences or signatures; or generating a token sequence or signature to operate as an exemplar token sequence {[0018], “determining an attribute from each token of the sequence of tokens using the label for each token of the sequence of tokens; normalizing the attribute to create standardized representations of the attributes; writing the attributes to database entries associated with the product”};
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify EDWARDS’ technique of ‘generating a honey trap system for a malware intruder by identifying target data and tokenizing the object data in a decoy file to create an algorithm for malware prevention’ with ‘a technique of tokenizing target data with labels and creating a sequence of tokens to use as a signature for further data processing’, as taught by MORE, in order to create a malware identification and remediation technique. The motivation for using a honey trap is to identify a malware intrusion by exchanging data with the intruder, so that a malware remediation plan can be created from the intruder’s cyber-attack intentions using exchanged data of which the intruder is unaware. A honeypot is a controlled and safe environment for showing how attackers work and examining different types of threats. With a honeypot, security staff are not distracted by real traffic on the network and can focus entirely on the threat. Honeypots can also catch internal threats.
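
For illustration only, the claim 1 steps discussed above (tokenising, representing a token sequence by its tag signature, and substituting tokens with replacement data that shares the same attribute) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the application, EDWARDS, or MORE: the tag set (NUM, WORD, ALNUM, PUNCT), the function names, and the sample strings are all hypothetical.

```python
import re

def tag_token(string):
    """Assign a hypothetical attribute tag to a token string."""
    if string.isdigit():
        return "NUM"
    if string.isalpha():
        return "WORD"
    if string.isalnum():
        return "ALNUM"
    return "PUNCT"

def tokenise(text):
    """Split text into (tag, string) tokens: runs of word characters or single symbols."""
    return [(tag_token(s), s) for s in re.findall(r"\w+|[^\w\s]", text)]

def signature(tokens):
    """Represent a token sequence by its sequence of tags."""
    return tuple(tag for tag, _ in tokens)

def substitute(exemplar, replacements):
    """Swap each exemplar token for the next unused replacement string with the same tag.

    Tokens whose tag has no remaining replacement keep their original string.
    """
    pools = {tag: iter(strings) for tag, strings in replacements.items()}
    return [(tag, next(pools.get(tag, iter(())), original)) for tag, original in exemplar]

exemplar = tokenise("Invoice 4471, paid.")
honey = substitute(exemplar, {"WORD": ["Report", "filed"], "NUM": ["9032"]})
```

In this sketch the attribute comparison is a simple tag equality check, so the substituted sequence keeps the exemplar's signature while its strings change.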


Regarding Claim 2, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein:
a plurality of tokens of the exemplar token sequence are substituted with replacement tokenisable data based on the collective attribute(s) of that plurality of exemplar tokens {EDWARDS: Fig. 3 & [0055], “the honey tokens 138 can have one or more characteristics (e.g., a type, a name, and/or data) that are expected to entice or bait a malicious process 202 into performing suspicious or malicious activity on the honey tokens 138”. … [0049], “a comparison of the requesting process 202 to the elements listed”. Examiner’s note: the cited ‘characteristics (e.g., a type, a name, and/or data)’ are generalized or collective attributes, e.g. a data ‘type’ attribute does not belong to a single data item but to all of the collected data used in forming a decoy file}.

Regarding Claim 3, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein:
the replacement tokenisable data is selected from a set of replacement strings having one or more attributes in common with the exemplar token(s) to be substituted {MORE: [0018], “determining an attribute from each token of the sequence of tokens using the label for each token of the sequence of tokens; normalizing the attribute to create standardized representations of the attributes; writing the attributes to database entries associated with the product”. Examiner’s note: common attributes are selected to create the template that serves as the exemplar token, as cited in ‘standardized representations of the attributes’}.

Regarding Claim 4, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein:
the substitution method operates by substituting exemplar tokens with randomly chosen replacement tokenisable data {EDWARDS: [0015], “the honey tokens can be interspersed (or scattered) randomly such that it is difficult for a malicious process to detect or predict what is real data and what is false data. This inhibits a malicious process from learning a pattern and adapting to the honey tokens by skipping known locations of the honey tokens”}.
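
As a hedged illustration of the random interspersing quoted from EDWARDS [0015], the following minimal sketch places decoy entries at randomly chosen positions among real entries while preserving the order of the real entries. The function name `intersperse` and the labels `R0`..`R9` and `HT1`..`HT3` are hypothetical and do not come from the reference.

```python
import random

def intersperse(real_items, honey_tokens, rng):
    """Return real_items in their original order with honey_tokens at random positions."""
    total = len(real_items) + len(honey_tokens)
    # Pick which output slots hold honey tokens; all other slots hold real items.
    honey_slots = set(rng.sample(range(total), len(honey_tokens)))
    real_iter, honey_iter = iter(real_items), iter(honey_tokens)
    return [next(honey_iter) if i in honey_slots else next(real_iter) for i in range(total)]

rng = random.Random(0)  # seeded only so the sketch is repeatable
listing = intersperse([f"R{i}" for i in range(10)], ["HT1", "HT2", "HT3"], rng)
```

Because the slots are drawn afresh for each listing, a process reading the listing cannot rely on fixed decoy positions.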

Regarding Claim 13, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein:
a string is any one or more of the group consisting of: a word; punctuation; a symbol; a character; a paragraph; an image; a graphical element; a table; and a text {EDWARDS: Fig. 1 element 124 & [0030], “The data files 120 may include files that can be used by one or more of the applications 118. The data files 120 may be organized in a file library (or database) 122. The file library 122 may include folders 124 including collections (or sets) of data files 120. … The file library 122 may include a “My Pictures” folder 124c (e.g., having a file path “C:\Users\Mike\Libraries\Pictures”) for holding picture (or image) data files 120 for a user (e.g., JPEG files (*.jpg files), PDF files (*.pdf files), TIFF files (*.tiff files), GIF files (*.gif files), BMP files (*.bmp files), RAW files, and/or the like)”}.

Regarding Claim 16, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein:
the method further comprises at least one of the following steps: deploying the created honeyfile; evaluating a honeyfile; and managing the lifecycle of the created honeyfile {EDWARDS: [0025], “honey tokens can operate as a "tripwire" that provides an alert with regard to suspicious activity by one or more processes. … when a process engages in suspicious activity with one or more honey tokens, that activity can be detected, and remedial action can be taken”}.

Regarding Claim 21, claim 21 is directed to a system using the method of claim 1. Therefore, claim 21 is rejected for the reasons set forth for claim 1.

Claim 5 is rejected under AIA 35 U.S.C. 103 as being unpatentable over EDWARDS; Jonathan L. et al., Pub. No.: US 2016/0180087 A1 in view of MORE; Ajinkya, Pub. No.: US 2016/0379289 A1 and further in view of MATTSSON; Ulf, Pub. No.: US 2009/0249082 A1.



Regarding Claim 5, EDWARDS as modified by MORE discloses all the features of claim 1. The combination, however, does not explicitly disclose wherein:
the substitution method operates by a frequency proportional substitution method that substitutes replacement tokenisable data proportional to the appearance frequency of that replacement tokenisable data on one or more of the group consisting of: the file system, the surveyed file set, and an external repository.
In an analogous reference MATTSSON discloses wherein:
the substitution method operates by a frequency proportional substitution method that substitutes replacement tokenisable data proportional to the appearance frequency of that replacement tokenisable data on one or more of the group consisting of: the file system, the surveyed file set, and an external repository {Fig. 2 & [0061], “As an extra security measure, the processor 21 may also comprise a velocity checker for monitoring the frequency of replacing a part of CCNs (Credit Card Numbers) with a token to form tokenized sets of characters. In particular, the velocity checker can be used to detect a peek in the frequency of requests from a certain user/client. The velocity checker may be used to issue an alarm if a determined threshold level is exceeded”. … [0102], “As a further protective measure, a trap database may also be provided at the central server(s) comprising information about the original CCNs. Such a trap database preferably comprises fake CCNs, and is used as a "honey pot" to attract intruders. This may be used both to fool intruders, and for detecting attempts to break into the database systems”. Examiner’s note: if the ratio or proportion of character frequency exceeds a threshold, a notification is generated in addition to other steps}.
Before the effective filing date of the claimed invention, it would have been obvious to one with ordinary skill in the art to further modify EDWARDS’ technique (as modified by MORE) of ‘generating a honey trap system for a malware intruder by identifying target data and motivation is a controlled, manageable and customizable tokenization of data objects.
All references are in an analogous art; each teaches particular claimed limitations, and the references cure one another’s deficiencies. When the cited techniques are combined, they teach the claimed invention.
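
For illustration, the "frequency proportional substitution" recited in claim 5 can be read as drawing replacement strings with probability proportional to how often each string appears in a surveyed set. The sketch below shows one such reading using weighted random sampling; it is an assumption-laden example, not the method of the application or of MATTSSON, and the function name and sample strings are hypothetical.

```python
import random
from collections import Counter

def frequency_proportional_choice(surveyed_tokens, k, rng):
    """Draw k replacement strings, each with probability proportional to its surveyed count."""
    counts = Counter(surveyed_tokens)
    population = list(counts)
    weights = [counts[token] for token in population]
    return rng.choices(population, weights=weights, k=k)

rng = random.Random(1)  # seeded only so the sketch is repeatable
surveyed = ["invoice"] * 8 + ["memo"] * 2
draws = frequency_proportional_choice(surveyed, 10000, rng)
share_invoice = draws.count("invoice") / len(draws)
```

With these sample counts, roughly four in five draws are expected to be "invoice", mirroring its share of the surveyed set.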

Claims 11, 12, 14 and 15 are rejected under AIA 35 U.S.C. 103 as being unpatentable over EDWARDS; Jonathan L. et al., Pub. No.: US 2016/0180087 A1 in view of MORE; Ajinkya, Pub. No.: US 2016/0379289 A1 and further in view of STOLFO; Salvatore J. et al., Pub. No.: US 2017/0104785 A1.


Regarding Claim 11, EDWARDS as modified by MORE discloses all the features of claim 1. The combination, however, does not explicitly disclose wherein: 
natural language processing techniques are used during tokenisation and the token tag identifies a language-related feature of the token.
In an analogous reference STOLFO discloses
natural language processing techniques are used during tokenisation and the token tag identifies a language-related feature of the token {Figs. 1, 2 & [0016], “The DGS (Decoy system) may be implemented in Python and may make extensive use of the Natural Language Toolkit (NLTK), a popular platform for building Python applications that process natural language”}.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to further modify EDWARDS’ technique (as modified by MORE) of ‘generating a honey trap system for a malware intruder by identifying target data and tokenizing the object data in a decoy file to create an algorithm for malware prevention, using a technique of tokenizing target data with labels and creating a sequence of tokens to use as a signature for further data processing’ with the ‘mechanism for tokenizing natural language features for malware decoying purposes’ taught by STOLFO, in order to create a honey trap for specific language-related features. The motivation is to enable a honey trap to identify an intruder that is attacking documents such as English-language email. The decoying technique can be used against spearphishing intrusions via email.
All references are in an analogous art; each teaches particular claimed limitations, and the references cure one another’s deficiencies. When the cited techniques are combined, they teach the claimed invention. The Examiner notes that this motivation applies to all dependent and/or otherwise subsequently addressed claims unless addressed separately.

Regarding Claim 12, EDWARDS as modified by MORE and further modified by STOLFO discloses all the features of claims 11 & 1. The combination further discloses wherein: 
the token tag is any one or more of the group consisting of: a part-of-speech characterisation of the token; and a tag representing a part of a sentence decided by a dependency relationship {STOLFO: [0026], “When the DGS system processes e-mails, attachments, or other files, it may first extract the textual content from the document, then segment the text into sentences, then tokenizes each sentence (i.e., split the sentence into words plus important punctuation), then compute the part-of-speech (i.e., syntactic category) for each token”}.
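
The pipeline quoted from STOLFO [0026] (segment text into sentences, split each sentence into words plus punctuation, then compute a part of speech per token) can be sketched as below. This is a simplified illustration only: the cited system is described as using NLTK, whereas this sketch substitutes a tiny hand-written lexicon (`TOY_LEXICON`, a hypothetical stand-in for a trained tagger) so that it runs with the standard library alone.

```python
import re

# Hypothetical toy lexicon standing in for a trained part-of-speech tagger.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "report": "NOUN", "server": "NOUN",
    "is": "VERB", "failed": "VERB",
    "ready": "ADJ",
}

def segment_sentences(text):
    """Split text after sentence-ending punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Split a sentence into words plus individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def pos_tag(tokens):
    """Tag each token with a part of speech; unknown words are tagged UNK."""
    tagged = []
    for token in tokens:
        if not token[0].isalnum():
            tag = "PUNCT"
        else:
            tag = TOY_LEXICON.get(token.lower(), "UNK")
        tagged.append((token, tag))
    return tagged

sentences = segment_sentences("The report is ready. The server failed!")
tagged = [pos_tag(tokenize(s)) for s in sentences]
```

The tag sequence of each sentence in `tagged` plays the role of a token-tag sequence in which each tag is a part-of-speech characterisation.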

Regarding Claim 14, EDWARDS as modified by MORE discloses all the features of claim 1. The combination, however, does not explicitly disclose: a token tag represents one or more attributes selected from the group of attributes consisting of:
a paragraph; an image; a graphical element; a table; text, including a separator, a number or a letter; formatting; a character; a word; punctuation; logical structure of a document; structural features of a file; a language-related component; identified in the tokenizable data.
In an analogous reference STOLFO discloses
a paragraph; an image; a graphical element; a table; text, including a separator, a number or a letter; formatting; a character; a word; punctuation; logical structure of a document; structural features of a file; a language-related component; identified in the tokenizable data {[0026], “When the DGS system processes e-mails, attachments, or other files, it may first extract the textual content from the document, then segment the text into sentences, then tokenizes each sentence (i.e., split the sentence into words plus important punctuation), then compute the part-of-speech (i.e., syntactic category) for each token”}.

Regarding Claim 15, EDWARDS as modified by MORE discloses all the features of claim 1. The combination further discloses wherein: 
the substitution method … to generate novel signatures based on the signatures which have been formed during tokenisation {MORE: [0070], “The process of assigning a label to each token in a sequence can be termed “sequence labeling.” For example, as described above, sequence labeling can refer to assigning the labels “B-brand”, “I-brand”, and “O-labels” to each token in a sequence (such as a title). An input sequence X can comprise multiple tokens x1. . . xm. A label sequence Y can comprise multiple elements y1. . . ym. Each token xj has an associated label yj”. … ABS. “a database entry can be made to associate the attribute with the item. A training set can be used to initialize the learning mode”. Examiner’s note: labeled sequences X and Y are signatures}.
However, the combination does not explicitly disclose
… applies a learning algorithm …
In an analogous reference STOLFO discloses
… applies a learning algorithm {[0023], “The user can select the type of chunk from a ‘Chunk’ menu, and then select portions of text that match that chunk”. … [0024], “Once enough documents have been labeled to constitute a training set, a user can train a chunker using a Python script”} …

Claim 17 is rejected under AIA 35 U.S.C. 103 as being unpatentable over EDWARDS; Jonathan L. et al., Pub. No.: US 2016/0180087 A1 in view of MORE; Ajinkya, Pub. No.: US 2016/0379289 A1 and further in view of SHEYMOV; Victor I., Pub. No.: US 2012/0246724 A1.

Regarding Claim 17, EDWARDS as modified by MORE discloses all the features of claim 1. The combination, however, does not explicitly disclose wherein:
the managing the lifecycle of a deployed honeyfile includes the step of maintaining the fidelity of the replacement tokenisable data in the honeyfile with the file system data as the file system data changes.
In an analogous reference SHEYMOV discloses wherein:
the managing the lifecycle of a deployed honeyfile includes the step of maintaining the fidelity of the replacement tokenisable data in the honeyfile with the file system data as the file system data changes {Fig. 2 element 10 & [0030], “Upon completion of creating the dynamic decoy machine 40, the code inspection system can verify the integrity of the dynamic decoy machine 40. For example, the code inspection management module 10 can run a comparison between the protected system 20 and the dynamic decoy machine 40 to ensure the systems are substantially identical”}.
Before the effective filing date of the claimed invention, it would have been obvious to one with ordinary skill in the art to further modify EDWARDS’ technique (as modified by MORE) of ‘generating a honey trap system for a malware intruder by identifying target data and tokenizing the object data in a decoy file to create algorithm for malware prevention for a technique of tokenizing target data with labels and creating a sequence of tokens to use as a 
All references are in an analogous art; each teaches particular claimed limitations, and the references cure one another’s deficiencies. When the cited techniques are combined, they teach the claimed invention.

Allowable Subject Matter
Claims 6 and 7 would be allowable if rewritten in independent form including all the limitations of base method claim 1. For allowability, independent system claim 21 must be amended to the same scope as the proposed amended claim 1, incorporating the limitations of claims 6 and 7. The applicant has the option to incorporate only claims 6 and 7 into base claim 1, to change the dependency of claims 8-10 (which are objected to because they depend from claim 7), and to keep them as dependent claims of independent claim 1.
Reasons for allowance: what is missing from the prior art is generating a honey trap system for a malware intruder by identifying target data and tokenizing the object data with a customizable attribute remediation substitution technique with tokens for decoy intrusion.

Conclusion

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/QUAZI FAROOQUI/
Examiner, Art Unit 2491