DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 9,965,458. Although the claims at issue are not identical, they are not patentably distinct from each other:

Application No. 16/832,632 (first entry of each claim pair below)
Patent No. 9,965,458 (second entry of each claim pair below)
Claim 1: A method for tokenizing text for natural language processing, the method comprising: generating, by one or more processors in a natural language processing platform, and from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving, by the one or more processors, a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming, by the one or more processors, one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving, by the one or more processors, a document to be processed; dividing, by the one or more processors, the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting, by the one or more processors, the divided tokens for natural language processing.
Claim 1: A method for tokenizing text for natural language processing, the method comprising: generating, by one or more processors in a natural language processing platform, and from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving, by the one or more processors, a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming, by the one or more processors, one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving, by the one or more processors, a document to be processed; dividing, by the one or more processors, the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting, by the one or more processors, the divided tokens for natural language processing.
Claim 18: An apparatus for tokenizing text for natural language processing, the apparatus comprising one or more processors configured to: generate from a pool of documents a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receive a set of rules comprising rules that identify character/letter sequences as valid tokens; transform one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receive a document to be processed; divide the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and output the divided tokens for natural language processing.
Claim 18: An apparatus for tokenizing text for natural language processing, the apparatus comprising one or more processors configured to: generate from a pool of documents a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receive a set of rules comprising rules that identify character/letter sequences as valid tokens; transform one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receive a document to be processed; divide the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and output the divided tokens for natural language processing.
Claim 20: A non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to: generate from a pool of documents a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receive a set of rules comprising rules that identify character/letter sequences as valid tokens; transform one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receive a document to be processed; divide the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and output the divided tokens for natural language processing.
Claim 20: A non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to: generate from a pool of documents a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receive a set of rules comprising rules that identify character/letter sequences as valid tokens; transform one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receive a document to be processed; divide the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and output the divided tokens for natural language processing.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-11, 13-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Brun (US 2012/0035914) in view of Segond et al. (EP 1814047).

Claims 1, 18 and 20,
Brun teaches a method for tokenizing text for natural language processing, the method comprising: generating, by one or more processors in a natural language processing platform, and from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents ([Fig. 1] [0033-0034] the input document is tokenized by the parser or by a separate tokenizer; language guessers are used to identify the language of a text, based on statistical methods (trigrams), or on the presence and/or frequencies of certain words, word endings, and the like);
receiving, by the one or more processors, a set of rules comprising rules that identify character/letter sequences as valid tokens ([0025] grammar rules associated with that language);
transforming, by the one or more processors, one or more entries in the statistical models into new rules when the entries indicate a high likelihood ([0018] determining whether the sequence of words in the secondary language should be expanded beyond the first word to include adjacent words);
receiving, by the one or more processors, a document to be processed ([Figs. 1-2] [0025] [0033] input text (document));
dividing, by the one or more processors, the document to be processed into tokens based on the set of statistical models and the set of rules ([0033] tokenized input document), wherein the statistical models ([0034] language guesser) are applied where the rules fail to unambiguously tokenize the document ([0025] unrecognized words); and outputting, by the one or more processors, the divided tokens for natural language processing ([0025] the main language of an input text document is generally the natural language in which the majority of words are recognized and follows the grammar rules associated with that language; input text in a main language may include one or more sequences in one or more secondary languages; each secondary language can be any natural language other than the main language; each sequence in a secondary language can include one or more words in that language; one or more of the words of a sequence which are recognized in that secondary language are words which are not recognized in the main language. By "recognized" or "known" it is meant that the word or words is automatically attributable to that language, e.g., by virtue of being represented in a respective lexicon for that language).
The difference between the prior art and the claimed invention is that Brun does not explicitly teach transforming, by the one or more processors, one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood.
Segond teaches transforming, by the one or more processors, one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood ([S320] [Fig. 8] if the basic grammar does not yield any relations between the subject of a sentence and a location, as in "Pierre Dupont was born in Paris," a new rule can be created that will compute this information for the next embodiments of the grammar; this new rule, for example, creates a PEOPLE_LOCATION dependency, which connects a person to a place: PEOPLE_LOCATION("Pierre Dupont", "Paris")).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system and method for handling multiple languages in the text of a document as taught by Brun to include transforming, by the one or more processors, one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood, as taught by Segond, for the benefit of making the new dependency available for future queries and selectable by users to better access the information (Segond).
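
For illustration only, the limitations of claim 1 discussed above can be sketched in Python as follows. This sketch is not taken from the application, Brun, or Segond. The function names (build_model, promote, tokenize), the word-frequency model, and the 0.3 threshold are hypothetical, and "the rules fail to unambiguously tokenize" is simplified here to "no rule matches at the current position."

```python
from collections import Counter


def build_model(pool):
    """Statistical model (assumed form): relative frequency of each
    whitespace-delimited character sequence in the pool of documents."""
    counts = Counter(word for doc in pool for word in doc.split())
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def promote(model, rules, threshold):
    """Turn model entries with a high likelihood into new rules and
    add them to the existing rule set (threshold is an assumption)."""
    return set(rules) | {seq for seq, p in model.items() if p >= threshold}


def tokenize(text, rules, model):
    """Divide text into tokens: rules first, statistical model as fallback."""
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        # Rule-based step: longest rule that matches at position i.
        match = max((r for r in rules if text.startswith(r, i)),
                    key=len, default=None)
        if match is None:
            # Fallback step: most likely model entry that matches at i.
            match = max((s for s in model if text.startswith(s, i)),
                        key=model.get, default=None)
        if match is None:
            match = text[i]  # last resort: emit a single character
        tokens.append(match)
        i += len(match)
    return tokens


pool = ["new york is big", "new york new york"]
model = build_model(pool)
rules = promote(model, {"is"}, threshold=0.3)
print(tokenize("newyork is big", rules, model))
```

In this sketch, "new" and "york" each account for 3 of the 8 sequences in the pool and are therefore promoted to rules, while "big" is tokenized only through the statistical fallback.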

Claim 3,
Brun further teaches the method of Claim 1, wherein: the document to be processed is in one or more languages; and the divided tokens are outputted in a language agnostic format ([0025] [0033] main and secondary language; tokenizing decomposes the text into a sequence of tokens, each token including a word or punctuation).

Claim 4,
Brun further teaches the method of Claim 3, wherein: the document to be processed is in more than one language ([0025] main and secondary languages); the set of rules further comprises a rule that divides portions of the document in different languages into different segments ([0034] language guesser); and the segments of the document in different languages are divided into tokens based on a different combination of rules and statistical models ([0034] language guessers are tools identifying the language of a text, based on statistical methods (trigrams), or on the presence and/or frequencies of certain words, word endings, and the like).

Claim 5,
Brun further teaches the method of Claim 1, wherein the set of rules further comprises a rule that triggers the application of rules and/or statistical models for further tokenization ([0018] determining whether the sequence of words in the secondary language should be expanded beyond the first word to include adjacent words).

Claim 6,
Brun further teaches the method of Claim 1, wherein at least one of the divided tokens contains a morpheme ([0038] main language lexicon 54 provides parts of speech for words in the main language, enabling morphological analysis of the main language text).

Claim 7,
Brun further teaches the method of Claim 1, wherein at least one of the divided tokens contains a group of words ([0033] the input document 14 is tokenized by the parser 50 or by a separate tokenizer; tokenizing decomposes the text into a sequence of tokens, each token including a word or punctuation).

Claim 8,
Brun further teaches the method of Claim 7, wherein at least one of the divided tokens contains a turn in a conversation ([0018] determining whether the unknown sequence includes a first word recognized in the secondary language and, if so, identifying a sequence of words in the secondary language which includes at least the first word, the identifying of the sequence of words in the secondary language including determining whether the sequence of words in the secondary language should be expanded beyond the first word to include adjacent words.).

Claim 9,
Brun further teaches the method of Claim 1, wherein dividing the document to be processed into tokens based on the set of statistical models comprises comparing statistical likelihood of more than one candidate set of tokens ([Abstract] The identifying of the sequence of words in the secondary language includes applying an algorithm for determining whether the sequence of words in the secondary language is expandable beyond the first word to include adjacent words).

Claim 10,
Brun further teaches the method of Claim 9, wherein the candidate set of tokens that contains tokens with smallest sizes is preferred ([0036] language guessers generally function best with a minimum length of a sequence of words (e.g., 7-9 words)).
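
For illustration only, the limitations of claims 9 and 10 (comparing the statistical likelihood of more than one candidate set of tokens, and preferring the candidate set with the smallest tokens) can be sketched in Python as follows. This sketch is not taken from the application or from Brun. The function names, the toy likelihood values, and the reading of "smallest sizes" as "shortest longest token" are assumptions.

```python
import math


def candidates(text, model):
    """Yield every way to split text into sequences that have a model entry."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in model:
            for rest in candidates(text[end:], model):
                yield [head] + rest


def log_likelihood(tokens, model):
    """Score a candidate set as the sum of log-likelihoods of its tokens."""
    return sum(math.log(model[t]) for t in tokens)


def most_likely(text, model):
    """Claim 9 (as sketched): compare candidates by statistical likelihood."""
    return max(candidates(text, model),
               key=lambda c: log_likelihood(c, model), default=None)


def smallest_tokens(text, model):
    """Claim 10 (as sketched): prefer the candidate whose longest token is shortest."""
    return min(candidates(text, model),
               key=lambda c: max((len(t) for t in c), default=0), default=None)


# Toy likelihoods (hypothetical values, not drawn from any reference).
model = {"ice": 0.2, "cream": 0.2, "icecream": 0.1}
print(list(candidates("icecream", model)))
print(most_likely("icecream", model))
print(smallest_tokens("icecream", model))
```

With these toy values the two selection criteria disagree: the single token "icecream" is more likely than the pair "ice" and "cream" (0.1 versus 0.04), whereas the smallest-token preference selects the pair. Returning the full candidate list rather than a single winner corresponds to the claim 11 limitation of outputting more than one candidate set.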

Claim 11,
Brun further teaches the method of Claim 9, wherein more than one candidate set of tokens is outputted for natural language processing ([0027] the text is generally in a form which can be extracted (e.g., directly or by OCR processing) and processed using natural language processing (NLP)).

Claim 13,
Segond further teaches the method of Claim 1, wherein: the set of statistical models further comprises one or more statistical models for adding tags to the tokens and/or the set of rules further comprises one or more rules for adding tags to the tokens; and the method further comprises adding tags to the tokens based on the statistical models and/or the rules ([S320] [Fig. 8] the processor uses the selected nodes and features to enrich the grammar in the database 26 so that the information will be available for new queries. The goal of this step is not to query the database 26 but to extract information that otherwise would be lacking. For example, if the basic grammar does not yield any relations between the subject of a sentence and a location, as in "Pierre Dupont was born in Paris," a new rule can be created that will compute this information for the next embodiments of the grammar. This new rule, for example, creates a PEOPLE_LOCATION dependency, which connects a person to a place: PEOPLE_LOCATION("Pierre Dupont", "Paris")).

Claim 14,
Brun further teaches the method of Claim 13, wherein the tags are based on semantic information and/or structural information ([0054] annotates the text strings of the document with tags (labels) which correspond to grammar rules, such as lexical rules, syntactic rules, and dependency (semantic) rules).

Claim 15,
Brun further teaches the method of Claim 1, wherein the set of rules further comprises one or more rules that identify markup language content, an Internet address, a hashtag, or an emoji/emoticon ([0119] blogs and forms).

Claim 16,
Brun further teaches the method of Claim 1, wherein the set of statistical models and/or the set of rules are adjusted based at least in part on an author of the document ([0119] the focus of opinion mining applications which attempt to determine the author's opinion using natural language processing of the text).

Claim 17,
Brun further teaches the method of Claim 1, wherein the set of statistical models and/or the set of rules are based at least in part on intra-document information ([0025] the main language of an input text document is generally the natural language in which the majority of words are recognized and additionally, generally follows the grammar rules associated with that language).

Claims 2, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Brun (US 2012/0035914) in view of Segond et al. (EP 1814047) and further in view of Reiser et al. (US 2015/0356647).

Claims 2 and 19,
Brun and Segond teach all the limitations in claim 1. The difference between the prior art and the claimed invention is that neither Brun nor Segond explicitly teaches wherein the set of statistical models further comprises statistical models based on human annotation, and the method further comprises: generating, by the one or more processors, one or more human readable prompts configured to elicit annotations of one or more documents in the pool of documents, wherein the annotations comprise identification of one or more character/letter sequences in the documents as valid tokens; receiving, by the one or more processors, one or more annotations elicited by the human readable prompts; and generating, by the one or more processors, statistical models based on the received annotations, wherein the statistical models comprise one or more entries each indicating a likelihood of appearance of a character/letter sequence in the character/letter sequences annotated as valid tokens.
Reiser teaches wherein the set of statistical models further comprises statistical models based on human annotation, and the method further comprises: generating, by the one or more processors, one or more human readable prompts configured to elicit annotations of one or more documents in the pool of documents, wherein the annotations comprise identification of one or more character/letter sequences in the documents as valid tokens; receiving, by the one or more processors, one or more annotations elicited by the human readable prompts; and generating, by the one or more processors, statistical models based on the received annotations, wherein the statistical models comprise one or more entries each indicating a likelihood of appearance of a character/letter sequence in the character/letter sequences annotated as valid tokens ([0077] a relation model may be trained statistically using methods similar to those described above for training the statistical entity detection model. For example, in some embodiments, training texts may be manually labeled with various types of relations between entity mentions and/or tokens within entity mentions. For example, in the training text, “Patient has sinusitis, which appears to be chronic,” a human annotator may label the “Problem” mention “chronic” as having a relation to the “Problem” mention “sinusitis,” since both mentions refer to the same medical fact. In some embodiments, the relation annotations may simply indicate that certain mentions are related to each other, without specifying any particular type of relationship. In other embodiments, relation annotations may also indicate specific types of relations between entity mentions. Any suitable number and/or types of relation annotations may be used, as aspects of the invention are not limited in this respect. For example, in some embodiments, one type of relation annotation may be a “split” relation label. 
The tokens “sinusitis” and “chronic,” for example, may be labeled as having a split relationship, because “sinusitis” and “chronic” together make up an entity, even though they are not contiguous within the text. In this case, “sinusitis” and “chronic” together indicate a specific type of sinusitis fact, i.e., one that it is chronic and not, e.g., acute).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Brun and Segond with teachings of Reiser by modifying the language guesser as taught by Brun to include wherein the set of statistical models further comprises statistical models based on human annotation, and the method further comprises: generating, by the one or more processors, one or more human readable prompts configured to elicit annotations of one or more documents in the pool of documents, wherein the annotations comprise identification of one or more character/letter sequences in the documents as valid tokens; receiving, by the one or more processors, one or more annotations elicited by the human readable prompts; and generating, by the one or more processors, statistical models based on the received annotations, wherein the statistical models comprise one or more entries each indicating a likelihood of appearance of a character/letter sequence in the character/letter sequences annotated as valid tokens as taught by Reiser for the benefit of improving the interactive process between an automated NLU system and a human coder (Reiser [0024]).

Claim 12,
Brun and Segond teach all the limitations in claim 1. The difference between the prior art and the claimed invention is that neither Brun nor Segond explicitly teaches that the set of statistical models further comprises one or more statistical models for normalizing variants of a token into a single token and/or the set of rules further comprises one or more rules for normalizing variants of a token into a single token; and the method further comprises normalizing variants of a token into a single token based on the statistical models and/or the rules.
Reiser teaches the set of statistical models further comprises one or more statistical models for normalizing variants of a token into a single token and/or the set of rules further comprises one or more rules for normalizing variants of a token into a single token; and the method further comprises normalizing variants of a token into a single token based on the statistical models and/or the rules ([0068] a statistical model may be trained to determine the most likely section for a portion of text based on its semantic content, the semantic content of surrounding text portions, and/or the expected semantic content of the set of normalized sections; once a normalized section for a portion of text has been identified, the membership in that section may be used as a feature of one or more tokens in that portion of text).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Brun and Segond with teachings of Reiser by modifying the language guesser as taught by Brun to include the set of statistical models further comprises one or more statistical models for normalizing variants of a token into a single token and/or the set of rules further comprises one or more rules for normalizing variants of a token into a single token; and the method further comprises normalizing variants of a token into a single token based on the statistical models and/or the rules as taught by Reiser for the benefit of improving the interactive process between an automated NLU system and a human coder (Reiser [0024]).
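
For illustration only, the claim 12 limitation of normalizing variants of a token into a single token based on statistical models and/or rules can be sketched in Python as follows. This sketch is not taken from the application or from Reiser. The function names, the case-folding grouping, and the example rule are assumptions.

```python
from collections import Counter, defaultdict


def learn_canonical(observed_tokens):
    """Statistical step (assumed form): group tokens that differ only in
    case and pick the most frequent surface form of each group."""
    groups = defaultdict(Counter)
    for token in observed_tokens:
        groups[token.casefold()][token] += 1
    return {key: forms.most_common(1)[0][0] for key, forms in groups.items()}


def normalize(tokens, rules, canonical):
    """Apply explicit variant rules first, then the learned canonical forms."""
    normalized = []
    for token in tokens:
        token = rules.get(token, token)  # rule-based normalization
        normalized.append(canonical.get(token.casefold(), token))
    return normalized


canonical = learn_canonical(["USA", "usa", "USA", "U.S.A."])
rules = {"U.S.A.": "USA"}  # hypothetical hand-written rule
print(normalize(["usa", "U.S.A.", "Usa", "France"], rules, canonical))
```

Tokens with no rule and no learned canonical form, such as "France" here, pass through unchanged.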

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Chang et al. (US 2016/0132572) – [0066] At operation 406, a triple is extracted from the content of the source file based on the type of the source file and/or the file-type-specific triple extraction technique. For example, a triple extractor of the data knowledge system can receive and process the stratified data. If natural language text is processed, the triple extractor can tokenize, parse, and speech tag the text to determine sentences, words within the sentences, word types (e.g., noun, verb, adjective, etc.), and phrase expressions (e.g., noun phrases containing a noun and a proximate such as a consecutive word). The triple extractor can consider a set of words or phrase expressions (e.g., within a sentence) and apply a set of rules to generate a triple (or a number of triples). The triple can include a subject, predicate, and object based on the applied rules. This operation would be similarly applied also to hierarchical and tabular data to extract the triples from the actual text in this data. In this case also, the triple extractor can infer and generate triples from any structural relationships defined in the hierarchy or table.
Ohta et al. (US 2003/0120640) – [0006] An increasing number of approaches to extracting any information from literature databases have lately been made. Most of these approaches are divided into those using natural language processing and those using keywords and formal rules. A typical approach using NLP is such that all words in a document are tagged with grammatical labels through syntax analysis using NLP of text obtained from a public database such as MEDLINE and binary relations are extracted by searching for the subject and object word of a verb representing binary relation. A typical approach using keywords is as follows. First, find out keywords that are often used to express interaction between substances. Next, analyze sentence structure to determine what pattern of sequence in which the keyword, substance names, preposition, etc. are put. Finally, search for sentences in which they appear in a certain pattern, using a dictionary of substance names and the defined patterns.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

SHREYANS A. PATEL
Examiner
Art Unit 2657



/SHREYANS A PATEL/Examiner, Art Unit 2656