DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claim(s) 1-3, 7-12, and 16-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Venkatapathy (20140207439).

Venkatapathy (20140207439) teaches a method comprising: determining, by at least one processor, a plurality of candidate text entities comprised in a target text (as determining and displaying candidate phrases – para 0037);
combining, by the at least one processor, portions of the candidate text entities, to generate a plurality of candidate segmentation combinations corresponding to the target text, the candidate text entities comprised in each candidate segmentation combination being different (as generating bi-phrases and bi-sentences – para 0028-0031, para 0037);
calculating, by the at least one processor, a combination probability corresponding to each candidate segmentation combination, the combination probability being a probability that grammar is correct when the target text uses the candidate segmentation combination (as performing weighting functions in scoring – para 0012, 0014, as a translatability score, as well as taking into account grammar/semantics – para 0006, 0046);
determining, by the at least one processor according to the combination probabilities, a target segmentation combination corresponding to the target text; and extracting, by the at least one processor, a text entity from the target text according to the target segmentation combination (as developing/deriving a final, target text – para 0064, 0065);
wherein the candidate segmentation combination includes a portion of the candidate text entities (as, the segmentation probabilities – para 0094 – from the translation table) and text content (as, in the last half of para 0099, starting with “Similarity can be…”, the text content/similarity is used to determine a semantic similarity based on the typed text)  other than the candidate text entities (see first half of para 0099, wherein unsupervised LDA models are used – ie, the ‘unsupervised’ means that the source documents are trained/clustered regardless of the text input; in the example given, if a potential document has dog/cat, the topic of ‘animal’ is 
 and the combination probability of the candidate segmentation combination is calculated based on different groupings of text in the candidate segmentation combination (as calculating a probability based on the topics, and generating a ‘high probability’ based on the latent match (first half of para 0099), as well as the typed-in-text matching (last half of para 0099), and the aforementioned text pairing/matching (para 0046)).
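For illustration only, the combining step recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Venkatapathy: the function names, the whitespace tokenization, the example entity library, and the rule that the entities kept in one combination may not overlap are all hypothetical choices made for the example.

```python
from itertools import combinations

def find_candidate_spans(tokens, entity_library):
    """Return (start, end) token spans whose joined text is in the library (assumed lookup)."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            if " ".join(tokens[i:j]) in entity_library:
                spans.append((i, j))
    return spans

def overlaps(a, b):
    """True when two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]

def candidate_segmentations(tokens, spans):
    """Build one segmentation per non-overlapping subset of candidate spans.

    Tokens outside the chosen spans stay as single-word elements, so each
    combination mixes candidate entities with other text content.
    """
    results = []
    for r in range(len(spans) + 1):
        for subset in combinations(spans, r):
            if any(overlaps(a, b) for a, b in combinations(subset, 2)):
                continue  # assumption: entities in one combination may not overlap
            end_by_start = {s: e for s, e in subset}
            segmentation, i = [], 0
            while i < len(tokens):
                end = end_by_start.get(i, i + 1)
                segmentation.append(" ".join(tokens[i:end]))
                i = end
            results.append(segmentation)
    return results

# Hypothetical example data.
tokens = "play new york city".split()
library = {"new york", "york city", "new york city"}
spans = find_candidate_spans(tokens, library)
segmentations = candidate_segmentations(tokens, spans)
```

In this example each resulting combination contains a different subset of the candidate entities, and the word "play" appears in every combination as text content other than a candidate entity.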

As per claim 2, Venkatapathy (20140207439) teaches the method according to claim 1, wherein the method further comprises: obtaining a preset corpus resource comprising a plurality of preset templates and/or corpus data including an annotation (as dictionary/corpus of stored data – para 0014, 0028; with contemplation of templates – para 0006); and training an N-Grammar (N-Gram) model according to the preset corpus resource, the N-Gram model indicating a probability that N text elements are combined in order, each text element being a word or a phrase in a text, wherein N>=2, and N is a positive integer (as using N gram models, with N>=2 – para 0097, in calculating similarities/probabilities between the original text and target text – para 0099). 

As per claim 3, Venkatapathy (20140207439) teaches the method according to claim 2, wherein the combination probability is calculated according to the N-Gram model (as, using n-gram models – para 0099; for the probability calculations – para 0082-0086). 
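For illustration only, the N-Gram limitation of claims 2 and 3 can be sketched with N = 2. This is a minimal sketch under stated assumptions and is not the model of the application or of Venkatapathy: the function names, the toy corpus, the sentence boundary markers, and the add-one smoothing are hypothetical choices made so that the example runs.

```python
from collections import Counter

def train_bigram(corpus):
    """Count bigrams over pre-segmented sentences (each a list of words or phrases)."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        padded = ["<s>"] + sentence + ["</s>"]
        contexts.update(padded[:-1])             # every element that has a successor
        bigrams.update(zip(padded, padded[1:]))  # ordered pairs of adjacent elements
    vocab = {w for s in corpus for w in s} | {"<s>", "</s>"}
    return contexts, bigrams, len(vocab)

def combination_probability(segmentation, model):
    """Probability that the text elements occur in this order under the bigram model.

    Add-one smoothing (an assumption) keeps unseen pairs from zeroing the product.
    """
    contexts, bigrams, vocab_size = model
    padded = ["<s>"] + segmentation + ["</s>"]
    p = 1.0
    for prev, cur in zip(padded, padded[1:]):
        p *= (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
    return p

# Hypothetical annotated corpus in which "new york" is marked as one element.
corpus = [["play", "new york", "music"], ["play", "new york", "city"]]
model = train_bigram(corpus)
p_entity = combination_probability(["play", "new york", "city"], model)
p_words = combination_probability(["play", "new", "york", "city"], model)
```

With this toy corpus the segmentation that keeps "new york" as a single element receives the higher probability, since its adjacent pairs were observed in training.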

As per claim 7, Venkatapathy (20140207439) teaches the method according to claim 1, wherein the determining the plurality of candidate text entities comprises: determining a target field to which the target text belongs (as determining a domain for the target sentences – para 0027, 0099); and determining, according to an entity library corresponding to the target field, the plurality of candidate text entities comprised in the target text (as choosing the target phrases/sentences – para 0099, using latent topics – para 0099; para 0107), the entity library comprising vocabularies that belong to the target field (as the library contains sentence definitions for a particular domain – para 0107).

As per claim 8, Venkatapathy (20140207439) teaches the method according to claim 1, wherein a candidate segmentation combination corresponding to a largest one of the combination probabilities is determined as the target segmentation combination (as calculating a maximum score – para 0079, comprising a probability calculation – para 0080-0083; and using a threshold score to maintain the phrase – para 0054).

As per claim 9, Venkatapathy (20140207439) teaches the method according to claim 1, wherein the determining the target segmentation combination comprises: detecting whether a largest one of the combination probabilities is greater than a probability threshold (as calculating a maximum score – para 0079, comprising a probability calculation – para 0080-0083; and using a threshold score to maintain the phrase – para 0054); and in response to the largest combination probability being greater than the probability threshold, determining a candidate segmentation combination corresponding to the largest combination probability as the target segmentation combination (as, after the threshold score is met – para 0054, and thereby 
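For illustration only, the selection recited in claims 8 and 9 can be sketched as choosing the highest-scoring combination and, optionally, accepting it only when its probability exceeds a threshold. This is a minimal sketch under stated assumptions: the function name, the example probabilities, and the choice to return None when the threshold is not exceeded are hypothetical and are not taken from the application or from Venkatapathy.

```python
def select_target_segmentation(scored, threshold=None):
    """Pick the segmentation with the largest combination probability.

    scored: list of (segmentation, probability) pairs.
    threshold: when given, the best probability must be strictly greater
    than it; otherwise None is returned (an assumed fallback).
    """
    if not scored:
        return None
    best_segmentation, best_probability = max(scored, key=lambda pair: pair[1])
    if threshold is not None and best_probability <= threshold:
        return None
    return best_segmentation

# Hypothetical scores for three candidate segmentation combinations.
scored = [
    (["play", "new", "york", "city"], 0.0004),
    (["play", "new york", "city"], 0.0100),
    (["play", "new york city"], 0.0030),
]
chosen = select_target_segmentation(scored)                  # claim 8 style: largest wins
gated = select_target_segmentation(scored, threshold=0.005)  # claim 9 style: passes threshold
rejected = select_target_segmentation(scored, threshold=0.05)
```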

	Claims 10-12 and 16-18 are apparatus claims that perform the method steps of claims 1-3 and 7-9. As such, claims 10-12 and 16-18 are similar in scope and content to claims 1-3 and 7-9 above, and are therefore rejected under a similar rationale as presented against claims 1-3 and 7-9 above.  Furthermore, Venkatapathy (20140207439) teaches a memory/processor, and storage (para 0035, 0041, 0045).

	Claims 19 and 20 are non-transitory computer readable storage medium claims performing program steps, said steps being the method steps found in claims 1-3 and 7-9. As such, claims 19 and 20 are similar in scope and content to claims 1-3 and 7-9, and are therefore rejected under a similar rationale as presented against claims 1-3 and 7-9 above.  Furthermore, Venkatapathy (20140207439) teaches a storage medium (para 0119, 0045).
 
Allowable Subject Matter

Claims 4-6 and 13-15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  The particular claimed mathematical subset calculation of the n-gram list is not explicitly taught by the prior art of record.

Response to Arguments

Applicant's arguments filed 4/8/2021 have been fully considered but they are not persuasive.  Examiner notes the portion of Venkatapathy teaching LDA semantic probabilities and using unsupervised documents and related topics for comparison; i.e., the corpus of documents is based on categories not derived from the typed text (e.g., the categorization of cat/dog/animal is independent of the context of the typed input text), and the probability calculated in the LDA is independent of the typed input.  Examiner notes that the term ‘latent’ is also understood by one of ordinary skill in the art to be a ‘global’ element, and not specialized to an input.  As an example, Bellegarda (5839106) teaches semantic probabilities based on a latent/global context (in other words, the language model 190 may be a syntactic model (e.g., an n-gram model), providing a set of a priori probabilities based on a local word context, or it may be a semantic model (e.g., a latent semantic model), providing a priori probabilities based on a global word context).  Bellegarda also contemplates using both models, i.e., global-context and localized-trained-context probabilities: “According to exemplary embodiments, hybrid processing can be carried out in several different ways. One form of hybrid processing, depicted conceptually in FIG. 3(a), is carried out using a two-pass approach during the recognition process. As shown, a first single-span language model 310, based on a first type of language constraint, is used to generate a first set of likelihoods, or scores, for a group of "most likely" candidate output messages. Then, a second single-span language model 320, based on a second (different) type of language constraint, is used to process the first set of scores to produce a second set of improved, hybrid scores. In FIG. 3(a) the first and second language models 310,320 are respectively labeled "Type-A" and "Type-B" to indicate that, if the first model 310 is a syntactic model, then the second model 320 is a semantic model, and vice versa.”
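For illustration only, the two-pass hybrid processing described in the Bellegarda passage can be sketched as a first model that scores and shortlists candidates, followed by a second, different model that rescores the shortlist. This is a minimal sketch under stated assumptions: the function name, the toy scoring functions, the shortlist size, and the use of a product to combine the two scores are hypothetical, since the quoted passage does not specify how the hybrid scores are computed.

```python
def two_pass_hybrid(candidates, first_model, second_model, top_k=2):
    """Score with a first model, keep the top_k candidates, then rescore with a second.

    first_model and second_model are callables returning a score for a
    candidate; multiplying the two scores is an assumed combination rule.
    """
    first_pass = sorted(
        ((c, first_model(c)) for c in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )[:top_k]
    hybrid = [(c, score * second_model(c)) for c, score in first_pass]
    return sorted(hybrid, key=lambda pair: pair[1], reverse=True)

# Hypothetical stand-ins: a "local" (syntactic) score and a "global" (semantic) score.
local_scores = {"recognize speech": 0.5, "wreck a nice beach": 0.4, "wreck an ice beach": 0.1}
global_scores = {"recognize speech": 0.9, "wreck a nice beach": 0.2, "wreck an ice beach": 0.1}
ranked = two_pass_hybrid(list(local_scores), local_scores.get, global_scores.get)
```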

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

The following reference(s) were found toward the idea of a global grammar/document:
	Bellegarda (5839106) teaches “According to exemplary embodiments, hybrid processing can be carried out in several different ways. One form of hybrid processing, depicted conceptually in FIG. 3(a), is carried out using a two-pass approach during the recognition process. As shown, a first single-span language model 310, based on a first type of language constraint, is used to generate a first set of likelihoods, or scores, for a group of "most likely" candidate output messages. Then, a second single-span language model 320, based on a second (different) type of language constraint, is used to process the first set of scores to produce a second set of improved, hybrid scores. In FIG. 3(a) the first and second language models 310,320 are respectively labeled "Type-A" and "Type-B" to indicate that, if the first model 310 is a syntactic model, then the second model 320 is a semantic model, and vice versa. Because the resulting hybrid scores incorporate both local and global constraints, they are inherently more reliable than scores computed using either single-span model standing alone” 

	Bao (20170091164) teaches global analysis for content (para 0105, 0110).

The following references are pertinent to applicant's claims directed toward text matching using n-grams:
20150199339. 13 May 14. 16 Jul 15. SEMANTIC REFINING OF CROSS-LINGUAL INFORMATION RETRIEVAL RESULTS. MIRKIN; Shachar, et al. – para 0025, 0054

20140163951. 07 Dec 12. 12 Jun 14. HYBRID ADAPTATION OF NAMED ENTITY RECOGNITION. Nikoulina; Vassilina – para 0069

20120041753. 12 Aug 10. 16 Feb 12. TRANSLATION SYSTEM COMBINING HIERARCHICAL AND PHRASE-BASED MODELS. Dymetman; Marc – para 0021, 0022

20110320185. 23 Jun 11. 29 Dec 11. SYSTEMS AND METHODS FOR MACHINE TRANSLATION. BROSHI; Oded. – para 0023, 0024

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
06/29/2021