Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/04/2022 has been entered.
 
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
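A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.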


Claims 1 and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Barbaiani (20080300857) in view of Amarilli (20140303973).
As per claim 1, Barbaiani (20080300857) teaches a method for unsupervised text compression (as preprocessing for text retrieval – para 0004, wherein truncation, i.e., compression, occurs – para 0064), the method comprising:
receiving an input sentence having a plurality of tokens; determining a 
progressively searching for a next deletion on the 
and generating a compressed sentence following the deletion path that progressively deletes tokens from the input sentence (as, selecting the optimal sequence that removes certain underperforming candidate pairs – para 0149; and a reordering – para 0125-0126, 0127). 
	Barbaiani (20080300857) teaches, on the intermediate sentence level, calculating deletion paths and pruning (as, the removal of subsequences which are less probable – para 0022, para 0149) and on a graph (as mapped above), but does not explicitly teach the intermediate sentences along a deletion path being shorter versions of the previous intermediate sentence using a directed acyclic graph. Amarilli (20140303973) teaches generation of probability structures with a combination of edits via deletion/insertion/substitution (para 0171), assigning weighting to the shortest path (para 0170) so that the eventual selection is the shortest path (para 0152-0153, including Table US-00003), and substituting a directed acyclic graph in the search graph (para 0190). Therefore, it would have been obvious to one of ordinary skill in the art of score assignment for text compression to modify the algorithms of Barbaiani (20080300857) with the assignment scoring during editing with a directed acyclic graph, as taught by Amarilli (20140303973), because it would advantageously improve the error rate (para 0003) without being computationally expensive (para 0002).

Claim 11 is a system claim comprising processors executing code stored in memories performing the method steps of claim 1 above and as such, claim 11 is similar in scope and content to claim 1 above; therefore, claim 11 is rejected under similar rationale as presented against claim 1 above.  Furthermore, Barbaiani (20080300857) teaches a computer (processor) with memory (para 0047), performing the disclosed steps.

    
As per claim 12, the combination of Barbaiani (20080300857) in view of Amarilli (20140303973) teaches the system of claim 11, further comprising: assigning a score to each token in the plurality of tokens based on a pretrained bidirectional language model (Barbaiani (20080300857), as, the language is bidirectional on many fronts – para 0040, as well as either direction – para 0041; and on the model level, forward reverse – para 0086, 0154); comprising of 

As per claim 13, the combination of Barbaiani (20080300857) in view of Amarilli (20140303973) teaches the system of claim 12: 
designate a root node of the directed acyclic graph (Amarilli – para 0190) representing the input sentence (Barbaiani (20080300857), as, using lemmatization – which is the identification of the root form of the words – para 0064; in view of using the root form of the words, as the states – para 0101 – e1,e2,e3,e4 – as applied to fig. 8);
 designate a first lower level of nodes that are directly connected to the root node to a first set of candidate intermediate sentences (Barbaiani (20080300857), as, each level is calculated and contributed to the next layer – see fig. 4, layer of ei.1—ei.k. then to ei+1,.1, etc.), respectively, each candidate intermediate sentence from the first set being obtained after deleting the one or more tokens from the input sentence (Barbaiani (20080300857), as, the removal of subsequences which are less probable – para 0022, para 0149);
 and for each outgoing edge connecting from the root node to one of the first lower level of nodes, assigning a respective average perplexity score based on a respective candidate intermediate sentence (Barbaiani (20080300857), as, for the last state/edge connecting from the node to the next level, using a parameter pi to measure the reordering score – para 0116, and then recompiling the probability states (of the HMM), into a WFST, but has as the lining score, but outputs the next jump state for the next level to use – last 2 full sentences of para 0116). 
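
For illustration only, the graph arrangement addressed in the limitations of claim 13 (a root node for the input sentence, a first lower level of candidate intermediate sentences, and a score on each outgoing edge) may be sketched as follows. The sketch is not taken from the application or from either cited reference; the function names, the `max_span` parameter, and the caller-supplied `score` function are hypothetical, and the deletion of a single contiguous span per edge is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Sentence = Tuple[str, ...]  # a sentence is represented as a tuple of tokens


def first_level(root: Sentence, max_span: int = 1) -> List[Sentence]:
    """Candidates obtained by deleting one contiguous span of 1..max_span tokens.

    Assumption: each edge deletes a single contiguous span; duplicates that
    arise from repeated tokens are merged into one node.
    """
    candidates: List[Sentence] = []
    for span in range(1, max_span + 1):
        for start in range(0, len(root) - span + 1):
            cand = root[:start] + root[start + span:]
            if cand and cand not in candidates:
                candidates.append(cand)
    return candidates


def build_first_level_edges(
    root: Sentence,
    score: Callable[[Sentence], float],
    max_span: int = 1,
) -> Dict[Tuple[Sentence, Sentence], float]:
    """Map each outgoing edge (root, candidate) to the candidate's score.

    Every edge points from a longer sentence to a strictly shorter one,
    so repeating this step level by level can never create a cycle.
    """
    return {(root, cand): score(cand) for cand in first_level(root, max_span)}
```

Because every edge leads to a strictly shorter sentence, applying the same step to each candidate in turn yields a directed acyclic graph whose root-to-leaf paths are deletion paths.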

As per claim 14, the combination of Barbaiani (20080300857) in view of Amarilli (20140303973) teaches the system of claim 12, wherein to assign the respective average perplexity score based on the respective candidate intermediate sentence comprises:
determining a first set of tokens present in the respective candidate intermediate sentence (Barbaiani (20080300857), as, scoring the continuity, for a source sentence – para 0105, and then expanding into intermediate sentence – para 0106);
determining a second set of tokens that have been deleted from the input sentence to result in the respective candidate intermediate sentence (Barbaiani (20080300857), as, using the above-mentioned removal/deletion techniques: using lemmatization – which is the identification of the root form of the words – para 0064; and controlling the removal rate based on the pairs occurring fewer than a threshold – para 0149);
and calculating the respective average perplexity score based on a sum of logarithms of scores corresponding to the first set of tokens and scores corresponding to the second set of tokens (Barbaiani (20080300857), as summation of the log of probabilities – see fig. 12; corresponding spec – para 0108; for each transition in the next layer of states/tokens; and a complexity score as well – para 0109-0111). 
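
For illustration only, a score of the kind recited in the last limitation (based on a sum of logarithms of scores for the retained tokens and for the deleted tokens) may be sketched as follows. The exact formula is an assumption and is not taken from the application or from Barbaiani: the sketch treats each score as a probability in (0, 1], averages the log scores over both token sets, and exponentiates the negated average, which is the conventional form of perplexity. The function name `average_perplexity` is hypothetical.

```python
import math
from typing import Sequence


def average_perplexity(
    kept_scores: Sequence[float],
    deleted_scores: Sequence[float],
) -> float:
    """exp(-(sum of log scores over both token sets) / (number of tokens)).

    Assumption: every score is a probability-like value in (0, 1].
    """
    scores = list(kept_scores) + list(deleted_scores)
    if not scores:
        raise ValueError("at least one token score is required")
    if any(s <= 0.0 for s in scores):
        raise ValueError("scores must be strictly positive")
    log_sum = sum(math.log(s) for s in scores)
    return math.exp(-log_sum / len(scores))
```

Under this assumed formula, three tokens each scored 0.5 (two retained, one deleted) give a value of 2, and a lower value indicates a more probable candidate.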

Allowable Subject Matter

Claims 2-10 and 15-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  As per claims 2-10 and 15-20, the claim limitations directed to the deletion process with the acyclic graph are not explicitly taught by the prior art of record.

Response to Arguments

Applicant's arguments with respect to the claims have been considered but are moot in view of the newly cited section of the Amarilli reference.  Examiner notes that, upon further review of the Amarilli reference after the conducted interview, examiner found in para 0190 of Amarilli a teaching of the concept of using a directed acyclic graph in the discovery technique.  Examiner notes above the divergence between the prior art of record and the new claim scope, with the indication of allowable subject matter.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Zhang (20190377792) teaches graphs identifying the next word with soft edges, beam searching, softmax layer (fig. 5) with greedy searching (para 0034).

Sapoznik (20180013699) teaches graphs with nodes, and removal of other possible paths by deletion – para 0171.

Gupta (20100274770) teaches removal of subsets (i.e., deletion paths) based on scoring – para 0054, as well as greedy searching – para 0055.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).


/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
01/12/2022