DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
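Claims 1, 3, 5, 7, 9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al (CN110263350A, the English Translation of this document with corresponding paragraph numbering is attached with this office action) in view of Wang et al (CN109635305A, the English Translation of this document with corresponding paragraph numbering is attached with this office action).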

With respect to Claims 1, 5, and 9: An end-to-end model training method, comprising: [Yang (para 38 and Fig 1) has disclosed at least a computing device having processor, memory, and corresponding software for performing the disclosed method.]
obtaining training data containing a plurality of training samples, wherein the plurality of training samples comprise an original sequence, [Yang (para 41 and 48) has disclosed receiving training text as the training samples for training the translation model.]
a target sequence [Yang (para 41 and 48) has disclosed the desired text as the “target sequence” that is the corresponding desired output text given the set of training samples or “training text” input to the training process.] and
[Yang has not further disclosed the following claim limitation requiring “a corresponding tag list, the tag list comprises ... corresponding to the importance tags;”.]
a corresponding tag list, the tag list comprises importance tags in the target sequence and avoidance tags corresponding to the importance tags, and the avoidance tags are irrelevant tags corresponding to the importance tags; and [Wang (para 13, 17, 32, 38-39) has disclosed including, in the set of training samples for translation of text, a set of important/correct words with their associated correct words and a list of false words, i.e., wrong words that are interpreted as the “avoidance”/“irrelevant” tags/words of the present claim limitations. Wang refers to these target sequence datasets as the conventional corpus and the noise corpus utilized in the training/learning model (para 23, 38-39, and 41-43 of Wang).]
[Wang and Yang are analogous art of recognition processing to translate text from a first language (Wang, para 4 and 16) to a target language. It would have been obvious to alter the target sequence and comparison-based training process of Yang by further including a noise corpus in the training sample/sequence of text so as to reduce false results as disclosed by Wang (para 42-43). The motivation for combining would have been to increase the accuracy of the recognition results produced by translation as disclosed by Wang (para 9) by recognizing and identifying incorrect translation results as disclosed by Wang (para 41-43). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to at least try to combine the teachings of Yang and Wang to achieve the limitations of the presently claimed invention.]
adopting the training data to train a preset end-to-end model [Yang (para 53-56) has disclosed inputting the training data into the training model to translate said source text to said output/cypher text.] until a value of a preset optimization target function is smaller than a preset threshold, wherein
the optimization target function is determined according to the target sequence, a prediction sequence obtained after inputting the original sequence into the end-to-end model, and [Yang (para 64-65 and 68) discloses the “Evaluation Model On Quality” of the training process, i.e., the matching degree that is the analysis of the quality of the text sequence translation between the input training text and the desired text, or “target sequence” (para 64 of Yang). The cypher text of Yang is the observed output of the trainer, and the desired text is the “target sequence” corresponding to the correct translation input to the system to measure the device's quality (i.e., comparing the output produced for the input training sample to the output that is supposed to occur; para 65, 68-69, and 74 of Yang). Yang (para 97-99) discloses iterations of comparing the match between the cypher text and the expected/desired text until conditions are met such that the loss function is less than a minimum change (para 98, 99, and 110-111 of Yang). The loss function of Yang is thus minimized, and hence the loss is less than the level necessary to meet the criteria of the minimizing loss function of Yang.]
the tag list corresponding to the target sequence. [As per the combination of Wang and Yang, the “tag list” or noise corpus of Wang is incorporated in the sequence of input training data; hence, as per the teachings of Yang in view of Wang, the determination of an optimization (the minimum loss function of Yang) based on the training corpus target sample including a noise corpus has been disclosed.]
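
For illustration only, the stopping criterion recited in claim 1 (training until the value of the optimization target function is smaller than a preset threshold) can be sketched in Python as follows. The function names, the toy quadratic loss, the learning rate, and the step limit are all assumptions made for this sketch; none of them is taken from the application, Yang, or Wang.

```python
def make_toy_step(w0=0.0, target=3.0, lr=0.1):
    """Build a hypothetical training step: one gradient-descent update on
    the toy loss (w - target)**2, returning the loss after the update."""
    state = {"w": w0}

    def step():
        grad = 2.0 * (state["w"] - target)    # d/dw of (w - target)**2
        state["w"] -= lr * grad               # gradient-descent update
        return (state["w"] - target) ** 2     # loss after this update

    return step


def train_until_below(step, threshold, max_steps=10_000):
    """Call step() repeatedly until the returned loss is smaller than
    threshold. Returns (number_of_steps, final_loss)."""
    for n in range(1, max_steps + 1):
        loss = step()
        if loss < threshold:
            return n, loss
    raise RuntimeError("loss did not fall below the threshold")


steps_taken, final_loss = train_until_below(make_toy_step(), threshold=1e-3)
print(steps_taken, final_loss)
```

In an actual end-to-end model the step function would compute the optimization target function from the target sequence, the prediction sequence, and the tag list; the toy loss here only stands in for that value.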

With respect to Claims 3, 7, and 11: The method according to claim 1, wherein when
the target sequence is a character sequence, the importance tag is nouns and verbs in the character sequence. [In Yang, the word sequences can be full sentences and paragraphs and hence comprise the nouns and verbs necessary to construct said word sequences into said sentences and paragraphs. Furthermore, the set of “importance tags” is the target sequence including both the conventional corpus and the noise corpus as disclosed by Wang, wherein Yang and Yang in view of Wang disclose the analysis of natural text language inherently comprising at least common grammatical components such as nouns and verbs.]


Claims 4, 8, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al (CN110263350A, the English Translation of this document with corresponding paragraph numbering is attached with this office action) in view of Wang et al (CN109635305A, the English Translation of this document with corresponding paragraph numbering is attached with this office action) as applied to at least claim 1, in view of the teachings of Cao et al (US 2020/0167527).

With respect to Claims 4, 8, and 12: The method according to claim 1, wherein
the avoidance tags corresponding to the importance tags are determined by performing acts of: [Wang has disclosed the determination of importance tags defining correct and wrong word alignment based on a direct matching process (para 68-75), but has not detailed a vector model or correlation degrees for defining the correct and wrong tags as required by the claim limitations below. Furthermore, the identification and tagging to create a corpus of both positive (correct) and negative (wrong/false) samples has thus been disclosed and utilized in the training process of Wang and of Yang in view of Wang.]
inputting each importance tag into a preset tag vector model to obtain correlation degrees between the importance tag and a plurality of preset tags; and [Cao (para 0003-0010) has disclosed the training and recognition of words based on a corpus containing both positive sample/training words and negative sample/training words, said words (the “tags” of the present claim limitations) being represented as vectors in the natural language processing technique of Cao (para 0006). Cao further discloses the identification of word samples for training based on correlation, including an identification of negative training samples (corresponding to the noise corpus of Wang and the avoidance tags of the present claim limitations) when a correlation is below a threshold value (para 0045-0047).]
selecting a tag from tags whose correlation degrees are smaller than a correlation degree threshold, and [Cao (para 0045-0047) has disclosed identifying negative training samples when a correlation is below a threshold value.]
determining the tag as the avoidance tag corresponding to the importance tag. [Cao further discloses the identification of word samples for training based on correlation, including an identification of negative training samples (corresponding to the noise corpus of Wang and the avoidance tags of the present claim limitations) when a correlation is below a threshold value (para 0045-0047).]
[Cao and Yang in view of Wang are analogous art of natural language processing to train recognition models based on a provided training corpus, wherein said corpus includes the use of negative/wrong/false samples. It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the development of the training model corpus’s set of training samples as disclosed by Yang in view of Wang by utilizing the vector model and correlation method of identifying training samples, including correct and negative sample words/tags, as disclosed by the process of Cao. The motivation for combining would have been to resolve the technical problem of large-scale word training efficiency as disclosed by Cao (para 0006-0007). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to at least try to combine the teachings of Cao with Wang and Yang to achieve the limitations of the presently claimed invention.]
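
For illustration only, the three acts recited in claim 4 (obtaining correlation degrees from a tag vector model, selecting a tag whose correlation degree is smaller than a threshold, and determining it as the avoidance tag) can be sketched in Python as follows. Cosine similarity is assumed here as the correlation degree, and the tag names, two-dimensional vectors, threshold, and function names are hypothetical; none of them is taken from the application, Wang, or Cao.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_avoidance_tag(importance_tag, tag_vectors, threshold, rng=None):
    """Return one preset tag whose correlation degree with importance_tag
    is smaller than threshold, or None if no preset tag qualifies."""
    rng = rng or random.Random(0)
    query = tag_vectors[importance_tag]
    candidates = [
        tag
        for tag, vec in tag_vectors.items()
        if tag != importance_tag and cosine(query, vec) < threshold
    ]
    return rng.choice(candidates) if candidates else None


# Hypothetical tag vector model: each preset tag maps to a toy 2-D vector.
toy_vectors = {
    "dog": [1.0, 0.0],
    "puppy": [0.9, 0.1],
    "invoice": [0.0, 1.0],
    "tax": [0.1, 0.9],
}
print(pick_avoidance_tag("dog", toy_vectors, threshold=0.3))
```

With these toy vectors, "puppy" is highly correlated with "dog" and is excluded, while "invoice" and "tax" fall below the threshold and are eligible as avoidance tags.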

Allowable Subject Matter
Claims 2, 6, and 10 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
The details of the optimization function provided by Claims 2, 6, and 10, requiring at least three cross entropy determinations based on pairs of tag data and the weighted addition of the resulting cross entropies, have not been disclosed by the known prior art, the cited prior art, or reasonable combinations thereof.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Audhkhasi et al (US 2016/0019459) has disclosed a process of Neural Network based learning, wherein cross entropy is utilized in the minimization process of determining weights of the matrices based on the training data sets (para 0005 and 0007-0011). However, the set of 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NATHAN J BLOOM whose telephone number is (571)272-9321.  The examiner can normally be reached on 9:30AM - 6:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kim Vu, can be reached on (571) 272-3859. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/NATHAN J BLOOM/Examiner, Art Unit 2666




/KIM Y VU/Supervisory Patent Examiner, Art Unit 2666