DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the
first inventor to file provisions of the AIA.

Response to Amendment
The amendment filed on April 12th, 2022 has been entered. Claims 1-20 remain pending. Applicant’s arguments with respect to the specification have overcome the objections and the 35 U.S.C. 101 rejection previously set forth in the Non-Final Office Action mailed January 27th, 2021.

Response to Arguments
Applicant's arguments filed April 7th, 2022 have been fully considered but they are
not persuasive. Applicant’s arguments with respect to claims 1-20 have been considered but
are moot because the new grounds of rejection were necessitated by the amendments, which
changed the scope of the claims.
	Applicant submits in the fourth paragraph of pg. 10 that “training a neural network model with difference samples could not be performed merely in the mind.” Furthermore, applicant asserts that these features of training the neural network model with the tasks improve the functionality of the computer and therefore integrate the abstract idea into a practical application.
	The argument is persuasive, although the examiner does not entirely agree with its reasoning. The features reflect an improvement to the technology rather than to the functionality of the computer, as no new hardware is introduced, only a process for training the neural network. Paragraphs reflecting how the features improve the technology are paras. 5-6, 22, 44, and 74; therefore, the argument is persuasive and the amendment has overcome the 35 U.S.C. 101 rejection previously set forth in the Non-Final Office Action mailed January 27th, 2021.

	Applicant submits in the first paragraph of pg. 12 that “if as indicated by the examiner, V2 in Zhang is compared with the first training samples in claim 1, V1 is compared with the second training sample in claim 1, then V2 could not be seen as anticipating the first training samples which comprise the text data and a replacement text with the word or phrase replaced with a substitute word or phrase. It is obvious that V2 in Zhang does not include simultaneously the first input text with the rare word per se and the second input text with the rare word replaced with the preset character. Assuming that V1+V2 in Zhang is compared with the first training samples, then Zhang fails to teach or suggest the second training sample as recited in claim 1… Patra also fails to teach or suggest the above limitations either.”
	The argument has been fully considered in view of the references of Zhang and Patra. However, the amendment changes the scope of the claims: the first training samples are now recited as comprising the text data and a replacement text with the word or phrase replaced with a substitute word or phrase.




Claim Objections
Claims 1, 9, and 17 are objected to because of the following informalities:
Independent claims 1, 9, and 17 recite “constructing a first training samples…” (see, as an example, lines 4-5 of claim 1). The article “a” conflicts with the plural “samples”; deleting it corrects the grammatical error. Appropriate correction is required.

Independent claims 1, 9, and 17 recite “wherein the first training sample comprise the text data and a replacement….” This limitation should begin on a new line after the “;” that ends the previous limitation, e.g. in independent claim 1,
	“according to the word or phrase…”;
	“wherein the first training samples…”; and
so as to match the formatting of the limitations that follow. The same applies to independent claims 9 and 17. Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under
35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the
claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (CN 109766523) in view of Lin et al. (US 2021/0027018 A1), hereinafter Lin, and further in view of Patra et al. (US Pub No. 2020/0342055 A1), hereinafter Patra.
Regarding claim 1, Zhang teaches a method for creating a label marking model (Para. 10 pg. 3 under description heading, the present invention provides a part-of-speech tagging method comprising a convolutional neural network (CNN) model, a bidirectional gated recurrent unit (BGRU) model, a bidirectional long short-term memory (BLSTM) network model, and a conditional random field (CRF) model), comprising:
obtaining text data and determining a word or phrase to be marked in the text data (Step S301 on para. 18 pg. 7, training corpus of sample data conversion i.e. obtains text data
through conversion; Step S302 on para. 19 pg. 7, detecting whether the first input text includes a rare word as to tag as the purpose of the document);
according to the word or phrase to be marked, constructing a first training samples of the text data corresponding to a word or phrase replacing task and a second training sample corresponding to a label marking task (Para. 19 – 20 pg. 7, S302, rare word replaced by preset
characters i.e. V2 second input text from S303; S303 V1 is first input text i.e. label marking); 
	However, Zhang fails to explicitly disclose:
	Implemented by a computer; 
wherein the first training samples comprise the text data and a replacement text with the word or phrase replaced with a substitute word or phrase; and
training a neural network model with a plurality of the first training samples and a plurality of the second training samples, respectively, until a loss function of the word or phrase replacing task and a loss function of the label marking task satisfy a preset condition, to obtain the label marking model.
In a related field of endeavor (e.g. natural language processing used for text content classification, see para. 4), Lin discloses that training samples can be determined based on a plurality of related text contents related to a current scenario. Each related text content can have a plurality of keywords as sample features, and the related text content can be used as a corresponding sample label, see para. 37. As an example, para. 39 discusses that the scenario-related words can be various scenario-related words. For example, in a marketing scenario, words such as “== supermarket”, “New Year goods”, “personal hygiene”, “clothing”, “red packet”, “offline payment”, and “coupon” can be selected as scenario-related words. The scenario-related words can be manually determined, can be captured from a predetermined website based on a related category, or can be extracted from description information of a related scenario. Furthermore, word expansion can be performed on a few initially determined scenario-related words, to automatically recall more related words by using the existing words. The word expansion can be performed by using a synonym, a near-synonym, an associated word, a word with similar semantics, etc. The synonym and the near-synonym can be determined based on a record in a dictionary. The scenario-related words are then matched with the text content in the text content library, see paras. 40-41.
Modifying Zhang’s neural network model training, which includes both samples, to include the features of Lin discloses:
	wherein the first training samples comprise the text data and a replacement text with the word or phrase replaced with a substitute word or phrase (e.g. the training of the neural network model with a plurality of first and second training samples, respectively, as taught by Zhang, now modified by Lin so that the first training samples comprise the text data and a replacement text with the word or phrase replaced with a substitute word or phrase as taught by Lin, see paras. 37, 39-40);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Lin to the method of Zhang. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the two disclosures, for example both using natural language processing techniques for text content classification. Further, doing so would have provided the users of Zhang with the added benefit of improving the accuracy of a trained model in the process of selecting replacement text content related to the text content, as the scenario-related words are matched with a text content, as recognized by Lin, see para. 38. Overall, manual participation is decreased, the manual planning level constitutes no limitation, and diversified candidate text contents can be generated, thereby improving the efficiency and effectiveness of generating the recommendation information, as recognized by Lin, see para. 21.
Zhang teaches a method of training a neural network model with a plurality of the first
training samples and a plurality of the second training samples (Para. 25 pg. 7, updating of the
models used for inputting V1 and V2, i.e. the plurality of second training samples and first training
samples, respectively, as it updates the models according to error). Moreover, Zhang describes a
loss function according to each parameter of a step interval estimation and two-step interval
estimation, where each iteration's learning step has a determined range and the value
of the parameter is relatively stable. Preferably, Zhang carries out a learning rate index
(i.e. exponential) attenuation every 3000 steps, the attenuation base is 0.1, and the remaining
parameters keep their defaults (Para. 25 - 29 pg. 7), to obtain a labeling model, hence the title. However,
Zhang is silent as to whether the 3000-step interval and the attenuation base are a preset condition to be satisfied.
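For illustration only, the schedule attributed to Zhang above (attenuation every 3000 steps with a base of 0.1) can be sketched as a staircase decay. The function name and the initial learning rate of 0.001 are assumptions made for this sketch and do not appear in the record. As read here, the schedule depends only on the step count and does not itself test a loss value.

```python
def decayed_learning_rate(step, initial_lr=0.001, decay_every=3000, decay_base=0.1):
    """Staircase exponential decay of a learning rate.

    initial_lr is a hypothetical value; decay_every=3000 and decay_base=0.1
    follow the figures quoted from Zhang in the paragraph above.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    # The rate is multiplied by decay_base once for every completed interval.
    return initial_lr * decay_base ** (step // decay_every)
```

Calling `decayed_learning_rate(2999)` returns the initial rate unchanged, while `decayed_learning_rate(3000)` returns a rate one tenth as large.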
In a related field of endeavor (e.g. technology useful for building NLP applications to implement text analysis, see para. 2), Patra discloses a method and computing devices, programs, and other computing elements in which the methods may be implemented, see para. 63, where the named entity disambiguation techniques result in improved classification accuracy of machine learning classification models, see para. 25, such as through neural networks, further described in para. 100. Furthermore, the generic hardware is described in para. 75, in which a computer system process comprises an allotment of memory (physical and/or virtual), the allotment of memory being for storing instructions executed by the hardware processor and for storing data generated by the hardware processor executing the instructions. Patra discloses a method where the “training data comprises multiple inputs, each being referred to as sample in a set of samples. Each sample includes a value for each input neuron. A sample may be stored as a vector of input values”, i.e. where the training samples each represent a task (see para. 114). Furthermore, the neural network includes a loss function, where the arithmetic or geometric difference between correct and actual outputs may be measured as error according to a loss function, such that zero represents error free (i.e. completely accurate) behavior (see para. 121), and training may cease when the error stabilizes (i.e. ceases to reduce) or vanishes beneath a threshold (i.e. approaches zero) (see para. 122).
Modifying Zhang’s neural network model training, which includes both samples and the features of Lin, to further include the features of Patra discloses:
	implemented by a computer (e.g. the training of the neural network model with a plurality of first and second training samples, respectively, as taught by Zhang, including the first training sample features as taught by Lin, now modified by Patra so that the method is implemented by a computer as taught by Patra, see paras. 75 and 100);
training a neural network model with a plurality of the first training samples and a plurality of the second training samples, respectively, until a loss function of the word or phrase replacing task and a loss function of the label marking task satisfy a preset condition, to obtain the label marking model (e.g. the training of the neural network model with a plurality of first and second training samples, respectively, as taught by Zhang, including the first training sample features as taught by Lin, now modified by Patra so that training continues until a loss function of each of the training samples representing the tasks satisfies a preset condition, i.e. falls beneath a threshold, to obtain the label marking model, see paras. 121 and 122).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Patra to the method of Zhang in view of Lin. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the disclosures, for example each using natural language processing techniques for textual analysis. Further, doing so would have provided the users of Zhang with the added benefit of the model knowing when to cease training, because the result of a loss function is compared to a threshold, i.e. a measure of accurate behavior (see paras. 121 - 122), where the model training may be supervised or unsupervised (see para. 123). Doing so would also have provided the users of Zhang with generic hardware to execute instructions set forth by a method, as recognized by Patra, see paras. 65 and 75-78.
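
As a non-authoritative illustration of the stopping criterion relied upon above (training ceases once the error falls beneath a threshold, per the cited paras. 121 - 122 of Patra), the following minimal sketch alternates one update per task until the losses of both tasks satisfy a preset threshold. The names ToyTask and train_two_tasks, the quadratic loss, the learning rate, and the threshold value are hypothetical stand-ins chosen for this sketch; none of them is taken from Zhang, Lin, Patra, or the claims.

```python
class ToyTask:
    """Hypothetical stand-in for one training task.

    Minimizes the loss (w - target)^2 by plain gradient descent.
    """

    def __init__(self, target, lr=0.1):
        self.w = 0.0
        self.target = target
        self.lr = lr

    def step(self):
        # Gradient of (w - target)^2 with respect to w.
        grad = 2.0 * (self.w - self.target)
        self.w -= self.lr * grad
        # Return the loss measured after the update.
        return (self.w - self.target) ** 2


def train_two_tasks(replace_task, label_task, threshold=1e-4, max_steps=10000):
    """Alternate updates until BOTH task losses fall below the threshold."""
    for step in range(1, max_steps + 1):
        replace_loss = replace_task.step()
        label_loss = label_task.step()
        if replace_loss < threshold and label_loss < threshold:
            return step, replace_loss, label_loss
    raise RuntimeError("preset condition not met within max_steps")
```

The loop returns only when each loss separately satisfies the condition; a single combined loss would be a different design and is not what is sketched here.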

Regarding claim 2, in addition to the elements stated above regarding claim 1, in the
combination (Zhang in view of Lin and further in view of Patra), Zhang further discloses:
obtaining part-of-speeches of words or phrases in the text data after performing word segmentation on the text data (Para. 1 – 2 pg. 9, principles can be associated with each other at the reference; Para. 6 pg. 7, participle to text to be separated form first input text i.e. performing word segmentation on text data where para. 8 pg. 3 under description points out that the rare word is limited to the noun part-of-speech hence part of speeches are obtained after participle to text to be separated form first input text and before S302 of rare word detection); and
taking a word or phrase whose part-of-speech belongs to a preset part-of-speech as the word or phrase to be marked (Para. 19 pg. 7, in S302 the detected rare word, due to its noun part of
speech, is taken to be labeled in the model).

Regarding claim 3, in addition to the elements stated above regarding claim 1, the
combination (Zhang in view of Lin and further in view of Patra) as reasoned above in the rejection of claim 1 does not
make obvious:
obtaining the substitute word or phrase corresponding to the word or phrase to be marked;
after replacing the word or phrase to be marked in the text data with the substitute word or phrase, taking a class of the substitute word or phrase as a replacement class marking result of the replacement text; and
taking the replacement text and the replacement class marking result corresponding to the replacement text as the first training sample.
Zhang discloses, according to the word or phrase to be marked, constructing a first
training sample of the text data corresponding to a word or phrase replacing task (Para. 19 – 20
pg. 7, S302, rare word replaced by preset characters, i.e. V2, the second input text from S303). While
Zhang discloses constructing a first training sample corresponding to a word or phrase replacing task
according to the word or phrase to be marked, it is silent as to whether the preset characters are
substitute words or phrases corresponding to the word or phrase to be marked, as to taking the
class of the substitute word or phrase as a replacement class marking result, and as to taking
the replacement text and the class marking result corresponding to the replacement text as the
first training sample.
In a related field of endeavor (e.g. technology useful for building NLP applications to
implement text analysis, see para. 2), Patra additionally discloses a method where the “training
data comprises multiple inputs, each being referred to as sample in a set of samples. Each
sample includes a value for each input neuron. A sample may be stored as a vector of input
values”, i.e. where the training samples each represent a task (see para. 114).
Furthermore, Patra teaches that words of interest may be named entities, such as names of
persons, locations, companies, and the like (see para. 2), i.e. nouns as words or phrases to be
marked, for example. Para. 44 indicates that the candidate finder component gathers “NY,”
“NYC,” “Big Apple,” “The City,” among others for entity linking from the redirect’s subgraph
(where an example of a redirect may be Wikipedia). Para. 45 indicates that the different
appearances of the entity “New York” are different surface forms, and their categories, i.e.
replacement classes, are taken as nodes or vertices, as explained further in para. 46 with Paris being a
French or American city. Para. 44 indicates that a sample may be stored as a vector of input
values, i.e. where the training samples represent replacement text and replacement class
marking results, see Figure 4A.
Modifying Zhang’s neural network model training, which includes both samples and the features of Lin, to further include the features of Patra discloses:
obtaining a substitute word or phrase corresponding to the word or phrase to be
marked (e.g. the rare word detection limited to nouns and replaced with preset characters as taught by Zhang, including the features of Lin, now modified to use substitute words or phrases as taught by Patra, see paras. 44-46);
after replacing the word or phrase to be marked in the text data with the substitute
word or phrase, taking a class of the substitute word or phrase as a replacement class marking
result of a replacement text (e.g. the rare word detection limited to nouns and replaced with
preset characters as taught by Zhang, including the features of Lin, now modified so that the word is replaced with the substitute word or phrase and the class of the substitute word or phrase is taken as the marking result as taught by
Patra, see para. 44); and
taking the replacement text and the replacement class marking result corresponding to
the replacement text as the first training sample (e.g. the rare word detection limited to nouns
and replaced with preset characters as taught by Zhang, represented as the first sample, i.e. V2, including the features of Lin, now modified to use the replacement text and the corresponding replacement class marking result as taught by Patra, see paras. 44-46).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Patra to the method of Zhang in view of Lin. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the disclosures, for example each using natural language processing techniques for textual analysis. Further, doing so would have provided the users of Zhang with the added benefits of the disclosed named entity disambiguation (NED): reducing training overhead by not requiring a
large text corpus for effective training, providing flexibility by facilitating application to
different domains, and helping to provide better quality of answer, performance, and accuracy
even without large text mining, see Patra para. 16.
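
To make the claim 3 limitations concrete, the following minimal sketch builds one first training sample as a pair of replacement text and replacement class marking result. It is an assumption-laden illustration, not the applicant's or any reference's implementation: the function name build_replacement_sample is hypothetical, only the first occurrence of the marked word is replaced, and the class of the substitute word is supplied by the caller rather than looked up.

```python
def build_replacement_sample(text, marked, substitute, substitute_class):
    """Return (replacement_text, replacement_class) for one training sample.

    All names here are hypothetical. The marked word or phrase is replaced
    with the substitute, and the substitute's class serves as the label.
    """
    if marked not in text:
        raise ValueError("marked word or phrase does not occur in the text")
    # Replace only the first occurrence of the marked word or phrase.
    replacement_text = text.replace(marked, substitute, 1)
    return replacement_text, substitute_class
```

For example, marking “NY” in a sentence and substituting “Paris” with the class “city” yields the sentence with “Paris” in place of “NY”, paired with the label “city”.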

Regarding claim 4, in addition to the elements stated above regarding claim 1, in the
combination (Zhang in view of Lin and further in view of Patra), Zhang further discloses:
obtaining a label word or phrase associated with the word or phrase to be marked (Para. 19 – 20 pg. 7, S302, Label word is itself as second input text is enabled to equal to the
first input text as it obtains the first input text V1), and taking the label word or phrase as a label marking result of the word or phrase to be marked (Para. 19 – 20 pg. 7, S302, rare word replaced by preset characters, if it is not, then the second input text is enabled to be equal to the first input text; S303 V1 is first input text i.e. label marking);
and taking the text data, the word or phrase to be marked and the label marking result corresponding to the word or phrase to be marked as the second training sample (Para. 19 – 20 pg. 7, S302, rare word replaced by preset characters, if it is not, then the second input text is enabled to be equal to the first input text; S303 V1 is first input text i.e. label marking i.e. second training sample as it differs from V2 which is the replacement).

Regarding claim 5, in addition to the elements stated above regarding claim 3, in the
combination (Zhang in view of Lin and further in view of Patra), Patra further discloses:
determining identification information of the word or phrase to be marked in a preset knowledge base (Patra teaches in para. 44 that an entity for “New York City” might appear also as “NY,” “NYC,” “Big Apple,” “The City,” among others. To help address this issue, according to an embodiment, the candidate finder component 112 traverses through the redirect’s subgraph, which was built during the graph learning phase 102, where the graph learning phase 102 uses the graph building component 106, which is configured to convert a knowledge base into a knowledge graph, see para. 22; e.g. the rare detected words to be marked as taught by Zhang, including the features of Lin, now modified by identifying the word in a preset knowledge base as taught by Patra); and
obtaining the substitute word or phrase in the preset knowledge base corresponding to the identification information (Para. 45, by building redirects graphs with DBpedia redirect links, such as redirects graph 400 of FIG. 4A, the candidate finder component 112 may match a surface form of a mention, such as an “NY” surface form 402, to a vertex 404 corresponding to the entity “NY” in the redirects graph, and then follow a redirect link or edge 406 from the vertex “NY” to a vertex 408 corresponding to the entity “New York.”; e.g. the rare detected words to be marked and replaced by preset characters as taught by Zhang, including the features of Lin, now modified to be replaced with substitute words in the preset knowledge base corresponding to the identification information as taught by Patra).
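
The redirect lookup described for claim 5 (matching a surface form such as “NY” to a vertex and following a redirect edge to “New York”) can be sketched as a walk over a small mapping. This is a simplified illustration under stated assumptions: the REDIRECTS dictionary, the function name resolve_surface_form, and the hop limit are hypothetical, and a real knowledge graph such as the one Patra describes would be far richer than a flat dictionary.

```python
# Hypothetical redirect edges, echoing the "NY" -> "New York" example above.
REDIRECTS = {"NY": "New York"}


def resolve_surface_form(surface, redirects, max_hops=10):
    """Follow redirect edges from a surface form to its target entity."""
    current = surface
    seen = {current}
    for _ in range(max_hops):
        target = redirects.get(current)
        if target is None:
            # No outgoing redirect edge: current is the resolved entity.
            return current
        if target in seen:
            raise ValueError("redirect cycle detected")
        seen.add(target)
        current = target
    raise ValueError("too many redirect hops")
```

A surface form with no redirect edge resolves to itself, so the function can be applied uniformly to both aliases and canonical entity names.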

Regarding claim 6, in addition to the elements stated above regarding claim 3, in the
combination (Zhang in view of Lin and further in view of Patra), Zhang further discloses:
taking the replacement text as input, and taking the replacement class marking result corresponding to the replacement text as output, so that the neural network model is able to, according to the input replacement text, output a probability that the input replacement text belongs to a replacement class (Para. 22 and 27 pg. 7. Input text into BGRU, a type of neural
network i.e. Bidirectional Gated Recurrent Unit NN, of word or phrase to be marked as input V2 and taking the label marking result corresponding to V2 and V2’ as output, so it then outputs a probability through Adam (adaptive moment estimation) in probability theory square that variable X obeys some distribution according to loss function to the gradient of each parameter and dynamic adjustments are directed to the learning rate of each parameter i.e. probability of input correlation with output V2 and V2’ respectively).

Regarding claim 7, in addition to the elements stated above regarding claim 4, in the
combination (Zhang in view of Lin and further in view of Patra), Zhang further discloses:
taking the text data and the word or phrase to be marked as input, and taking the label marking result corresponding to the word or phrase to be marked as output, so that the neural network model is able to, according to the input text data and the word or phrase to be marked, output a probability that the label word or phrase belong to the label marking result of the word or phrase to be marked (Para. 21 and 27 pg. 7. Input text into CNN of word or phrase
to be marked as input V1 and taking the label marking result corresponding to V1 and V1’ as
output, so it then outputs a probability through Adam (adaptive moment estimation) in
probability theory square that variable X obeys some distribution according to loss function to the gradient of each parameter and dynamic adjustments are directed to the learning rate of each parameter i.e. probability of input correlation with output V1 and V1’ respectively).

Regarding claim 8, in addition to the elements stated above regarding claim 1, the
combination (Zhang in view of Lin and further in view of Patra) as reasoned above in the rejection of claim 1 does not make obvious:
dividing the word or phrase replacing task into a label word or phrase replacing subtask and an appositive word or phrase replacing subtask; and
completing the training with the word or phrase replacing task based on the training samples in the plurality of the first training samples corresponding to the two subtasks.
Zhang discloses, according to the word or phrase to be marked, constructing a first
training sample of the text data corresponding to a word or phrase replacing task (Para. 19 – 20
pg. 7, S302, rare word replaced by preset characters, i.e. V2, the second input text from S303). While
Zhang discloses constructing a first training sample, it is silent as to whether the word or phrase
replacing task is divided into a label word or phrase replacing subtask and an appositive word or
phrase replacing subtask, and as to completing the training with the word or phrase replacing task
containing those two subtasks.
In a related field of endeavor (e.g. technology useful for building NLP applications to
implement text analysis, see para. 2), Patra additionally discloses a method where the “training
data comprises multiple inputs, each being referred to as sample in a set of samples. Each
sample includes a value for each input neuron. A sample may be stored as a vector of input
values”, i.e. where the training samples each represent a task (see para. 114).
Furthermore, Patra discloses pairwise similarities, as depicted in figure 5A, in which Indiana
Pacers are connected through dots with Miami Heat, i.e. relational words fitting a generic
concept of basketball teams, where each of the mentions is associated with a set of one or
more candidate entities, i.e. the examples of figures 4A and 4B with the recognized entities of figure
5A. First, the mentions of entities are labeled, and second, pairwise similarities may contain
one or more candidate entities, see para. 54.
Modifying Zhang’s neural network model training, which includes both samples and the features of Lin, to further include the features of Patra discloses:
dividing the word or phrase replacing task into a label word or phrase replacing subtask
and an appositive word or phrase replacing subtask (e.g. the word or phrase replacing task as taught by Zhang and represented by V2, including the features of Lin, now modified into a label word or phrase replacing subtask and an appositive word or phrase replacing subtask as taught by Patra, see para. 54); and
completing the training with the word or phrase replacing task based on the training
samples in the plurality of the first training samples corresponding to the two subtasks (e.g. the
neural network model trained and updated until the error meets the condition, hence completion, including the features of Lin, now modified to use the first training samples including the label word or phrase replacing subtask and the appositive word or phrase replacing subtask as taught by Patra, see para. 54).
It would have been obvious to one of ordinary skill in the art before the effective filing date
of the claimed invention to apply the teachings of Patra to the method of Zhang in view of Lin.
Doing so would have been predictable to one of ordinary skill in the art given the similar nature
of the disclosures, for example each using natural language processing techniques for textual analysis.
Further, doing so would have provided the users of Zhang with the added benefits of the
disclosed named entity disambiguation (NED): reducing training overhead by not requiring a
large text corpus for effective training, providing flexibility by facilitating application to
different domains, and helping to provide better quality of answer, performance, and accuracy
even without large text mining, see Patra para. 16.

Regarding claim 9, the claim is directed to a system corresponding to the method presented in claim 1 and is rejected on the same grounds stated above regarding claim 1; however, the combination as applied in the rejection of claim 1 does not expressly address:
at least one processor; and
a storage communicatively connected with the at least one processor; wherein, the storage stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform a method for creating a label marking model, wherein the method comprises:
Zhang teaches a method for creating a label marking model: the present invention
provides a part-of-speech tagging method comprising a convolutional neural network (CNN) model, a
bidirectional gated recurrent unit (BGRU) model, a bidirectional long short-term memory (BLSTM)
network model, and a conditional random field (CRF) model, see Para. 10 pg. 3 under description
heading. The method involves, according to a word or phrase to be marked, constructing a first training sample
of the text data corresponding to a word or phrase replacing task and a second training sample
corresponding to a label marking task, see Para. 19 – 20 pg. 7, S302, rare word replaced by
preset characters, i.e. V2, the second input text from S303; S303, V1 is the first input text, i.e. label
marking. While Zhang discloses a labeling system, it is silent on the generic hardware
components to perform the method.
	In a related field of endeavor (e.g. technology useful for building NLP applications to
implement text analysis, see para. 2), Patra additionally discloses a method and computing
devices, programs, and other computing elements in which the methods may be implemented,
see para. 63, where the named entity disambiguation techniques result in improved
classification accuracy of machine learning classification models, see para. 25, such as through neural
networks, further described in para. 100. Furthermore, the generic hardware is described in
para. 75, in which a computer system process comprises an allotment of memory (physical and/or
virtual), the allotment of memory being for storing instructions executed by the hardware
processor, for storing data generated by the hardware processor executing the instructions,
and/or for storing the hardware processor state (e.g. content of registers) between allotments
of the hardware processor time when the computer system process is not running. In the
example of Fig. 8, a computer system includes a general purpose
microprocessor, where such instructions are stored in non-transitory storage media accessible
to the processor, see paras. 77 – 78.
	Zhang’s labeling system of creating and training a neural network model with both samples, including the features of Lin in the combination set forth for claim 1, and further modified to include the features of Patra, discloses:
at least one processor (e.g. the labeling system of neural network model training as taught by Zhang now modified to include a generic processor as taught by Patra, see para. 75); and
a storage communicatively connected with the at least one processor (e.g. the labeling system of creating a neural network model training as taught by Zhang, including the features of Lin in the combination set forth for claim 1, now modified to include a generic processor that executes instructions stored in a memory, i.e. storage, as taught by Patra, so that the storage is communicatively connected with the processor, see para. 75);
wherein the storage stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform a method for creating a label marking model (e.g. the labeling system of creating a neural network model training as taught by Zhang, including the features of Lin in the combination set forth for claim 1, now modified to include a storage, i.e. memory, for storing instructions executed by the generic hardware of a processor as taught by Patra, the method being that demonstrated in rejected claim 1, see para. 75).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Patra to the method of Zhang. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the two disclosures, for example both using natural language processing techniques for textual analysis. Further, doing so would have provided the users of Zhang with generic hardware to execute the instructions set forth by a method, as recognized by Patra, see para. 65 and 75-78.

Regarding claim 10, the claim is directed to a system corresponding to the method of claim 2 and is rejected on the same grounds stated above regarding claim 2, in addition to the combination set forth in rejected claim 9.

Regarding claim 11, the claim is directed to a system corresponding to the method of claim 3 and is rejected on the same grounds stated above regarding claim 3, in addition to the combination set forth in rejected claim 9.

Regarding claim 12, the claim is directed to a system corresponding to the method of claim 4 and is rejected on the same grounds stated above regarding claim 4, in addition to the combination set forth in rejected claim 9.

Regarding claim 13, the claim is directed to a system corresponding to the method of claim 5 and is rejected on the same grounds stated above regarding claim 5.

Regarding claim 14, the claim is directed to a system corresponding to the method of claim 6 and is rejected on the same grounds stated above regarding claim 6.

Regarding claim 15, the claim is directed to a system corresponding to the method of claim 7 and is rejected on the same grounds stated above regarding claim 7.

Regarding claim 16, the claim is directed to a system corresponding to the method of claim 8 and is rejected on the same grounds stated above regarding claim 8, in addition to the combination set forth in rejected claim 9.


Regarding claim 17, the claim is directed to a non-transitory computer-readable storage medium corresponding to the system of claim 9 and is rejected on the same grounds stated above regarding claim 9.

Regarding claim 18, the claim is directed to a non-transitory computer-readable storage medium corresponding to the system of claim 11 and is rejected on the same grounds stated above regarding claim 11.

Regarding claim 19, the claim is directed to a non-transitory computer-readable storage medium corresponding to the system of claim 12 and is rejected on the same grounds stated above regarding claim 12.

Regarding claim 20, the claim is directed to a non-transitory computer-readable storage medium corresponding to the system of claim 14 and is rejected on the same grounds stated above regarding claim 14.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s
disclosure.
MacAvaney et al. (US Pub. No. 2021/0027141 A1) teaches methods, non-transitory computer readable media, and systems that can classify term sequences within a source text based on textual features analyzed by both an implicit-class-recognition model and an explicit-class-recognition model. For example, by applying machine-learning models for both implicit and explicit class recognition, the disclosed systems can determine a class corresponding to a particular term sequence within a source text and identify the particular term sequence reflecting the class. The dual-model architecture can equip the disclosed systems to apply (i) the implicit-class-recognition model to recognize implicit references to a class in source texts and (ii) the explicit-class-recognition model to recognize explicit references to the same class in source texts, see abstract. Specifically, para. 30 indicates that, in addition to improving classification accuracy, in certain implementations, the class recognition system identifies a term sequence from a source text reflecting a class with more flexibility than existing text classification systems. Rather than exclusively or rigidly identifying term sequences corresponding to a class precisely matching labels from training samples, the class recognition system can train or apply a class-recognition-machine-learning model that identifies a class expressed in various different terms or term sequences that existing text classification systems currently fail to recognize. In some embodiments, for instance, the class recognition system can apply or train such a machine-learning model to recognize various labels for a class to correctly identify different expressions referring to the class throughout source texts.

Neumann (US Pub. No. 2021/0342212 A1) teaches a system for identifying root causes, the system including a computing device designed and configured to receive a user input from a user client device, extract at least a symptom datum from the user input, wherein extracting the at least a symptom datum includes being configured to generate at least a query using the user input and generate the at least a symptom datum as a function of the at least a query, train a machine learning process with an expert input training set from an expert knowledge database, wherein the expert input training set further includes prognostic data correlated to causal link data, assign weights to the correlated data as a function of the at least a symptom datum, identify root causes as a function of the assigned weights, and display the root causes to the user. Specifically, para. 38 teaches training data and linking categorical data between data elements.

Semenov (US Pub. No. 2021/0150338 A1) teaches mechanisms for identification of fields in documents using neural networks. A method of the disclosure includes obtaining a layout of a document, the document having a plurality of fields, identifying the document, based on the layout, as belonging to a first type of documents of a plurality of identified types of documents, identifying a plurality of symbol sequences of the document, and processing, by a processing device, the plurality of symbol sequences of the document using a first neural network associated with the first type of documents to determine an association of a first field of the plurality of fields with a first symbol sequence of the plurality of symbol sequences of the document, see abstract. Specifically, para. 46 indicates training data with repositories where there are mark-ups in the text data, for example the text field “John Smith” as the input used to train the machine learning model(s), where it corresponds to the text field “name”, i.e. replacement text corresponding to the text data.

Zheng (WO 2018028077 A1) teaches a deep learning based method and device for Chinese semantics analysis, relating to the technical field of natural language processing. The method comprises: a mobile terminal acquiring, by means of performing standardization processing of an acquired Chinese text, a standard Chinese text (S101); the mobile terminal performing word recognition of a specified type of words and/or custom word recognition and/or Chinese name recognition of the standard Chinese text, and taking the recognition results as constraint conditions (S102); the mobile terminal obtaining, according to the constraint conditions and by means of deep learning, Chinese text segmentation and part-of-speech tagging models to perform Chinese text segmentation and part-of-speech analysis on the standardized Chinese text, so as to obtain segmented texts and parts of speech of the standard Chinese text (S103); the terminal using the segmented texts, parts of speech, and/or recognized name types of the standard Chinese text to perform Chinese semantics analysis of the standardized Chinese text (S104), see abstract. Furthermore, it specifies that the analysis module 203 further includes a structured processing unit configured so that the mobile terminal performs structured processing on the standard Chinese text according to the semantic role labeling result and the event model of the standard Chinese text, and extracts key information of the standard Chinese text. Specifically, the key information of the standard Chinese text includes an event name, a key attribute, and an attribute value. The event name can correspond to the sentence classification result. For example, for a text message received by the terminal, the sentence classification model divides messages into bank bills, flight and train notices, appointments, weather forecasts, and the like; the result type of the sentence classification can then be used as the event name. The key attribute is the semantic role labeling result. For example, in a bank billing text message, it is marked as billing date, consumption amount, repayment date, repayment amount, etc. The attribute value is the specific value in the original text message corresponding to the above category, such as the specific date and the specific amount.

Applicant's amendment necessitated the new ground(s) of rejection presented in this
Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the
examiner should be directed to JONATHAN E AMAYA HERNANDEZ whose telephone number is (571)272-2484. The examiner can normally be reached Monday - Friday 7:30 am - 3:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andy Flanders, can be reached at 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.E.A./             Examiner, Art Unit 2655      
       
/ANDREW C FLANDERS/             Supervisory Patent Examiner, Art Unit 2655