DETAILED ACTION
This action is in response to the RCE filed 02 October 2022.
Claims 1–10 and 21–30 are pending. Claims 1, 21, and 22 are independent.
Claims 1–10 and 21–30 are rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after 16 March 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Continued Examination
A request for continued examination under 37 C.F.R. § 1.114, including the fee set forth in 37 C.F.R. § 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 C.F.R. § 1.114, and the fee set forth in 37 C.F.R. § 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 C.F.R. § 1.114. Applicant's submission filed on 07 September 2022 has been entered.
Response to Arguments
The objection to the drawings is withdrawn in response to the amendment to the specification (remarks, p. 12).
The objection to claim 10 is withdrawn in response to the amendment to the claim (remarks, pp. 12–13).
Applicant's arguments, see remarks, filed 07 September 2022, with respect to the rejection(s) of claim(s) 1–10 and 21–30 under § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Liu et al. (cited in the Office action mailed 24 December 2021).
Claim Rejections—35 U.S.C. § 103
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1–5, 8–10, 21–26, 29, and 30 are rejected under 35 U.S.C. § 103 as being unpatentable over Duan et al. (US 2004/0167771 A1) [hereinafter Duan] in view of Miller et al. (US 6,052,682 A) [hereinafter Miller] and Liu et al. (US 2008/0319738 A1) [hereinafter Liu].
Regarding independent claim 1, Duan teaches [a] method for separating words, comprising:
	obtaining a predetermined word collection and a text with words to be separated; […]; wherein words in the predetermined word collection comprise first information and second information; wherein the first information is configured for indicating a probability of a word […]; wherein for the words in the predetermined word collection, the second information is configured for indicating a probability of presenting the word […] under a condition of presenting other words than the word;
An input sentence [text with words to be separated] is broken into tokens using a lexical dictionary and lexical cost file [predetermined word collection] (Duan, ¶¶ 86, 97, 101). The lexical cost file may include lexical costs, corresponding to the probability of observing a certain word [first information], and connector costs, corresponding to the probability of observing two particular words in adjacent positions [second information], and may be stored with the lexical dictionary (Duan, ¶¶ 77–79). A lower cost corresponds to a higher probability (Duan, ¶ 95).
	based on the predetermined word collection, separating the text with words to be separated to obtain at least one word list;
All possible segmentations of the sentence are determined, along with the connections between segments, which together form a graph having a plurality of possible paths [word lists] (Duan, ¶¶ 97–100).
	regarding a word list in the at least one word list, determining first information of words in the word list and determining second information of the words in the word list, wherein the words in the word list are included in the predetermined word collection;
A cost [probability] is assigned to each path in the graph using the lexical costs (Duan, ¶¶ 100–101).
	determining a probability of the word list based on the first information of the words in the word list and the second information of the words in the word list, wherein the second information of the words in the word list is determined based on a word adjacent to one of the words in the word list; and
Path costs are determined based on the lexical costs assigned to each arc within the paths (Duan, ¶¶ 101–102).
	selecting a word list whose probability is maximal from the at least one word list as a result of separating words.
The n best paths [maximal probabilities] are selected from all of the paths in the graph, based on the total costs; n is at least 1 (Duan, ¶¶ 102–103).
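For illustration only, the cost-based path selection mapped above can be sketched as follows. This is a minimal sketch and not code from Duan: the lexicon entries, the cost values, the default connector cost, and all function names are hypothetical, and the costs are assumed to be negative log probabilities so that a lower cost corresponds to a higher probability.

```python
# Hypothetical lexical costs (assumed: cost = -log probability of the word).
LEXICAL_COST = {"a": 3.0, "b": 3.0, "ab": 2.0, "c": 1.0, "bc": 2.5}
# Hypothetical connector costs for pairs of adjacent words.
CONNECTOR_COST = {("ab", "c"): 0.5, ("a", "bc"): 1.5}
# Assumed fallback cost for word pairs without an explicit connector cost.
DEFAULT_CONNECTOR_COST = 2.0


def all_segmentations(text, lexicon):
    """Enumerate every split of `text` into words found in `lexicon`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


def path_cost(words):
    """Total cost of one path: lexical costs plus connector costs."""
    cost = sum(LEXICAL_COST[w] for w in words)
    for prev, cur in zip(words, words[1:]):
        cost += CONNECTOR_COST.get((prev, cur), DEFAULT_CONNECTOR_COST)
    return cost


def n_best(text, n=1):
    """Return the n lowest-cost (most probable) word lists for `text`."""
    candidates = all_segmentations(text, LEXICAL_COST)
    return sorted(candidates, key=path_cost)[:n]


def best_segmentation(text):
    """Return the single lowest-cost word list, or None if none exists."""
    best = n_best(text, 1)
    return best[0] if best else None
```

With these hypothetical values, the text "abc" yields three candidate word lists, and the list ["ab", "c"] has the lowest total cost. Exhaustive enumeration is used here only for clarity; a dynamic-programming search over the graph would scale better.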
Duan teaches a predetermined word collection, but does not expressly teach the generation thereof. However, Miller teaches:
	wherein the predetermined word collection is a word collection pre-generated based on a predetermined text collection
Statistical information for words [a predetermined word collection] is generated based on training text [a predetermined text collection] (Miller, col. 4 ll. 30–50). The statistical information includes a probability each given word will appear in text as a specific class, as well as under different conditions (Miller, col. 4 ll. 45–55).
	[the first information is configured for indicating a probability of a word] present in the predetermined text collection
Training text is used to generate statistical information (Miller, col. 4 ll. 40–50).
	[the second information is configured for indicating a probability of presenting the word] in the predetermined text collection [under a condition of presenting other words than the word]
The statistical information includes the probability for word pairs (Miller, col. 4 ll. 45–55).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of Miller with those of Duan. One would have been motivated to do so in order to generate a dictionary/lexicon more easily and quickly (Miller, col. 3 ll. 10–25).
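As a hedged illustration of how word statistics of the kind discussed above could be derived from training text, the sketch below estimates a per-word probability and a word-pair (conditional) probability by simple relative frequency. It does not reproduce Miller's layered Hidden Markov Model; the function name, the input format (texts already split into words), and the sample corpus are assumptions made for this example.

```python
from collections import Counter


def build_word_collection(segmented_texts):
    """Estimate word and word-pair probabilities from segmented texts.

    Returns (first_info, second_info), where first_info[w] approximates
    P(w) and second_info[(prev, w)] approximates P(w | prev).
    """
    unigram = Counter()
    bigram = Counter()
    for words in segmented_texts:
        unigram.update(words)
        bigram.update(zip(words, words[1:]))

    total = sum(unigram.values())
    first_info = {w: c / total for w, c in unigram.items()}

    # Count how often each word occurs as the first member of a pair.
    prev_count = Counter()
    for (prev, _), c in bigram.items():
        prev_count[prev] += c
    second_info = {
        (prev, w): c / prev_count[prev] for (prev, w), c in bigram.items()
    }
    return first_info, second_info


# Hypothetical training text, already separated into words.
corpus = [["new", "york", "city"], ["new", "york"], ["new", "car"]]
first_info, second_info = build_word_collection(corpus)
```

In this hypothetical corpus, "york" follows "new" in two of the three pairs that begin with "new", so the estimated conditional probability for that pair is 2/3.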
Duan/Miller teaches a predetermined word collection, but does not expressly teach one pre-generated using a machine learning model. However, Liu teaches:
	the predetermined word collection is pre-generated using at least one machine learning model, and the at least one machine learning model is trained to separate words
A dictionary having words and corresponding probability values [predetermined word collection] is generated using a document corpus [text collection] and a word segmentation engine using a Hidden Markov Model [machine learning model trained to separate words] (Liu, ¶¶ 43–51).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of Liu with those of Duan/Miller. One would have been motivated to do so in order to improve the accuracy of the segmentation (Liu, ¶ 16).
Regarding dependent claim 2, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches wherein the determining a probability of the word list based on the first information of the words in the word list and the second information of the words in the word list comprises:
	connecting two adjacent words in the word list by a line to generate a path of separating words; wherein nodes of the path of separating words are indicated by the words in the word list, and the line of the path of separating words is a line configured for connecting words;
A connection graph is formed, with segments for the nodes and arcs for parts-of-speech (Duan, ¶ 100).
	based on the first information and the second information of the words in the word list, determining a weight of the line of the path of separating words; and
Each path is assigned a cost [weight], based on the total of all costs for each arc and node in the paths (Duan, ¶ 101).
	based on the weight, determining the probability of the word list.
A lower cost corresponds to a higher probability (Duan, ¶ 95).
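The relationship between a path weight and a path probability can be illustrated with a short sketch. The negative-logarithm conversion below is an assumption chosen for the example; the cited passages state only that a lower cost corresponds to a higher probability, and the function names are hypothetical.

```python
import math


def cost_from_probability(p):
    """Assumed conversion: cost is the negative log of the probability."""
    return -math.log(p)


def probability_from_cost(cost):
    """Inverse of the assumed conversion."""
    return math.exp(-cost)


def path_probability(arc_costs):
    """Summing costs along a path is equivalent to multiplying probabilities."""
    return probability_from_cost(sum(arc_costs))


# Two arcs with hypothetical probabilities 0.5 and 0.25.
costs = [cost_from_probability(0.5), cost_from_probability(0.25)]
combined = path_probability(costs)  # approximately 0.5 * 0.25
```

Under this assumption, adding the weights of the lines in a path and converting the total back gives the product of the individual probabilities, so ranking paths by lowest total cost and by highest probability produces the same order.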
Regarding dependent claim 3, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches:
	wherein the second information of the words in the word list is determined based on a previous word adjacent to the one of the words in the word list.
The connector cost corresponds to the probability a word appears adjacent, e.g. immediately before or after, another word (Duan, ¶ 78).
Regarding dependent claim 4, the rejection of parent claim 3 is incorporated and Duan/Miller/Liu further teaches wherein the determining second information of the words in the word list comprises:
	for a word among the words in the word list, determining whether the word list comprises a previous word adjacent to the word;
	in response to determining that the word list comprises the previous word adjacent to the word, determining the second information of the word based on the previous word adjacent to the word.
The cost may be based on another word directly before the word (Duan, ¶ 78).
Regarding dependent claim 5, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches wherein the predetermined word collection is obtained by:
	obtaining the predetermined text collection and a sample result of separating words pre-marked aiming at predetermined texts in the predetermined text collection;
The training text includes words which have been labeled [pre-marked] (Miller, col. 4 ll. 40–45).
	taking the predetermined texts in the predetermined text collection as inputs, taking the sample result of separating words corresponding to the predetermined texts as an expected output, utilizing a machine learning method, training to obtain a model for separating words;
The word probabilities are generated using a layered Hidden Markov Model [machine learning method] with labeled training text [sample result of separating words pre-marked aiming at the predetermined texts] (Miller, col. 4 ll. 30–45).
	utilizing the model for separating words to separate words in the predetermined texts in the predetermined text collection to obtain a first result of separating words;
A first layer is used to label words with a class (Miller, col. 3 ll. 30–55, col. 5 ll. 1–15).
	based on the first result of separating words, generating an initial word collection; wherein words in the initial word collection comprise the first information determined by the first result of separating words;
Input text is searched for instances of the classified words (Miller, col. 4 ll. 55–65).
	based on the initial word collection, separating the words in the predetermined texts in the predetermined text collection to obtain a second result of separating words; and
A second layer is used to determine the probability of each word transitioning to another word (Miller, col. 5 ll. 10–50).
	based on the initial word collection and the second result of separating words, generating the predetermined word collection; wherein the predetermined word collection comprises the first information and the second information determined based on the second result of separating words.
A database of words is compiled, including the transition probabilities (Miller, col. 6 l. 60 to col. 7 l. 15).
Regarding dependent claim 8, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches wherein the separating words in the text with words to be separated to obtain at least one word list comprises:
	determining whether the text with words to be separated comprises a text matching a predetermined text format; and
Text may be separated based on rules, e.g. punctuation symbols (Duan, ¶¶ 52–69).
	in response to determining that the text with words to be separated comprises the text matching the predetermined text format, separating the words in the text with words to be separated to obtain the at least one word list based on the predetermined word collection and the text matching the predetermined text format.
The text is tokenized [separated] based on the rules (Duan, ¶ 52).
Regarding dependent claim 9, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches wherein the separating words in the text with words to be separated to obtain at least one word list comprises:
	determining whether the text with words to be separated comprises a named entity; and
Proper nouns [named entities] may be identified in the text (Duan, ¶ 88).
	in response to determining that the text with words to be separated comprises the named entity, separating the words in the text with words to be separated to obtain the at least one word list based on the predetermined word collection and the named entity.
The words may be connected based on forming a proper noun (Duan, ¶ 88).
Regarding dependent claim 10, the rejection of parent claim 1 is incorporated and Duan/Miller/Liu further teaches wherein after the selecting a word list whose probability is maximal from the at least one word list as a result of separating words, the method further comprises:
	obtaining a predetermined candidate word collection, wherein words in the predetermined candidate word collection are configured for indicating at least one of a movie name, a TV series name or a music name;
A dictionary includes entries for movie titles, etc. (Liu, ¶¶ 35, 44).
	matching the result of separating words and the words in the predetermined candidate word collection to determine whether the result of separating words comprises a phrase matching the words in the predetermined candidate word collection or not, wherein the phrase comprises at least two adjacent words; and
Each entry may comprise more than one word (Liu, ¶ 44). Word segmentation can be performed based on the words (Liu, ¶ 49).
	in response to determining that the result of separating words comprises the phrase, generating an updated result of separating words, wherein the updated result of separating words comprise the phrase.
Candidate segmentations are evaluated iteratively using the probability values for the words (Liu, ¶ 50).
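For illustration only, matching a segmentation result against a candidate collection of multi-word titles might look like the sketch below. It is a simple greedy longest-match merge, not Liu's iterative evaluation of candidate segmentations; the function name, the `max_len` and `joiner` parameters, and the sample title are hypothetical.

```python
def merge_candidate_phrases(words, candidates, max_len=4, joiner=" "):
    """Merge runs of at least two adjacent words that form a candidate phrase.

    `candidates` is a set of phrases (e.g., hypothetical movie titles).
    `joiner` is the assumed separator used when adjacent words are joined.
    """
    merged = []
    i = 0
    while i < len(words):
        match_len = 0
        # Try the longest run starting at position i first; runs of one
        # word are skipped because a phrase needs two or more words.
        for n in range(min(max_len, len(words) - i), 1, -1):
            if joiner.join(words[i:i + n]) in candidates:
                match_len = n
                break
        if match_len:
            merged.append(joiner.join(words[i:i + match_len]))
            i += match_len
        else:
            merged.append(words[i])
            i += 1
    return merged


# Hypothetical segmentation result and candidate collection.
result = ["watch", "star", "wars", "tonight"]
updated = merge_candidate_phrases(result, {"star wars"})
```

Here the adjacent words "star" and "wars" match a candidate entry, so the updated result contains the phrase as a single item while the other words are left unchanged.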
Claim 21 recites limitations similar to those of claim 1, and is rejected for the same reasons.
Claim 22 recites limitations similar to those of claim 1, and is rejected for the same reasons.
Claim 23 recites limitations similar to those of claim 2, and is rejected for the same reasons.
Claim 24 recites limitations similar to those of claim 3, and is rejected for the same reasons.
Claim 25 recites limitations similar to those of claim 4, and is rejected for the same reasons.
Claim 26 recites limitations similar to those of claim 5, and is rejected for the same reasons.
Claim 29 recites limitations similar to those of claim 8, and is rejected for the same reasons.
Claim 30 recites limitations similar to those of claim 9, and is rejected for the same reasons.
Claims 6, 7, 27, and 28 are rejected under 35 U.S.C. § 103 as being unpatentable over Duan, Miller, and Liu, further in view of Chao (US 2005/0049852 A1).
Regarding dependent claim 6, the rejection of parent claim 5 is incorporated. Duan/Miller/Liu teaches using a machine learning model, but does not expressly teach using two models to obtain the “first results”. However, Chao teaches further comprising:
	training at least two predetermined initial models to obtain at least two models for separating words; and
Text is classified by multiple machine learning models, trained using training data (Chao, ¶ 86).
	utilizing the at least two models for separating words to separate the words in the predetermined texts in the predetermined text collection to obtain at least two first results of separating words.
The classifier can compute the probability for the input text (Chao, ¶¶ 87–88).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of Chao with those of Duan/Miller/Liu. One would have been motivated to do so in order to improve the accuracy of the machine learning model (Chao, ¶ 86).
Regarding dependent claim 7, the rejection of parent claim 6 is incorporated and Duan/Miller/Liu/Chao further teaches further comprising:
	extracting identical words from the at least two first results of separating words before the generating an initial word collection; and
The sub-models are integrated using a voting mechanism [i.e., the models agreeing on particular words] (Chao, ¶ 86).
	generating the initial word collection based on the extracted words.
The probabilities are based on the trained machine learning models (Chao, ¶ 87).
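As a hedged illustration of agreement between two segmentation results, the sketch below keeps only the words that both results place at the same character positions. It is not Chao's voting mechanism; the function name and the sample results are hypothetical, and the two results are assumed to segment the same underlying text.

```python
def common_words(result_a, result_b):
    """Return the words both results produce at the same character span."""

    def spans(words):
        # Record each word together with its start and end offsets.
        out, pos = set(), 0
        for w in words:
            out.add((pos, pos + len(w), w))
            pos += len(w)
        return out

    shared = spans(result_a) & spans(result_b)
    # Sort by start offset so the words come back in text order.
    return [w for _, _, w in sorted(shared)]


# Two hypothetical first results for the same text "abcde".
agreed = common_words(["ab", "c", "de"], ["a", "b", "c", "de"])
```

In this example the two results disagree on how to split "ab" but agree on "c" and "de", so only those two words would be carried into an initial word collection.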
Claim 27 recites limitations similar to those of claim 6, and is rejected for the same reasons.
Claim 28 recites limitations similar to those of claim 7, and is rejected for the same reasons.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tyler Schallhorn whose telephone number is 571-270-3178. The examiner can normally be reached Monday through Friday, 8:30 a.m. to 6 p.m. (ET).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley, can be reached at 571-272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in the USA or Canada) or 571-272-1000.
/ANDREW R DYER/ Primary Examiner, Art Unit 2176
/Tyler Schallhorn/ Examiner, Art Unit 2176