Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This office action is in response to application 16/879,252, which was filed 05/20/20. Claims 1-14 are pending in the application and have been considered.

Foreign Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.


Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Generating Finite State Automata for Recognition of Organic Compound Names in Chinese.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 14 is directed to a “storage medium” with a program stored thereon. On page 15, paragraph [0063], the specification discusses the “storage medium”, stating “The storage medium includes, but is not limited to, a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, and the like.” Thus, the specification weighs against eligibility of the claimed “medium” because the defined scope of the medium is open ended (i.e., “is not limited to”). The recitation of storage per se in the claim does nothing to direct the claim toward patentable subject matter because Applicant has not specifically limited the scope of the term to statutory media. Because the scope is open ended, the claim as a whole includes non-statutory medium types, e.g. carrier waves, which do not fall into a category of patent eligible subject matter.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 11, 13, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (2013/0054226) in view of Sayle et al. (“Improved Chemical Text Mining of Patents with Infinite Dictionaries and Automatic Spelling Correction”. Journal of Chemical Information and Modeling, 2012, 52, 51-62).

Consider claim 1, Chen discloses a method for generating a finite state automata for recognizing a chemical name in a text (constructing a finite state machine during the matching of chemical name segments in a Chinese document, [0024]), comprising: 
initializing including substituting representation constants of categories of character segments appearing in a compound name set into the compound name set to obtain a conversion name set (matching the sentence with the chemical name segment dictionary to obtain the segments and start-end, or substituted representation constants, positions for character and numeric segments in table 1, [0025]); 
updating the conversion name set based on a conversion name segment which repeatedly appears in the conversion name set (the name set is updated to that shown in Table 2 based on reducing the redundant chemical name segments, [0025]); and 
generating a finite state automata (a finite state machine is constructed during the matching, [0024]); 
wherein respective compound names in the compound name set are in Chinese (chemical names in the Chinese document, [0024]). 
Chen does not specifically mention generating a finite state automata based on the updated conversion name set of organic compounds.
Sayle discloses generating a finite state automata based on an updated conversion name set of organic compounds (IUPAC names for organic chemistry, page 56, are used to generate the FSM “P” from a subset or update of the main molecule FSM “M”, and using bytes as characters allows Chinese text, page 54).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen by generating a finite state automata based on the updated conversion name set of organic compounds in order to avoid missing compounds by improving segmentation, as suggested by Sayle (page 53, Introduction). 
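
For illustration only, the limitations of claim 1 as paraphrased above can be sketched as follows. This is a minimal sketch and is not taken from Chen, Sayle, or the application: the category table, the constants "N", "P", "S", and "O", and the helper names are all hypothetical, each character is treated as one segment, and the updating step is omitted. The automaton is a simple trie-shaped deterministic automaton over the conversion names.

```python
# Hypothetical representation constants for categories of character segments.
CATEGORY_CONSTANTS = {"digit": "N", "prefix": "P", "suffix": "S", "other": "O"}

# Hypothetical segment dictionary (assumed for this example only).
SEGMENT_CATEGORIES = {"氯": "prefix", "甲": "prefix", "苯": "suffix", "烷": "suffix"}


def categorize(segment):
    """Return the (hypothetical) category of a single-character segment."""
    if segment.isdigit():
        return "digit"
    return SEGMENT_CATEGORIES.get(segment, "other")


def to_conversion_name(name):
    """Substitute the representation constant of each segment's category."""
    return "".join(CATEGORY_CONSTANTS[categorize(ch)] for ch in name)


def build_automaton(conversion_names):
    """Build a trie-shaped deterministic automaton accepting the given names."""
    transitions = [{}]  # state 0 is the start state
    accepting = set()
    for name in conversion_names:
        state = 0
        for symbol in name:
            nxt = transitions[state].get(symbol)
            if nxt is None:
                transitions.append({})
                nxt = len(transitions) - 1
                transitions[state][symbol] = nxt
            state = nxt
        accepting.add(state)
    return transitions, accepting


def accepts(automaton, text):
    """Run the automaton over the category constants of the text."""
    transitions, accepting = automaton
    state = 0
    for ch in text:
        symbol = CATEGORY_CONSTANTS[categorize(ch)]
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return state in accepting


compound_names = ["2-氯甲苯", "氯苯"]
conversion_set = {to_conversion_name(n) for n in compound_names}
automaton = build_automaton(conversion_set)
```

Because the automaton operates on category constants rather than literal characters, a name outside the original set but with the same category sequence is also accepted in this sketch.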

Consider claim 13, Chen discloses a device for generating a finite state automata (constructing a finite state machine during the matching of chemical name segments in a Chinese document, [0024]), the device comprising: a memory (RAM, [0045]); and a processor coupled to the memory (processor, [0047]) and configured to: 
initializing to substitute representation constants of categories of character segments appearing in a compound name set into the compound name set to obtain a conversion name set (matching the sentence with the chemical name segment dictionary to obtain the segments and start-end, or substituted representation constants, positions for character and numeric segments in table 1, [0025]); 
updating the conversion name set based on a conversion name segment which repeatedly appears in the conversion name set (the name set is updated to that shown in Table 2 based on reducing the redundant chemical name segments, [0025]); and 
generating a finite state automata (a finite state machine is constructed during the matching, [0024]); 
wherein respective compound names in the compound name set are in Chinese (chemical names in the Chinese document, [0024]). 
Chen does not specifically mention generating a finite state automata based on the updated conversion name set of organic compounds.
Sayle discloses generating a finite state automata based on an updated conversion name set of organic compounds (IUPAC names for organic chemistry, page 56, are used to generate the FSM “P” from a subset or update of the main molecule FSM “M”, and using bytes as characters allows Chinese text, page 54).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen by generating a finite state automata based on the updated conversion name set of organic compounds for reasons similar to those for claim 1.

Consider claim 14, Chen discloses a storage medium with a program stored thereon, where when the program is executed on an information processing apparatus (computer readable storage medium, [0045]), causes the information processing apparatus to implement operations comprising: 
initializing including substituting representation constants of categories of character segments appearing in a compound name set into the compound name set to obtain a conversion name set (matching the sentence with the chemical name segment dictionary to obtain the segments and start-end, or substituted representation constants, positions for character and numeric segments in table 1, [0025]); 
updating the conversion name set based on a conversion name segment which repeatedly appears in the conversion name set (the name set is updated to that shown in Table 2 based on reducing the redundant chemical name segments, [0025]); and 
generating a finite state automata (a finite state machine is constructed during the matching, [0024]); 
wherein respective compound names in the compound name set are in Chinese (chemical names in the Chinese document, [0024]). 
Chen does not specifically mention generating a finite state automata based on the updated conversion name set of organic compounds.
Sayle discloses generating a finite state automata based on an updated conversion name set of organic compounds (IUPAC names for organic chemistry, page 56, are used to generate the FSM “P” from a subset or update of the main molecule FSM “M”, and using bytes as characters allows Chinese text, page 54).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen by generating a finite state automata based on the updated conversion name set of organic compounds for reasons similar to those for claim 1.

Consider claim 2, Chen discloses the initializing further comprises initialization of a representation set including initializing a binary representation set and a repeated representation set to empty sets (the start positions and end positions being considered a “binary representation set” since each position is either a start or an end, and the chemical segments a “repeated representation set” since they overlap, i.e. repeat characters, [0025], Table 1); a binary representation in the binary representation set is used for representing a conversion name segment of two different adjacent representation constants in the conversion name set (the starting and ending positions, [0025], Table 1); and a repeated representation in the repeated representation set is used for representing a conversion name segment in the conversion name set where the same representation constant continuously appears (the chemical segment sets, [0025], Table 1). 

Consider claim 3, Chen discloses the initializing includes the substituting representation, only representation constants of selected character segments appearing in the compound name set are substituted into the compound name set, where the selected character segments are not ",", " ", ":", ";", "-", "(", ")", "[", "]", "{", "}", or "'" (Chemical segments and their starting and ending positions, [0025], Table 1, noting that the claim language does not require never selecting “,”, or alternatively, considered “,6” different from “,”). 
Chen does not specifically mention an organic compound name set.
Sayle discloses an organic compound name set (IUPAC names for organic chemistry, page 56).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen by including an organic compound name set for reasons similar to those for claim 1. 

Consider claim 11, Chen discloses the generating comprises: generating a regular expression based on the conversion name set; and generating the finite state automata based on the regular expression (regular expression, [0021], a finite state machine is constructed during the matching, [0024]). 
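
For illustration only, the two steps recited in claim 11 (a regular expression generated from the conversion name set, then a matcher generated from that expression) can be sketched as follows. This is a minimal sketch under stated assumptions: the conversion names are hypothetical, and Python's `re` module, which is a backtracking matcher rather than a literal finite state automaton, stands in for the automaton construction. It is not asserted to be the method of Chen or of the application.

```python
import re


def regex_from_conversion_names(conversion_names):
    """Join the conversion names into one alternation, longest names first."""
    alternatives = sorted(set(conversion_names), key=lambda s: (-len(s), s))
    return "(?:" + "|".join(re.escape(name) for name in alternatives) + ")"


# Hypothetical conversion name set: each letter is a category constant.
pattern = regex_from_conversion_names({"NOPPS", "PS", "PPS"})
matcher = re.compile(pattern)
```

A conversion name is then recognized when `matcher.fullmatch(...)` returns a match object. Escaping each alternative keeps constants that happen to be regular expression metacharacters from changing the meaning of the pattern.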

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (2013/0054226) in view of Sayle et al. (“Improved Chemical Text Mining of Patents with Infinite Dictionaries and Automatic Spelling Correction”. Journal of Chemical Information and Modeling, 2012, 52, 51-62), in further view of Shi et al. (2019/0272319).


Consider claim 10, Chen discloses categories of characters appearing in the compound name set are set to comprise: Arabic numbers (e.g. “2”, [0021]), Heavenly Stems (Chinese characters, [0021]), Uppercase letters (those found in PubChem database, [0021]), Lowercase letters ([0021]), Greek Letters (those found in PubChem database, [0021]), Order Characters (Chinese characters, [0021]), Prefixes (“chloro”, [0021]), Suffixes (“phenyl”, [0021]), Conjunctions (characters occurring together, [0021]), Chemical Elements (those found in PubChem database, [0021]), other common words (non-chemical words, [0022]), ",", " ", ":", ";", "-", "(", ")", "[", "]", "{", "}" and "'" (those found in PubChem database, [0021]).
Chen does not specifically mention an organic compound name set.
Sayle discloses an organic compound name set (IUPAC names for organic chemistry, page 56).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen by including an organic compound name set for reasons similar to those for claim 1.
Chen and Sayle do not specifically mention Chinese Numbers.
Shi discloses Chinese Numbers (Chinese number, [0028]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Chen and Sayle by including Chinese numbers in order to improve accuracy of identifying named entities, as suggested by Shi ([0003]).


Allowable Subject Matter
Claims 4-9 and 12 are objected to as being dependent on a rejected base claim, but would be allowable if rewritten in independent form including all limitations of the base and any intervening claims.

The following is the examiner’s statement of reasons for indicating subject matter allowable over the prior art:

Consider claim 4, the prior art does not fairly teach or suggest: “….the updating comprises: determining a conversion name segment of two adjacent representation constants in the conversion name set being different and setting a corresponding representation constant for the two adjacent representation constants as a binary representation to update a binary representation set, wherein respective binary representations in a binary representation set satisfy: a time for which a conversion name segment to which the binary representation correspond appear in the conversion name set is greater than a first predetermined threshold; determining a conversion name segment in the conversion name set where the same representation constant continuously appears for n times or more than n times and setting a corresponding representation constant for the same representation constant as a repeated representation to update a repeated representation set, wherein each repeated representation in the repeated representation set is used for collectively representing a conversion name segment in which the same representation constant to which the repeated representation corresponds continuously appears, and n is equal to a second predetermined threshold; judging whether both the updated binary representation set with respect to a previous binary representation set and the updated repeated representation set with respect to the previous repeated representation set do not vary; and updating a conversion name set, when a judgment result of the judging is that at least one of the updated binary representation set and the updated repeated representation set varies, by substituting representation constants in the binary representation set and representation constants in the repeated representation set into the conversion name set.” Dependent claims 5-9 and 12 are allowable subject matter over the prior art because they further limit the allowable subject matter of intervening parent claim 4.
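
For illustration only, one possible reading of the iterative updating recited in claim 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the constant names "B0", "R0", and so on, the function names, the default thresholds, and the order in which repeated runs and adjacent pairs are substituted are all hypothetical. Conversion names are held as tuples of representation constants.

```python
from collections import Counter
from itertools import groupby


def update_sets(names, binary, repeated, min_count, n):
    """Extend the binary and repeated representation sets from the names."""
    pair_counts = Counter()
    for name in names:
        for a, b in zip(name, name[1:]):
            if a != b:  # two different adjacent representation constants
                pair_counts[(a, b)] += 1
    new_binary = dict(binary)
    for pair, count in pair_counts.items():
        if count > min_count and pair not in new_binary:  # first threshold
            new_binary[pair] = f"B{len(new_binary)}"
    new_repeated = dict(repeated)
    for name in names:
        for const, run in groupby(name):
            # same constant appearing n or more times in a row (second threshold)
            if len(list(run)) >= n and const not in new_repeated:
                new_repeated[const] = f"R{len(new_repeated)}"
    return new_binary, new_repeated


def substitute(name, binary, repeated, n):
    """Substitute repeated representations, then binary representations."""
    collapsed = []
    for const, run in groupby(name):
        length = len(list(run))
        if const in repeated and length >= n:
            collapsed.append(repeated[const])
        else:
            collapsed.extend([const] * length)
    merged, i = [], 0
    while i < len(collapsed):
        pair = tuple(collapsed[i:i + 2])
        if len(pair) == 2 and pair in binary:
            merged.append(binary[pair])
            i += 2
        else:
            merged.append(collapsed[i])
            i += 1
    return tuple(merged)


def update_conversion_names(names, min_count=1, n=2, max_rounds=10):
    """Repeat the update until neither representation set varies."""
    names = [tuple(name) for name in names]
    binary, repeated = {}, {}
    for _ in range(max_rounds):
        new_binary, new_repeated = update_sets(names, binary, repeated, min_count, n)
        if new_binary == binary and new_repeated == repeated:
            break  # both sets unchanged: stop updating
        binary, repeated = new_binary, new_repeated
        names = [substitute(name, binary, repeated, n) for name in names]
    return names, binary, repeated
```

For example, `update_conversion_names(["NOPPS", "NOPS", "PPPS"])` treats each letter as one hypothetical representation constant and returns the shortened conversion names together with the two representation sets. The `max_rounds` bound is a safeguard added for the sketch and is not drawn from the claim.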


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20110082844 Bao discloses correcting chemical names
US 7676358 Coden discloses recognizing organic chemical names in text documents
US 20070016612 James discloses molecular keyword indexing for chemical structure database searching
US 6311152 Bai discloses Chinese tokenization and named entity recognition

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached on 571/272-7516.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).


/Jesse S Pullias/
Primary Examiner, Art Unit 2655
08/18/22