DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). 
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/7/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 and 6 recite the limitations “the type of each component” in the first limitation and “a plurality of the words” in the last limitation. There is insufficient antecedent basis for these limitations in the claims.
Claim 3 recites the limitations “the topic discrimination model” in limitation 1 and “the topic presuming part” in limitation 2. There is insufficient antecedent basis for these limitations in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 6-9 are rejected under 35 U.S.C. 103 as being unpatentable over RAMANATHAN Krishnan et al. (WO 2013/080214) in view of Li et al. (US 2020/0125671).
Regarding claim 1, RAMANATHAN et al. do teach a data extraction device (Abstract: “A topic is extracted from a digital text document …”)
Comprising:
A label input part that receives, from a user, an input of the type of each component of at least one set of sentences and a designation of a topic portion in the component (¶ 0020 lines 3+: “topic extraction” (topic portion designation and determination of their associated “paragraph section heading” (“paragraph” (component) type determination (¶ 0019))) “is based upon previously received topic identifications for the digital text document” (for a set of sentences) “from a plurality of persons” (are received as input from a user));
A sentence-feature presuming part that inputs a specified set of sentences inputted by a user into the pre-trained model to presume each component of the specified set of sentences and a topic portion in each component of the specified set of sentences (¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is extracted) “from the digital text document” (from input of user comprising the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section headings” (associated paragraphs (components) are thus presumed or determined));
A word-vector generation part that determines a relationship among each word in the specified set of sentences, the type of each presumed component, and the presumed topic portion to calculate a feature amount of each word (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words in the specified set of sentences), and this process follows ¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is extracted) “from the digital text document” (from the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section heading” (and “paragraph” (component) “heading” (type) determination));
A relationship extraction part that extracts a plurality of the words having a relationship with one another based on the calculated feature amount (¶ 0024 sentence 3: “Words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words or with one another in the specified set of sentences)).
RAMANATHAN et al. do not specifically disclose 
A model creation part that creates a pre-trained model that has learned the type of each component of the set of sentences and a feature of the topic portion in the component of the set of sentences.
Li et al. do teach:
A model creation part that creates a pre-trained model that has learned the type of each component of the set of sentences and a feature of the topic portion in the component of the set of sentences (¶ 0017 sentence 2+: “topic modeling” (creating a pre trained model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on a feature associated with the topic) “As a result of the topic modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (which learns a type of “paragraph” (component) based on its “tag[]”) “and each of the tags may include topic information including one or more keywords and/or phrases”).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “topic modeling” “based on” “word co-occurrence patterns” of Li et al. into the techniques of RAMANATHAN et al. pertaining to “weight[ing]” “words” based on “how often they had co-occurred with other highly weighted terms”. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable RAMANATHAN et al. to determine “topic information” based on “tags” computed “based” on “co-occurrence patterns” as disclosed in Li et al. ¶ 0017 last 5 lines, so as to either benchmark against or replace the “Topic extraction module” of RAMANATHAN et al. ¶ 0019 and thus save on operations performed.
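For illustration only, the co-occurrence-based weighting discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from RAMANATHAN et al. or Li et al.: the function names `cooccurrence_counts` and `weight_words`, the sentence-level co-occurrence window, the whitespace tokenization, and the seed weights are all hypothetical choices made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of distinct words shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercase whitespace tokens; each word counted once per sentence.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def weight_words(sentences, seed_weights):
    """Weight each non-seed word by how often it co-occurs with weighted seed terms.

    Assumption: a word's weight is the sum, over seed terms, of
    (co-occurrence count) * (seed term weight).
    """
    weights = Counter()
    for (a, b), n in cooccurrence_counts(sentences).items():
        if a in seed_weights and b not in seed_weights:
            weights[b] += n * seed_weights[a]
        if b in seed_weights and a not in seed_weights:
            weights[a] += n * seed_weights[b]
    return dict(weights)


if __name__ == "__main__":
    example = [
        "topic model extracts keywords",
        "topic model tags paragraphs",
        "weather is nice",
    ]
    # "topic" is a hypothetical highly weighted seed term.
    print(weight_words(example, {"topic": 1.0}))
```

In this sketch, words that never share a sentence with a seed term receive no weight at all, which mirrors the idea that weights depend on co-occurrence with other highly weighted terms.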

Regarding claim 2, RAMANATHAN et al. do teach the sentence-feature presuming part includes a paragraph-type presuming part and a topic presuming part,
The paragraph-type presuming part inputs a specified set of sentences inputted by a user into the first pre-trained model to presume the type of each component of the specified set of sentences, and the topic presuming part inputs the specified set of sentences into the second pre-trained model to presume a topic portion in each component of the specified set of sentences (¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is presumed) “from the digital text document” (from the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section headings” (associated paragraphs (components) are thus presumed or identified with their “heading” (type) information)).
RAMANATHAN et al. do not specifically disclose the data extraction device according to claim 1, wherein the model creation part includes a paragraph-type-discrimination-model creation part and a topic-discrimination-model creation part,
The paragraph-type-discrimination-model creation part learns a relationship between each word in the set of sentences and the type of the component to create a paragraph-type discrimination model that has memorized the relationship between the word and the type of the component, as a first pre-trained model,
The topic-discrimination-model creation part creates a topic discrimination model that has memorized a relationship among the type of the component, the words in the component, and the topic portion in the component, as a second pre-trained model.
Li et al. do teach:
The paragraph-type-discrimination-model creation part learns a relationship between each word in the set of sentences and the type of the component to create a paragraph-type discrimination model that has memorized the relationship between the word and the type of the component, as a first pre-trained model (¶ 0017 sentence 2+: “topic modeling” (creating a paragraph discrimination model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on “word-co-occurrence patterns” (a relationship between each word)) “As a result of the topic modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) identified by its “tag[]” (type) which also has memorized the relationship) “and each of the tags may include topic information including one or more keywords and/or phrases”),
The topic-discrimination-model creation part creates a topic discrimination model that has memorized a relationship among the type of the component, the words in the component, and the topic portion in the component, as a second pre-trained model (¶ 0017 sentence 2+: “topic modeling” (creating a topic discrimination model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on “word-co-occurrence patterns” (a relationship between each word)) “As a result of the topic” (and the topic of each “paragraph” (component)) “modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) identified by its “tag[]” (type)) “and each of the tags may include topic information including one or more keywords and/or phrases” (and the words in the “paragraph” (component))).
For obviousness to combine RAMANATHAN et al. and Li et al. see claim 1.

Regarding claim 3, RAMANATHAN et al. do teach the data extraction device according to claim 1, wherein 
the topic-discrimination-model creation part creates, as the topic discrimination model, a model having at least each word in the component and a word having a modification relationship with the word as a feature amount (¶ 0024 sentence 3: “Words or noun phrases are also assigned weights” (a feature amount is calculated for each word in the component) “based on how often they had co-occurred with other highly weighted terms” (that determines a modification relationship with a word)),
and
the topic presuming part inputs each word in the component of the specified set of sentences and a word having a modification relationship with the word into the topic discrimination model to presume the topic portion (¶ 0026 lines 1-4:  “As indicated by step 206, topic extraction” (for topic presuming and identifying) “module 62 directs controller 32 further extract a set of additional terms that co-occur with a particular key phrase” (for each word in the component) “in the digital text document. As indicated by step 208, the extracted co-occurring terms are weighted” (its “weight” (modification relationship) with other words in the component is considered)).

Regarding claim 4, RAMANATHAN et al. do teach the data extraction device according to claim 1, wherein the word-vector generation part learns the relationship among each word in the specified set of sentences  (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words in the specified set of sentences)), 
and the word-vector generation part calculates a feature amount of each word based on the created co-occurrence-word presuming model (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred” (based on the created word co-occurrence) “with other highly weighted terms”).
RAMANATHAN et al. do not specifically disclose 
the type of each presumed component, and the presumed topic portion to create a co-occurrence-word presuming model that has memorized the relationship among the occurrence of the words in the specified set of sentences, the type of a component in the specified set of sentences, and a topic portion in the component.
Li et al. do teach:
the type of each presumed component, and the presumed topic portion to create a co-occurrence-word presuming model that has memorized the relationship among the occurrence of the words in the specified set of sentences, the type of a component in the specified set of sentences, and a topic portion in the component (¶ 0017 sentence 2+: “topic modeling may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (using a “model” (model) of “word-co-occurrence patterns” (the relationship among the occurrence of the words in the specified set of sentences)) “As a result of the topic” (and the presumed topic of each “paragraph” (component)) “modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) type is thus identified by its “tag[]”) “and each of the tags may include topic information including one or more keywords and/or phrases”; e.g., ¶ 0033 last sentence referring to Fig. 4: “SIM 408 also may use weights” (memorized relationship among word occurrences is used) “associated with the words and phrases of table 410 to predict a topic of each paragraph” (to determine the topic portion of each “paragraph” (component))).
For obviousness to combine RAMANATHAN et al. and Li et al. see claim 1.

Regarding claim 6, RAMANATHAN et al. do teach a data extraction method (Abstract: “A topic is extracted from a digital text document …”)
Comprising:
A label input process of receiving, from a user, an input of the type of each component of at least one set of sentences and a designation of a topic portion in the component (¶ 0020 lines 3+: “topic extraction” (topic portion designation and determination of their associated “paragraph section heading” (“paragraph” (component) type determination (¶ 0019))) “is based upon previously received topic identifications for the digital text document” (for a set of sentences) “from a plurality of persons” (are received as input from a user));
A sentence-feature presuming process of inputting a specified set of sentences inputted by a user into the pre-trained model to presume each component of the specified set of sentences and a topic portion in each component of the specified set of sentences (¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is extracted) “from the digital text document” (from input of user comprising the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section headings” (associated paragraphs (components) are thus presumed or determined));
A word-vector generation process of determining a relationship among each word in the specified set of sentences, the type of each presumed component, and the presumed topic portion to calculate a feature amount of each word (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words in the specified set of sentences), and this process follows ¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is extracted) “from the digital text document” (from the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section heading” (and “paragraph” (component) “heading” (type) determination));
A relationship extraction process  of extracting a plurality of the words having a relationship with one another based on the calculated feature amount (¶ 0024 sentence 3: “Words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words or with one another in the specified set of sentences)).
RAMANATHAN et al. do not specifically disclose 
A model creation process of creating a pre-trained model that has learned the type of each component of the set of sentences and a feature of the topic portion in the component of the set of sentences;
The label input process, the model creation process, the sentence-feature presuming process, the word-vector generation process, and the relationship extraction process are performed by an information processing device.
Li et al. do teach:
A model creation process of creating a pre-trained model that has learned the type of each component of the set of sentences and a feature of the topic portion in the component of the set of sentences  (¶ 0017 sentence 2+: “topic modeling” (creating a pre trained model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on a feature associated with the topic) “As a result of the topic modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (which learns a type of “paragraph” (component) based on its “tag[]”) “and each of the tags may include topic information including one or more keywords and/or phrases”);
The label input process, the model creation process, the sentence-feature presuming process, the word-vector generation process, and the relationship extraction process are performed by an information processing device (¶ 0029 sentence 1: “FIG. 3 illustrates processing of textual content when collection information” (an information processing device) “to be used for creating a model” (responsible for creating a model, e.g., according to ¶ 0017 page 2 lines 2+: “topic modeling” (label input) based on “word co-occurrence patterns” (relationship extraction based on a word-vector generation process) “of the paragraphs of textual content” (sentence feature presuming processes))).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “topic modeling” “based on” “word co-occurrence patterns” of Li et al. into the techniques of RAMANATHAN et al. pertaining to “weight[ing]” “words” based on “how often they had co-occurred with other highly weighted terms”. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable RAMANATHAN et al. to determine “topic information” based on “tags” computed “based” on “co-occurrence patterns” as disclosed in Li et al. ¶ 0017 last 5 lines, so as to either benchmark against or replace the “Topic extraction module” of RAMANATHAN et al. ¶ 0019 and thus save on operations performed.

Regarding claim 7, RAMANATHAN et al. do teach the sentence-feature presuming process includes a paragraph-type presuming process and a topic presuming process,
In the paragraph-type presuming process, the information processing device inputs a specified set of sentences inputted by a user into the first pre-trained model to presume the type of each component of the specified set of sentences, and in the topic presuming process, the information processing device inputs the specified set of sentences into the second pre-trained model to presume a topic portion in each component of the specified set of sentences (¶ 0019 sentence 2+: “Topic extraction module 62 carries out step 102 in method 100” “to identify or extract topics” (topic portion is presumed) “from the digital text document” (from the specified set of sentences) “Examples of topics that may be extracted” “include” “paragraph section headings” (associated paragraphs (components) are thus presumed or identified with their “heading” (type) information)).
RAMANATHAN et al. do not specifically disclose the data extraction method according to claim 6, wherein the model creation process includes a paragraph-type-discrimination-model creation process and a topic-discrimination-model creation process,
In the paragraph-type-discrimination-model creation process, the information processing device learns a relationship between each word in the set of sentences and the type of the component to create a paragraph-type discrimination model that has memorized the relationship between the word and the type of the component, as a first pre-trained model,
In the topic-discrimination-model creation process, the information processing device creates a topic discrimination model that has memorized a relationship among the type of the component, the words in the component, and the topic portion in the component, as a second pre-trained model.
Li et al. do teach:
In the paragraph-type-discrimination-model creation process, the information processing device learns a relationship between each word in the set of sentences and the type of the component to create a paragraph-type discrimination model that has memorized the relationship between the word and the type of the component, as a first pre-trained model (¶ 0017 sentence 2+: “topic modeling” (creating a paragraph discrimination model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on “word-co-occurrence patterns” (a relationship between each word)) “As a result of the topic modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) identified by its “tag[]” (type) which also has memorized the relationship) “and each of the tags may include topic information including one or more keywords and/or phrases”),
In the topic-discrimination-model creation process, the information processing device creates a topic discrimination model that has memorized a relationship among the type of the component, the words in the component, and the topic portion in the component, as a second pre-trained model (¶ 0017 sentence 2+: “topic modeling” (creating a topic discrimination model) “may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (based on “word-co-occurrence patterns” (a relationship between each word)) “As a result of the topic” (and the topic of each “paragraph” (component)) “modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) identified by its “tag[]” (type)) “and each of the tags may include topic information including one or more keywords and/or phrases” (and the words in the “paragraph” (component))).
For obviousness to combine RAMANATHAN et al. and Li et al. see claim 6.

Regarding claim 8, RAMANATHAN et al. do teach the data extraction method according to claim 7, wherein 
In the topic-discrimination-model creation process, the information processing device creates, as the topic discrimination model, a model having at least each word in the component and a word having a modification relationship with the word as a feature amount (¶ 0024 sentence 3: “Words or noun phrases are also assigned weights” (a feature amount is calculated for each word in the component) “based on how often they had co-occurred with other highly weighted terms” (that determines a modification relationship with a word)),
and
in the topic presuming process, the information processing device inputs each word in the component of the specified set of sentences and a word having a modification relationship with the word into the topic discrimination model to presume the topic portion (¶ 0026 lines 1-4:  “As indicated by step 206, topic extraction” (for topic presuming and identifying) “module 62 directs controller 32 further extract a set of additional terms that co-occur with a particular key phrase” (for each word in the component) “in the digital text document. As indicated by step 208, the extracted co-occurring terms are weighted” (its “weight” (modification relationship) with other words in the component is considered)).

Regarding claim 9, RAMANATHAN et al. do teach the data extraction method according to claim 6, wherein in the word-vector generation process, the information processing device learns the relationship among each word in the specified set of sentences (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred with other highly weighted terms” (that determines a relationship with other words in the specified set of sentences)),
and the information processing device calculates a feature amount of each word based on the created co-occurrence-word presuming model (¶ 0024 sentence 3: “words or noun phrases are also assigned weights” (a feature amount is calculated for each word) “based on how often they had co-occurred” (based on the created word co-occurrence) “with other highly weighted terms”).
RAMANATHAN et al. do not specifically disclose 
the type of each presumed component, and the presumed topic portion to create a co-occurrence-word presuming model that has memorized the relationship among the occurrence of the words in the specified set of sentences, the type of a component in the specified set of sentences, and a topic portion in the component.
Li et al. do teach:
the type of each presumed component, and the presumed topic portion to create a co-occurrence-word presuming model that has memorized the relationship among the occurrence of the words in the specified set of sentences, the type of a component in the specified set of sentences, and a topic portion in the component (¶ 0017 sentence 2+: “topic modeling may include biterm topic modeling (BTM), which is based on modeling word co-occurrence patterns” (using “word-co-occurrence patterns” (the relationship among the occurrence of the words in the specified set of sentences)) “As a result of the topic” (and the presumed topic of each “paragraph” (component)) “modeling of the paragraphs of the textual content, each of the paragraphs may be tagged” (and a “paragraph” (component) type is thus identified by its “tag[]”) “and each of the tags may include topic information including one or more keywords and/or phrases”; e.g., ¶ 0033 last sentence referring to Fig. 4: “SIM 408 also may use weights” (memorized relationship among word occurrences is used) “associated with the words and phrases of table 410 to predict a topic of each paragraph” (to determine the topic portion of each “paragraph” (component))).
For obviousness to combine RAMANATHAN et al. and Li et al. see claim 6.

Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over RAMANATHAN et al. in view of Li et al., and further in view of Iizuka (US 2014/0258302).
Regarding claim 5, RAMANATHAN et al. in view of Li et al. do not specifically disclose the data extraction device according to claim 1, further comprising an output part that outputs a plurality of the extracted words having a relationship with one another.
Iizuka does teach the data extraction device according to claim 1, further comprising an output part that outputs a plurality of the extracted words having a relationship with one another (¶ 0058 last sentence: “FIG. 10(A) includes outputting” (output part that outputs) “the information of the co-occurrence probabilities of the respective word groups” (of plurality of extracted words that have a co-occurrence relationship with one another; e.g. the words “Shibuya” and “hamburger” with “co-occurrence” “probability” (weight) of “0.9”)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “outputting word groups in a ranking format in descending order of co-occurrence information” of Iizuka, which will thus “allow” “to select one of them” (as disclosed in Iizuka ¶ 0030 last 6 lines), into the techniques of RAMANATHAN et al. in view of Li et al., to aid in determining a most appropriate “topic extraction”, which depends on detection of “terms that co-occur” as disclosed in RAMANATHAN et al. ¶ 0026 sentence 1.

Regarding claim 10, RAMANATHAN et al. in view of Li et al. do not specifically disclose the data extraction method according to claim 6, wherein the information processing device executes an output process of outputting a plurality of the extracted words having a relationship with one another.
Iizuka does teach the data extraction method according to claim 6, wherein the information processing device executes an output process of outputting a plurality of the extracted words having a relationship with one another (¶ 0058 last sentence: “FIG. 10(A) includes outputting” (output process of outputting) “the information of the co-occurrence probabilities of the respective word groups” (of plurality of extracted words that have a co-occurrence relationship with one another; e.g. the words “Shibuya” and “hamburger” with “co-occurrence” “probability” (weight) of “0.9”)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “outputting word groups in a ranking format in descending order of co-occurrence information” of Iizuka, which will thus “allow” “to select one of them” (as disclosed in Iizuka ¶ 0030 last 6 lines), into the techniques of RAMANATHAN et al. in view of Li et al., to aid in determining a most appropriate “topic extraction”, which depends on detection of “terms that co-occur” as disclosed in RAMANATHAN et al. ¶ 0026 sentence 1.
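For illustration only, outputting word groups in descending order of co-occurrence probability, as relied upon above, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Iizuka: the function names `cooccurrence_probabilities` and `rank_word_groups`, the restriction to word pairs, the whitespace tokenization, and the definition of co-occurrence probability as the fraction of documents containing both words are all hypothetical.

```python
from itertools import combinations


def cooccurrence_probabilities(documents):
    """Estimate, for each word pair, the fraction of documents containing both words."""
    total = len(documents)
    if total == 0:
        return {}
    pair_counts = {}
    for doc in documents:
        # Assumption: lowercase whitespace tokens; each word counted once per document.
        words = sorted(set(doc.lower().split()))
        for pair in combinations(words, 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    return {pair: n / total for pair, n in pair_counts.items()}


def rank_word_groups(probabilities, top_k=3):
    """Return the top_k word pairs in descending order of probability.

    Ties are broken alphabetically so the output order is deterministic.
    """
    ranked = sorted(probabilities.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical toy corpus; the values it yields are not those of any reference.
    docs = [
        "shibuya hamburger",
        "shibuya hamburger shop",
        "shibuya station",
        "hamburger shop",
    ]
    for pair, probability in rank_word_groups(cooccurrence_probabilities(docs)):
        print(pair, probability)
```

A ranked list of this kind lets a user or a downstream step select among the most strongly related word groups first.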

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARZAD KAZEMINEZHAD whose telephone number is (571)270-5860. The examiner can normally be reached 10:30 am to 11:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DANIEL C WASHBURN, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Farzad Kazeminezhad/
Art Unit 2657
August 13th 2022.