DETAILED ACTION
The following is a non-final, first office action in response to the application filed March 24, 2020.  Claims 1-13 are currently pending and have been examined.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.



Claims 1-3 and 10-13 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Juneja et al. (US 2017/0169103 A1).

Regarding claims 1, 12, and 13, Juneja discloses a learning system, comprising at least one processor configured to: 
cause a learner, which is configured to classify symbol information included in each of a plurality of documents, to learn based on training data indicating an attribute value of each of a plurality of attributes; input each of the plurality of documents to the learner to acquire the symbol information classified by the learner as an attribute value candidate (Juneja: Figure 1 - attributes and their values in training documents 3);
determine whether a symbol or a symbol string indicated by the attribute value candidate satisfies a predetermined condition (Juneja: Figure 1 - does a candidate match user specified value after applying a canonical mapping 6);
control, based on a determination result obtained by the determination means, additional learning by the learner using the attribute value candidate (Juneja: Figure 1 - train model 11).
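
For illustration only, the sequence of operations recited in claim 1 may be sketched in Python as follows. The sketch is not taken from Juneja or from the application and does not characterize the scope of the claims. Every name in it (ToyLearner, additional_learning_round, satisfies_condition) is hypothetical, and the "learner" is a deliberately trivial stand-in that treats the text after "attribute:" on a line as a candidate value.

```python
from typing import Callable, Dict, List, Set, Tuple

Candidate = Tuple[str, str]  # (attribute name, attribute value candidate)


class ToyLearner:
    """Hypothetical stand-in for a learner; not Juneja's model."""

    def __init__(self) -> None:
        self.known_attributes: Set[str] = set()

    def learn(self, training_data: Dict[str, List[str]]) -> None:
        # "Training" here only records which attributes have labeled values.
        self.known_attributes = {a for a, vals in training_data.items() if vals}

    def classify(self, document: str) -> List[Candidate]:
        # Treat the text after "attribute:" on a line as a candidate value.
        found: List[Candidate] = []
        for line in document.splitlines():
            name, sep, value = line.partition(":")
            if sep and name.strip() in self.known_attributes:
                found.append((name.strip(), value.strip()))
        return found


def additional_learning_round(
    learner: ToyLearner,
    documents: List[str],
    training_data: Dict[str, List[str]],
    satisfies_condition: Callable[[str], bool],
) -> List[Candidate]:
    # Step 1: cause the learner to learn based on the training data.
    learner.learn(training_data)
    # Step 2: input each document and collect attribute value candidates.
    candidates = [c for doc in documents for c in learner.classify(doc)]
    # Step 3: determine whether each candidate satisfies the condition.
    accepted = [(a, v) for a, v in candidates if satisfies_condition(v)]
    # Step 4: control additional learning based on the determination result.
    for attribute, value in accepted:
        values = training_data.setdefault(attribute, [])
        if value not in values:
            values.append(value)
    if accepted:
        learner.learn(training_data)
    return accepted
```

In this sketch a candidate that fails the condition is simply never added to the training data, so it cannot influence the additional learning step.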

Regarding claim 2, Juneja discloses all of the limitations as noted above in claim 1.  Juneja further discloses that the at least one processor is configured to restrict an addition of an attribute value candidate for which the determination result is not a predetermined result to the training data as a new attribute value, and restrict additional learning by the learner using the attribute value candidate (Juneja: Figure 1 - a score of zero is assigned to the candidate strings 9).

Regarding claim 3, Juneja discloses all of the limitations as noted above in claim 1.  Juneja further discloses that the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate has less than a predetermined number of characters (Juneja: paragraph [0039] - The spatial context of a string could include, for example, a certain number of lines above the string, a certain number of lines below the string, a certain number of characters to the left of the string and a certain number of characters to the right of the string).

Regarding claim 10, Juneja discloses all of the limitations as noted above in claim 1.  Juneja further discloses: 
cause a first learner to learn based on the training data, input each of the plurality of documents to the first learner to acquire, as a first attribute value candidate, symbol information to which an attribute has been assigned by the first learner, cause a second learner to learn based on the first attribute value candidate, input each of a plurality of documents to the second learner to acquire, as a second attribute value candidate, symbol information to which an attribute has been assigned by the second learner (Juneja: Figure 1 - attributes and their values in training documents 3);
determine whether the symbol or the symbol string indicated by each of the first attribute value candidate and the second attribute value candidate satisfies the predetermined condition (Juneja: Figure 1 - does a candidate match user specified value after applying a canonical mapping 6);
control additional learning using each of the first attribute value candidate and the second attribute value candidate based on a determination result (Juneja: Figure 1 - train model 11).

Regarding claim 11, Juneja discloses all of the limitations as noted above in claim 1.  Juneja further discloses wherein the at least one processor is configured to input, to the second learner, each of a plurality of documents different from the plurality of documents input to the first learner (Juneja: Figure 1, 3 - repeat).




Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1.	Determining the scope and contents of the prior art.
2.	Ascertaining the differences between the prior art and the claims at issue.
3.	Resolving the level of ordinary skill in the pertinent art.
4.	Considering objective evidence present in the application indicating obviousness or nonobviousness.



Claims 4, 6, 8, and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Juneja et al. (US 2017/0169103 A1) in view of Yamamoto (US 2013/0124439 A1).

Regarding claim 4, Juneja discloses all of the limitations as noted above in claim 1.  Juneja does not expressly disclose that the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a specific type of symbol or symbol string.
Yamamoto discloses that at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a specific type of symbol or symbol string (Yamamoto: paragraph [0060] - In the example of FIG. 3, the positive example solution request pattern storage unit 30 uses two tables that are a pattern type table 201 and a pattern table 202, to store positive example solution request patterns).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a specific type of symbol or symbol string, as taught by Yamamoto, because doing so would provide a method of extracting positive examples in machine learning (Yamamoto: paragraph [0013]).

Regarding claim 6, Juneja discloses all of the limitations as noted above in claim 1.   Juneja does not expressly disclose wherein the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on an appearance frequency of the attribute value candidate.  
Yamamoto discloses wherein the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on an appearance frequency of the attribute value candidate (Yamamoto: paragraph [0113] -  Next, the identification information specification unit 102 removes each word having a low co-occurrence frequency with the problem evoking expression).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on an appearance frequency of the attribute value candidate, as taught by Yamamoto, because doing so would provide a method of extracting positive examples in machine learning (Yamamoto: paragraph [0013]).

Regarding claim 8, Juneja discloses all of the limitations as noted above in claim 1.   Juneja does not expressly disclose wherein the at least one processor is configured to generate initial data of the training data by extracting, from each of the plurality of documents, symbol information written in a predetermined notation pattern as an attribute value.  
Yamamoto discloses wherein the at least one processor is configured to generate initial data of the training data by extracting, from each of the plurality of documents, symbol information written in a predetermined notation pattern as an attribute value (Yamamoto: paragraph [0014] - solution request sentence set acquisition means for acquiring, using a positive example solution request pattern representing a positive example of a sentence including the problem evoking expression and a negative example solution request pattern representing an opposite request to the positive example solution request).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that the at least one processor is configured to generate initial data of the training data by extracting, from each of the plurality of documents, symbol information written in a predetermined notation pattern as an attribute value, as taught by Yamamoto, because doing so would provide a method of extracting positive examples in machine learning (Yamamoto: paragraph [0013]).

Regarding claim 9, Juneja discloses all of the limitations as noted above in claim 1.  Juneja does not expressly disclose that the at least one processor is configured to generate the initial data by acquiring an appearance frequency of each of a plurality of notation patterns from each of the plurality of documents, and extracting, as an attribute value, symbol information written in a notation pattern appearing in a predetermined frequency or more.
Yamamoto discloses that at least one processor is configured to generate the initial data by acquiring an appearance frequency of each of a plurality of notation patterns from each of the plurality of documents, and extracting, as an attribute value, symbol information written in a notation pattern appearing in a predetermined frequency or more (Yamamoto: paragraph [0017] - The identification information specification unit 102 may use a method of storing a dictionary containing a value of each co-occurrence frequency between words beforehand so that the co-occurrence frequency can be easily obtained using $x$ and the word i as a key).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that the at least one processor is configured to generate the initial data by acquiring an appearance frequency of each of a plurality of notation patterns from each of the plurality of documents, and extracting, as an attribute value, symbol information written in a notation pattern appearing in a predetermined frequency or more, as taught by Yamamoto, because doing so would provide a method of extracting positive examples in machine learning (Yamamoto: paragraph [0013]).
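
For illustration only, the limitation of claim 9 (counting how often each notation pattern appears and extracting attribute values only from patterns that appear at a predetermined frequency or more) may be sketched in Python as follows. The sketch is not drawn from Yamamoto, Juneja, or the application. The function name, the pattern names, and the use of two-group regular expressions as "notation patterns" are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Dict, List


def build_initial_training_data(
    documents: List[str],
    notation_patterns: Dict[str, str],
    min_frequency: int,
) -> Dict[str, List[str]]:
    """Extract (attribute, value) pairs from sufficiently frequent patterns.

    Each hypothetical notation pattern is a regular expression with exactly
    two groups: the attribute name and the attribute value.
    """
    frequency: Counter = Counter()
    matches: Dict[str, List[tuple]] = {}
    for name, pattern in notation_patterns.items():
        # Acquire the appearance frequency of this pattern over all documents.
        found = [m for doc in documents for m in re.findall(pattern, doc)]
        frequency[name] = len(found)
        matches[name] = found

    training_data: Dict[str, List[str]] = {}
    for name, found in matches.items():
        # Keep only patterns appearing at the predetermined frequency or more.
        if frequency[name] < min_frequency:
            continue
        for attribute, value in found:
            values = training_data.setdefault(attribute, [])
            if value not in values:
                values.append(value)
    return training_data
```

With a threshold of two, a pattern that matches only once across the documents contributes nothing to the initial training data, whereas a pattern that matches three times contributes every value it captured.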

Claims 5 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Juneja et al. (US 2017/0169103 A1) in view of Urainczyk et al. (US 2006/0143175 A1).

Regarding claim 5, Juneja discloses all of the limitations as noted above in claim 1.   Juneja does not expressly disclose wherein each of the plurality of documents is written in a markup language, and wherein the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a tag portion.  
Urainczyk discloses wherein each of the plurality of documents is written in a markup language, and wherein the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a tag portion (Urainczyk: paragraph [0029] - 2) introduce BE-tags into the eXtended Markup Language (XML) form of the document. BE-tags are XML tags denoting entity types, including such things as organizations, place names, people's names, and domain terms, inserted around the text that has been recognized by the feature finder).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that each of the plurality of documents is written in a markup language and the at least one processor is configured to determine whether the symbol or the symbol string indicated by the attribute value candidate is a tag portion, as taught by Urainczyk, because doing so would provide an effective method for classifying text (Urainczyk: paragraph [0011]).

Regarding claim 7, Juneja discloses all of the limitations as noted above in claim 1.   Juneja does not expressly disclose wherein the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on a probability of the attribute value candidate, which is calculated by the learner.  
Urainczyk discloses wherein the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on a probability of the attribute value candidate, which is calculated by the learner (Urainczyk: paragraph [0052] -  In the event a user does not specify the weight, it will be assigned based on the feature's distribution in the training data (possibly modulated by its prior probability of occurrence in the language). In the absence of training data, the weight will be assigned based on the feature's distribution in the REE (again, possibly modulated by its prior probability of occurrence in the language)).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and apparatus of Juneja so that the at least one processor is configured to control additional learning by the learner using the attribute value candidate based further on a probability of the attribute value candidate, which is calculated by the learner, as taught by Urainczyk, because doing so would provide an effective method for classifying text (Urainczyk: paragraph [0011]).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20210073532 A1, Torres et al discloses METAMODELING FOR CONFIDENCE PREDICTION IN MACHINE LEARNING BASED DOCUMENT EXTRACTION.
PTO-892 Reference U discloses deep learning for symbols detection and classification.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATHLEEN G PALAVECINO whose telephone number is (571)270-1355.  The examiner can normally be reached on M-F 9-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey Smith, can be reached at 571-272-6763.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


KATHLEEN GAGE PALAVECINO
Primary Examiner
Art Unit 3625


/KATHLEEN PALAVECINO/
Primary Examiner, Art Unit 3625