DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed on 2/10/2021 have been fully considered but they are not persuasive.
Examiner thanks the Applicant for the clarification set forth in the claim amendments and the arguments. Although the issues raised in the prior rejection under 35 USC 112 appear to be resolved, further clarification is required with regard to the amendments, as outlined in the rejection under 35 USC 112 below. Additionally, Applicant is advised to distinctly claim what the “numerical spatial feature” is and how it is related to the “plurality of numerical values indicating the respective particular words and the hierarchical spatial information”. For purposes of examination, Examiner interprets the “plurality of numerical values” indicating the “hierarchical spatial information” as values that can overlap with the “plurality of numerical values” indicating “the respective particular words”. Stated another way, they are not mutually exclusive.
As provided in the claim rejection below, Calapodescu teaches generating, by the processor of the computer system, a numerical spatial feature based on the encoded plurality of spatial features, wherein the numerical spatial feature includes a plurality of numerical values indicating the respective particular words and the hierarchical spatial information ([0053]-[0055], teaching that a cluster is created based on multiple chunks of text; [0093]-[0095]; [0093] To determine whether each cluster is complete an analysis of the numbers of entities of each type in a given section may be performed. For example, a correlation coefficient (e.g., Pearson's product-moment coefficient) may be computed for co-occurrence of entities in pairs of types (e.g., selected from number of job titles, number of company names, number of dates and so forth). If the coefficient is high, this indicates that these entity types can be expected to be found in the same cluster for each cluster in the section.). Examiner notes this step corresponds to S114 of Fig. 3 of Calapodescu. It is evident that the “numerical spatial feature” of step S114 is generated based on the plurality of spatial features from the prior steps of Fig. 3, such as the extracted entities and their locations in a document section ([0069]).
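For illustration only, the correlation analysis quoted from [0093] of Calapodescu can be sketched as follows. This is a minimal sketch and not the implementation of either reference: the function name `pearson` and the per-section entity counts are hypothetical and were chosen solely to show how a Pearson product-moment coefficient over counts of two entity types yields a numerical value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical counts of each entity type in four document sections
# (illustrative data only, not taken from Calapodescu).
job_title_counts = [3, 2, 4, 1]
company_name_counts = [3, 2, 4, 1]
date_counts = [1, 4, 2, 3]

# Job titles and company names co-occur in lockstep, so the coefficient is high.
r_title_company = pearson(job_title_counts, company_name_counts)
# Dates do not track job titles in this made-up data, so the coefficient is low.
r_title_date = pearson(job_title_counts, date_counts)
```

Under these assumed counts, a high coefficient for the job title and company name pair would suggest, in the sense of [0093], that the two entity types can be expected in the same cluster.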
The analogous art Lee further supports the rejection by teaching a method of segmenting and classifying multimedia material by extracting elements from the material (see abstract), and teaches generating, by the processor of the computer system, a numerical spatial feature based on the encoded plurality of spatial features ([0070] The first step includes loading relevant configurations and target media files to be analyzed into the system in block 211. At block 212, the multimedia elements (e.g., text, figures, logos, images, etc.) of the multimedia content are identified and extracted, and then the extracted elements are grouped into objects. As described herein, in some embodiments, extracting and/or identifying one or more elements from the multimedia material and forming objects comprising the elements includes one or more of (i) identifying an individual element within the multimedia material; (ii) determining a location of the individual element within the multimedia material; and (iii) inferring at least one semantic relationship between the individual element and at least one other element within a threshold distance from the determined location of the individual element based at least in part on an application of rules from a semantic application rule database. In some embodiments the extraction function additionally or alternatively include extracting one or more elements from the multimedia material and forming one or more objects comprising the elements based at least in part on one or both of a location and a size of the individual elements.
In some embodiments, extracting one or more elements from the multimedia material and forming one or more objects comprising the elements based at least in part on one or both of a location and a size of the individual elements includes: (i) identifying an individual element within the multimedia material; (ii) determining a location and a size of the individual element within the multimedia material; and (iii) inferring at least one relationship between the individual element and at least one other element based at least in part on a distance between the individual element and the at least one other element.). See also [0004] of Lee: An object within a multimedia material is an element with some attribute. For instance, an object could be a piece of text with one or more attributes like font type, font size, color, etc. Similarly, an object could be an image with one or more attributes like size, location, sentiment like brightness, number of colors, resolution, etc.
For at least these reasons, Examiner maintains that the prior art of record fully teaches the claims.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-9 and 11-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
In claims 1, 9, 11, and 19, it appears that the limitation “plurality of spatial features” should be changed to “plurality of pieces of spatial information”. Clarification is required. Claims 2-9 and 12-19 are also rejected because they depend on the unclear claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9 and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Calapodescu et al. (US 2017/0300565) in view of Lee et al. (US 2019/0065911).
Claim 1
Calapodescu teaches a method, comprising: 
receiving, by a processor of a computer system, a text stream of data derived by an optical character recognition process from an image of a piece of content ([0040] The system 30 receives as input a text document 10, such as a resume, for processing. The resume may be an electronic document such as a Word document, or a scanned paper document. In the case of a scanned document, the document may be preprocessed, either by the system or elsewhere, using optical character recognition to extract the text content.); 
detecting, by the processor of the computer system, a plurality of pieces of spatial information associated with the piece of content ([0042] The segmentation component 50 (optional) segments the document 10 into sections 18, e.g., based on document structure, such as lines between rows of text, section titles, paragraph breaks, combinations thereof, and the like. In one embodiment, the section starts are identified using a categorizer which is applied on each experience, education, skills, other, and none). In one embodiment the categorizer is trained to detect the beginning of each section, for example, by classifying each line as being a title or not.), 
wherein the plurality of pieces of spatial information represent hierarchical spatial information indicative of relative positions of the plurality of pieces of spatial information within the piece of content ([0019] In accordance with another aspect of the exemplary embodiment, a method for extracting entities from a resume includes segmenting the resume into sections. A first set of entities and respective entity class labels is extracted from the section with at least one of grammar rules, a probabilistic model, and a lexicon. At least a subset of the extracted entities in the first set is clustered into clusters, based on locations of the entities in the resume. [0036] FIG. 1 illustrates part of a loosely-structured document 10, such as a resume. Entities 12, 14, 16, etc., to be extracted,…tend to be grouped or clustered together (location wise). In this example, four clusters 20, 22, 24, 26 of entities are seen and, in each entity cluster, the entities are semantically related (i.e., the date corresponds to the job title and to the company name of the cluster).); 
encoding, by the processor of the computer system, each of the plurality of pieces of spatial information into a respective token associated with a respective particular word, each respective token being indicative of the respective particular word and a respective portion of the hierarchical spatial information ([0043] Component 52 may be a conventional entity extraction component which has been trained to label entities 12, 14, 16, etc., in the text that fall within one of a predefined set of entity classes, such as at least two or at least three or at least four entity classes. The entity extraction component 50 may be a rule based extraction system or based on a tokenizing the text to form a sequence of tokens (generally, words), applying morphological tags (in particular, parts of speech, such as noun, verb, adjective, etc.), identifying noun clusters, and labeling the nouns/noun clusters with entity tags using, for example, an entity resource, such as a lexicon 60 of entities, in which the entities may each be labeled according to one of a predefined set of entity classes. In this case, the extraction of the first set of entities may include accessing the lexicon of entities and identifying text sequences (one or more tokens) in the section which each match a respective entity in the lexicon; Examiner notes, as provided above, [0036] Entities 12, 14, 16, etc., to be extracted,…tend to be grouped or clustered together (location wise). [0048]-[0052], [0048] The first entity extraction component 52 also identifies a location of each of the extracted entities, e.g., with offset precision or other location indicator. For example, each character (including spaces between tokens) is indexed in sequence. Each entity can then be located by its first index and its length. For example, entity 14, classed as City/State, may have the location: index 19, length 13.
The output of the first entity extraction component 52 is a list 61 of extracted entities and respective entity classes and associated location information. Examiner notes each entity corresponds to a token with spatial features such as location indicators. See also [0107]-[0113], [0107] The chunker component 65 may employ tokenization and syntactic and semantic features provided by a syntactic parser 52, such as the Xerox Incremental Parser (XIP), in addition to the results of the first extraction component 52 to split the text into chunks. In the exemplary embodiment, a chunk can be:); 
generating, by the processor of the computer system, a numerical spatial feature based on the encoded plurality of spatial features, wherein the numerical spatial feature includes a plurality of numerical values indicating the respective particular words and the hierarchical spatial information ([0053]-[0055], teaching that a cluster is created based on multiple chunks of text; [0093]-[0095]; [0093] To determine whether each cluster is complete an analysis of the numbers of entities of each type in a given section may be performed. For example, a correlation coefficient (e.g., Pearson's product-moment coefficient) may be computed for co-occurrence of entities in pairs of types (e.g., selected from number of job titles, number of company names, number of dates and so forth). If the coefficient is high, this indicates that these entity types can be expected to be found in the same cluster for each cluster in the section.); 
inputting, by the processor of the computer system, the numerical spatial feature into a machine learning model ([0057], [0154]-[0155]; S128 of Fig. 3; [0154] The entities extracted from the resumes can be used to learn a classifier model, e.g., a binary classifier, for distinguishing between good and bad resumes.); and 
performing, by the processor of the computer system, the machine learning model using the numerical spatial feature ([0058]-[0059], [0077]; [0058] The output component 58 outputs information 86 extracted by the first and second entity extraction components 52, 54, such as the sequences of entities and their classes, e.g., in table format or other structured data format. In another embodiment, the information output may be an annotated version of the input document 10. [0077] At S130, information 86 may be output by the system, based on the entities identified in at least the second pass. For example resumes may be associated with metadata for each of the identified entities which can be used to highlight (e.g. with different colors) or otherwise distinguish the entities when shown on the GUI 45. Relevant documents, as identified by the machine learning model, may be displayed first or shown in a cluster.).  
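For illustration only, the location scheme quoted from [0048] of Calapodescu (each character indexed in sequence, each entity located by its first index and its length) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the names `EntitySpan` and `locate_entities`, the sample text, and the two-entry lexicon are hypothetical. The sample text was constructed so that the City/State entity lands at index 19 with length 13, the values given in [0048].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySpan:
    """An extracted entity with its class label and character-offset location."""
    text: str
    entity_class: str
    index: int   # offset of the first character (spaces are counted)
    length: int  # number of characters in the entity


def locate_entities(document, lexicon):
    """Find every occurrence of each lexicon entry and record index and length."""
    spans = []
    for surface, entity_class in lexicon.items():
        start = document.find(surface)
        while start != -1:
            spans.append(EntitySpan(surface, entity_class, start, len(surface)))
            start = document.find(surface, start + 1)
    # Order the spans by position in the document.
    return sorted(spans, key=lambda span: span.index)


# Hypothetical resume line and lexicon (illustrative only).
document = "Software Engineer, Rochester, NY, 2010-2014"
lexicon = {
    "Software Engineer": "JobTitle",
    "Rochester, NY": "City/State",
}

spans = locate_entities(document, lexicon)
city = next(span for span in spans if span.entity_class == "City/State")
```

Each resulting span pairs a word sequence with numerical location values, which is the sense in which Examiner notes that each entity corresponds to a token with a location indicator.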

Lee also teaches a method of segmenting and classifying multimedia material by extracting elements from the material (see abstract), and teaches generating, by the processor of the computer system, a numerical spatial feature based on the encoded plurality of spatial features ([0070] The first step includes loading relevant configurations and target media files to be analyzed into the system in block 211. At block 212, the multimedia elements (e.g., text, figures, logos, images, etc.) of the multimedia content are identified and extracted, and then the extracted elements are grouped into objects. As described herein, in some embodiments, extracting and/or identifying one or more elements from the multimedia material and forming objects comprising the elements includes one or more of (i) identifying an individual element within the multimedia material; (ii) determining a location of the individual element within the multimedia material; and (iii) inferring at least one semantic relationship between the individual element and at least one other element within a threshold distance from the determined location of the individual element based at least in part on an application of rules from a semantic application rule database. In some embodiments the extraction function additionally or alternatively include extracting one or more elements from the multimedia material and forming one or more objects comprising the elements based at least in part on one or both of a location and a size of the individual elements.
In some embodiments, extracting one or more elements from the multimedia material and forming one or more objects comprising the elements based at least in part on one or both of a location and a size of the individual elements includes: (i) identifying an individual element within the multimedia material; (ii) determining a location and a size of the individual element within the multimedia material; and (iii) inferring at least one relationship between the individual element and at least one other element based at least in part on a distance between the individual element and the at least one other element.). See also [0004] of Lee: An object within a multimedia material is an element with some attribute. For instance, an object could be a piece of text with one or more attributes like font type, font size, color, etc. Similarly, an object could be an image with one or more attributes like size, location, sentiment like brightness, number of colors, resolution, etc. Examiner notes spatial features such as location and size of individual elements, for example, can have numerical values.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate the generation of the spatial feature as taught by Lee into the entity extraction method of Calapodescu, because doing so would have provided a way to improve the accuracy of classification information ([0033] of Lee).
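For illustration only, the observation that the location and size of individual elements can have numerical values, and that a relationship may be inferred from the distance between elements as quoted from [0070] of Lee, can be sketched as follows. This is a minimal sketch under stated assumptions, not Lee's implementation: the `Element` class, the bounding-box representation, the use of center-to-center Euclidean distance, and the threshold value are all hypothetical choices.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Element:
    """An extracted element with an assumed bounding-box location and size."""
    name: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def as_vector(self):
        """Location and size expressed as a list of numerical values."""
        return [self.x, self.y, self.width, self.height]

    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def related_pairs(elements, threshold):
    """Return name pairs whose centers lie within the threshold distance."""
    pairs = []
    for a, b in combinations(elements, 2):
        (ax, ay), (bx, by) = a.center(), b.center()
        if hypot(ax - bx, ay - by) <= threshold:
            pairs.append((a.name, b.name))
    return pairs


# Hypothetical elements on a page (illustrative coordinates only).
elements = [
    Element("logo", 0, 0, 10, 10),
    Element("caption", 0, 12, 10, 4),
    Element("footer", 0, 100, 10, 4),
]

pairs = related_pairs(elements, threshold=20)
```

With these assumed coordinates, only the logo and the caption fall within the threshold, so only that pair would be grouped into one object.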
Claim 2
Calapodescu in view of Lee further teaches the method of claim 1, wherein detecting the plurality of pieces of spatial information further comprises detecting, by the processor of the computer system, at least one empty cell in the piece of content ([0127] of Calapodescu, incomplete clusters).  
Claim 3
Calapodescu in view of Lee further teaches the method of claim 2, wherein detecting the at least one empty cell further comprises inserting, by the processor of the computer system, an empty cell placeholder into the detected empty cell ([0117] of Lee, The checker flags missing information as part of the metadata.).  


Claim 4

Claim 5
Calapodescu in view of Lee further teaches the method of claim 4, wherein performing an information extraction machine learning process further comprises receiving a text stream from an optical character recognition process of a form and extracting words from the form using the information extraction machine learning process ([0003] of Calapodescu, the content of the sections may have many different forms (list of structured paragraphs, tables, full sentences or list of words).).
Claim 6
Calapodescu in view of Lee further teaches the method of claim 5, wherein performing an information extraction machine learning process further comprises using a conditional random field machine learning model to extract data from the form ([0004] of Calapodescu, The UIMA Ruta rule-based system for information extraction and general natural language processing is described as well as machine learning techniques based on Conditional Random Fields (CRF) and extensions.).  
Claim 7
Calapodescu in view of Lee further teaches the method of claim 5, wherein performing an information extraction machine learning process further comprises using a bidirectional long short term memory and conditional random field machine learning models to extract data from 
Claim 8
Calapodescu in view of Lee further teaches the method of claim 1, wherein the hierarchical spatial information further comprises spatial information about a page of the piece of content, spatial information about a cell in the page of the piece of content, spatial information about a paragraph in the cell of the piece of content, spatial information about a line in the paragraph of the piece of content and spatial information about a word in the line of the piece of content ([0003] of Calapodescu, While a resume is a well-defined document, with fairly standard sections (personal information, education, experience, etc.), the format and presentation may vary widely. Also, multiple file formats are possible (PDF, Microsoft Office Word document, text file, html, etc.), the order of the sections may vary (e.g., the education section may be at the beginning or at the end), and the content of the sections may have many different forms (list of structured paragraphs, tables, full sentences or list of words). [0042] of Calapodescu, The segmentation component 50 (optional) segments the document 10 into sections 18, e.g., based on document structure, such as lines between rows of text, section titles, paragraph breaks, combinations thereof, and the like.).  

Claim 9
 [0018] of Lee Image annotation (also known as automatic image tagging or linguistic indexing) is a process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image.  Metadata is "data that provides information about other data.").  


Claims 11-19
These claims recite substantially the same limitations as those provided in claims 1-9 above respectively and therefore they are rejected for the same reasons. Regarding neural network . 

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THOMAS H MAUNG whose telephone number is (571)270-5690.  The examiner can normally be reached on Monday-Friday, 9am-6pm, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/THOMAS H MAUNG/
Primary Examiner, Art Unit 2654