DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 2, 3, 6-14, 21 and 22 are allowed.

Statement of Reasons for Allowance
Regarding claim 2, the following is an Examiner’s statement of reasons for allowance:
Claessens (US 20150178786, hereinafter “Claessens”), teaches
“A method for document feature extraction, comprising: receiving a document in a digital format, wherein the digital format comprises text information and image information (the publisher 104 may include content providers. For example, content providers may include those with an internet presence, such as online publication and news providers (e.g., online newspapers, online magazines, television websites, etc.), and online service providers (e.g., photo sharing sites, video sharing sites, social networks, etc.), Paras. [0068]-[0072]), performing a text extraction function on a first portion of the document to produce a set of text features (recognize or otherwise determine information about the central theme of the image (e.g., "kitchen", "bathroom", "women's apparel", "wedding") and other relevant information of the image, through an analysis of text data 302, metadata 303, image data 304, or any combination thereof, Para. [0092]), performing an image extraction function on a second portion of the document to produce a set of image features (recognize or otherwise determine information about the central theme of the image (e.g., "kitchen", "bathroom", "women's apparel", "wedding") and other relevant information of the image, through an analysis of text data 302, metadata 303, image data 304, or any combination thereof, Para. [0092])”.
Kalenkov (US 20190294921, hereinafter “Kalen”), teaches, 
“A method for document feature extraction comprising generating a document matrix based on the feature tree (method 300 generates a three dimensional feature matrix representing a portion of the image comprising the first field and an associated local context. In one embodiment, text field identification engine 112 performs a number of processing operations on the document image 200 to extract a number of features for input into machine learning models 114, Para. [0044]), wherein each element of the document matrix corresponds to a location within one or more zones corresponding to the plurality of nodes of the feature tree (the first dimension of the matrix may be a height measurement representing a relative position along Y-axis (e.g., a specified line), the second dimension of the matrix may be a width measurement representing a relative position in the specified line along the X axis (e.g., a particular cell), and the third dimension of the matrix may be a feature vector representing values extracted from the X-Y location the document image 200 and arranged in a certain order, Para. [0044] and also, image processing and generation of the three dimensional feature matrix are provided below with respect to FIGS. 4-6) and generating an input vector for a machine learning model based at least in part on the document matrix (At block 330, method 300 provides the three dimensional feature matrix as an input to one or more of trained machine learning models 114, Para. [0045]).”
However, Claessens and Kalen, whether taken alone or in combination, do not teach or suggest the following novel features, “A method for document feature extraction comprising generating a feature tree, wherein each of a plurality of nodes of the feature tree comprises a zone including position information, size information, and a zone type selected from the set of text features and the set of image features; generating a document matrix based on the feature tree, by associating each element of the document matrix with a zone type of a corresponding node of the feature tree based on the position information and the size information”, in combination with all the recited limitations of claim 2.
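
For illustration only, and not as a characterization of claim scope or of Applicant's disclosure, the limitation quoted above can be pictured with the following minimal Python sketch. It assumes a grid of integer cells, three hypothetical zone-type codes (EMPTY, TEXT, IMAGE), and a rule that child zones overwrite their parents. The names ZoneNode, build_document_matrix, and to_input_vector are invented for this sketch and do not appear in the application or in the cited references.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical zone-type codes; the claim does not enumerate them.
EMPTY, TEXT, IMAGE = 0, 1, 2


@dataclass
class ZoneNode:
    """One node of the feature tree: a zone with position, size, and type."""
    x: int            # position: column of the top-left corner, in grid cells
    y: int            # position: row of the top-left corner, in grid cells
    width: int        # size: number of columns covered
    height: int       # size: number of rows covered
    zone_type: int
    children: List["ZoneNode"] = field(default_factory=list)


def build_document_matrix(root: ZoneNode, rows: int, cols: int) -> List[List[int]]:
    """Associate each matrix element with the zone type of the node covering it.

    Assumption: children are painted after their parent, so the most deeply
    nested zone covering a cell determines that cell's value.
    """
    matrix = [[EMPTY] * cols for _ in range(rows)]

    def paint(node: ZoneNode) -> None:
        # Clip the zone to the matrix bounds before writing its type.
        for r in range(max(node.y, 0), min(node.y + node.height, rows)):
            for c in range(max(node.x, 0), min(node.x + node.width, cols)):
                matrix[r][c] = node.zone_type
        for child in node.children:
            paint(child)

    paint(root)
    return matrix


def to_input_vector(matrix: List[List[int]]) -> List[int]:
    """Flatten the document matrix row by row into a model input vector."""
    return [cell for row in matrix for cell in row]


# A 3x4 page with a one-row text zone on top and a 2x2 image zone below it.
page = ZoneNode(0, 0, 4, 3, EMPTY, children=[
    ZoneNode(0, 0, 4, 1, TEXT),
    ZoneNode(1, 1, 2, 2, IMAGE),
])
document_matrix = build_document_matrix(page, rows=3, cols=4)
input_vector = to_input_vector(document_matrix)
```

In this sketch each matrix element is determined solely by the position and size stored in the tree nodes, which is the association the quoted limitation recites; the flattening step is only one of several possible ways to derive an input vector.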

Regarding claim 21, the following is an Examiner’s statement of reasons for allowance:
Claessens (US 20150178786, hereinafter “Claessens”), teaches
“A method for document feature extraction comprising receiving a document in a digital format, wherein the digital format comprises text information and image information (the publisher 104 may include content providers. For example, content providers may include those with an internet presence, such as online publication and news providers (e.g., online newspapers, online magazines, television websites, etc.), and online service providers (e.g., photo sharing sites, video sharing sites, social networks, etc.), Paras. [0068]-[0072]), performing a text extraction function on a first portion of the document to produce a set of text features (recognize or otherwise determine information about the central theme of the image (e.g., "kitchen", "bathroom", "women's apparel", "wedding") and other relevant information of the image, through an analysis of text data 302, metadata 303, image data 304, or any combination thereof, Para. [0092]), performing an image extraction function on a second portion of the document to produce a set of image features (recognize or otherwise determine information about the central theme of the image (e.g., "kitchen", "bathroom", "women's apparel", "wedding") and other relevant information of the image, through an analysis of text data 302, metadata 303, image data 304, or any combination thereof, Para. [0092]), generating a feature tree, wherein a plurality of nodes of the feature tree correspond to the set of text features and the set of image features (System 220 may contain an indexer 400, which may use an image data indexer 410 to collect and index information from the actual content of non-text files (e.g., image data 304, 314 and/or 324), through the use of a recognition component, which may employ one or more of the many available recognition techniques to identify low level and high level features, such as shapes, patterns, colors, faces, local or global features and/or other visual information, contained in an image, Paras. [0108-0111])”.
Kalenkov (US 20190294921, hereinafter “Kalen”), teaches, 
“A method for document feature extraction comprising generating a document matrix based on the feature tree (method 300 generates a three dimensional feature matrix representing a portion of the image comprising the first field and an associated local context. In one embodiment, text field identification engine 112 performs a number of processing operations on the document image 200 to extract a number of features for input into machine learning models 114, Para. [0044]), wherein each element of the document matrix corresponds to a location within one or more zones corresponding to the plurality of nodes of the feature tree (the first dimension of the matrix may be a height measurement representing a relative position along Y-axis (e.g., a specified line), the second dimension of the matrix may be a width measurement representing a relative position in the specified line along the X axis (e.g., a particular cell), and the third dimension of the matrix may be a feature vector representing values extracted from the X-Y location the document image 200 and arranged in a certain order, Para. [0044] and also, image processing and generation of the three dimensional feature matrix are provided below with respect to FIGS. 4-6) and generating an input vector for a machine learning model based at least in part on the document matrix (At block 330, method 300 provides the three dimensional feature matrix as an input to one or more of trained machine learning models 114, Para. [0045]).”
However, Claessens and Kalen, whether taken alone or in combination, do not teach or suggest the following novel features, “A method for document feature extraction comprising identifying two or more features from the set of text features and the set of image features corresponding to the location; identifying a feature type for each of the two or more features; identifying a feature weight for each of the identified feature types; selecting a single feature from the two or more features based on a comparison of the feature weights, wherein an element of the document is based on the selected single feature”, in combination with all the recited limitations of claim 21.
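
For illustration only, and not as a characterization of claim scope or of Applicant's disclosure, the weight-based selection quoted above can be pictured with the following minimal Python sketch. The feature-type names, the weight values, and the function name select_feature are all assumptions made for this sketch; the claim requires only that each identified feature type have a weight and that the weights be compared.

```python
from typing import Dict, List, Tuple

# Hypothetical weights per feature type; the values are invented for this sketch.
FEATURE_WEIGHTS: Dict[str, float] = {
    "image": 3.0,
    "heading": 2.0,
    "paragraph": 1.0,
}

# A candidate is a (feature_type, feature_id) pair found at one location.
Feature = Tuple[str, str]


def select_feature(candidates: List[Feature],
                   weights: Dict[str, float] = FEATURE_WEIGHTS) -> Feature:
    """Pick the single candidate whose feature type carries the largest weight.

    Assumption: on a tie, the candidate listed first is kept (the behavior
    of Python's built-in max).
    """
    if not candidates:
        raise ValueError("at least one candidate feature is required")
    return max(candidates, key=lambda feature: weights[feature[0]])


# Two features overlap the same location; the image type outweighs the paragraph type.
overlapping = [("paragraph", "p1"), ("image", "img7")]
selected = select_feature(overlapping)
```

Here the element for the location would be populated from the selected feature alone, so overlapping text and image features never contribute to the same element at once.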

Any comments considered necessary by Applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GOLAM SOROWAR whose telephone number is (571)270-3761.  The examiner can normally be reached on Mon-Fri: 8:30AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Appiah, can be reached at (571) 272-7904.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 






/GOLAM SOROWAR/           Primary Examiner, Art Unit 2641