DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.


Drawings
The drawings filed 2/12/2021 are accepted.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.




Claims 8, 10, 14, and 19 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 10 recites the limitation "the classifiers".  There is insufficient antecedent basis for this limitation in the claim. Examiner suggests making claim 10 depend on claim 9 to overcome this rejection.
Claims 8, 14, and 19 recite the limitation "the cluster tree".  There is insufficient antecedent basis for this limitation in the claim. Examiner suggests making claim 8 depend on claim 2, claim 14 depend on claim 12, and claim 19 depend on claim 17 to overcome these rejections.




Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-8, 11-14, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Duan et al (US 20110035345 A1; filed 8/10/2009) in view of Le et al (D. Le, G. R. Thoma and J. Zou, "Combining DOM tree and geometric layout analysis for online medical journal article segmentation," Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '06), 2006, pp. 119-128, doi: 10.1145/1141753.1141777.) and Rajan et al (US 20120005686 A1; filed 7/1/2010).

With regards to claim 1, Duan et al discloses a method comprising: obtaining a set of tagged… documents, wherein each of the tagged… documents includes one or more visual elements and a document object model (DOM) structure (Duan et al, paragraph 37: “Here, for example, learner 312 may be enabled to establish one or more machine learned models 318 based, at least in part, on a sample set of segmented portions 310-1, editorial input 314, and/or one or more feature properties 316.”); pre-processing each tagged… document, wherein the pre-processing organizes visual elements of the document into graphical objects (Duan et al, paragraph 45: “In certain implementations, for example, one may employ a layout and Document Object Model (DOM) based process which may use a rule based process that starts with a single segmented portion including all DOM nodes, and divides segmented portions (portions) recursively into smaller segments until a desired size is reached”); processing each tagged … document (Duan et al, paragraph 38: “classifier 320 may be enabled to classify segmented portions 310 by segment type(s) 322… a type may be any property that may be identified by a human (e.g., in the supervised learning mode);” the tagged portions of the document are being interpreted as the types assigned to portions), wherein the processing the tagged… document identifies relationships between the graphical objects and corresponding elements of the DOM structure, and generates training records corresponding to the identified relationships; training a machine learning model using the training records, wherein the training trains the machine learning model to determine DOM structure elements that are associated with graphical objects (Duan et al, paragraph 37: “Here, for example, learner 312 may be enabled to establish one or more machine learned models 318 based, at least in part, on a sample set of segmented portions 310-1, editorial input 314, and/or one or more feature properties 316.” Duan et al, paragraph 87: “some feature properties may be used to consider topical coherence of a segmented portion with respect other individual segmented portions, such as, neighboring or otherwise nearby segmented portions (e.g., on the visual layout, in a DOM, and/or all other segmented portions), or with respect to whole displayed web page where all other segmented portions are considered as one segmented portion.” The sample sets are being interpreted as the training records. Paragraph 87 describes the use of the DOM to find neighbors of the corresponding graphical objects);
obtaining a set of untagged … documents; and automatically analyzing each of the untagged… documents (Duan et al, paragraph 38: “As shown, in this example, classifier 320 may be enabled to classify segmented portions 310 based, at least in part, on one or more machine learned models 318 and/or one or more feature properties 316”), the analyzing including identifying one or more graphical objects contained in the untagged… document, determining for each of the identified graphical objects a corresponding DOM structure element using the trained machine learning model, and generating an analyzed … document corresponding to the untagged … document, wherein the analyzed … document contains the one or more graphical objects contained in the untagged … document and the corresponding DOM structure elements determined by the trained machine learning model (Duan et al, paragraph 87: “some feature properties may be used to consider topical coherence of a segmented portion with respect other individual segmented portions, such as, neighboring or otherwise nearby segmented portions (e.g., on the visual layout, in a DOM, and/or all other segmented portions), or with respect to whole displayed web page where all other segmented portions are considered as one segmented portion.”).
However, Duan et al does not disclose PDF documents… automatically tagging each of the untagged… documents… generating a tagged… document corresponding to the untagged … document, wherein the tagged … document contains the one or more graphical objects contained in the untagged … document and the corresponding DOM structure elements.
Le et al teaches PDF documents (Le et al, Introduction, paragraph 5: “we choose to convert them into HTML files by using an open source library, PDFTOHTML, [18] and then analyze the PDF-converted-HTML files”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al and Le et al such that the documents can include PDFs. This would have helped to standardize the input system since many documents are in PDF format (Le et al, Introduction, paragraph 5: “Many online journal articles are in PDF format. In order to standardize our input system and to minimize the number of modules for reading articles, we choose to convert them into HTML files”).
Rajan et al teaches automatically tagging each of the untagged… documents… generating a tagged… document corresponding to the untagged … document, wherein the tagged … document contains the one or more graphical objects contained in the untagged … document (Rajan et al, abstract: “In the approach described herein, one of a generic set functional labels are automatically assigned to each segment of a web page, where the generic functional labels may be topic-independent and application-independent.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al and Rajan et al such that the segments are labeled according to their classification. “By annotating the function of each segment of a web page, each individual application can select the segments having the labels that correspond to content that is meaningful to the application” (Rajan et al, paragraph 17).

With regards to claim 2, which depends on claim 1, Duan et al discloses wherein the pre-processing of the tagged … documents includes: parsing the tagged … document to identify visual elements of the document (Duan et al, paragraph 36: “Segmenter 306 may be enabled to automatically identify one or more initial segmented portions 310… some initial properties may include one or more layout properties that may be derived from coordinates of the DOM nodes included in candidate segments/portions”); grouping the identified visual elements into visual objects; determining, for each of the visual objects, a corresponding visual bounding box (Duan et al, paragraph 31: “FIGS. 2A and 2B also graphically show web page 100 with such example segmented portions 102-1 through 102-8… More specifically in the example context of the news agency web page, sections relating to the article title (identified as segmented portion 102-2), article text (identified as segmented portion 102-3), and summaries and links to other articles (identified as segmented portion 102-5) may, for example, be classified as of a type representing “main” content information”); determining, for each of the visual objects, whether the visual object is a foreground element or a background element (Duan et al, paragraph 82: “Certain feature properties may be related to and/or used to consider visual features of one or more segmented portions. Here, for example, feature properties may be related to and/or used to measure or otherwise consider distribution statistics relating to one or more colors of one or more objects (e.g., background objects, foreground objects)”).
However, Duan et al does not disclose PDF documents… and generating a cluster tree by performing a plurality of cuts which segment the … document into multiple visually separated pieces.
Le et al teaches PDF documents… (Le et al, Introduction, paragraph 5: “we choose to convert them into HTML files by using an open source library, PDFTOHTML, [18] and then analyze the PDF-converted-HTML files”) and generating a cluster tree by performing a plurality of cuts which segment the … document into multiple visually separated pieces (Le et al, abstract: “The Web page content is modeled by a zone tree structure based primarily on the geometric layout of the Web page. For a given journal article, a zone tree is generated by combining DOM tree analysis and recursive X-Y cut algorithm.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al and Le et al such that the documents can include PDFs. This would have helped to standardize the input system since many documents are in PDF format (Le et al, Introduction, paragraph 5: “Many online journal articles are in PDF format. In order to standardize our input system and to minimize the number of modules for reading articles, we choose to convert them into HTML files”). It also would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Le such that the graphical objects were reorganized into a zone tree (cluster tree). This would have enabled the invention to better organize and later retrieve the information using the layout information (Le, Section 3, paragraph 9: “We believe that this zone tree model is better for organizing the related information within a document, and therefore better for information retrieval compared to the DOM tree model.”).
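For illustration, the following is a minimal sketch of a recursive X-Y cut of the general kind named in the Le et al abstract, assuming each visual object has already been reduced to an axis-aligned bounding box. The names Zone, split_by_gaps, xy_cut, and leaves, and the min_gap threshold, are hypothetical; they are not taken from Le et al, Duan et al, or the claims, and the sketch omits the DOM tree analysis that Le et al combines with the cuts.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), with y increasing down the page.
Box = Tuple[float, float, float, float]


@dataclass
class Zone:
    """A node of the zone (cluster) tree: the boxes it covers and its sub-zones."""
    boxes: List[Box]
    children: List["Zone"] = field(default_factory=list)


def split_by_gaps(boxes: List[Box], axis: str, min_gap: float) -> List[List[Box]]:
    """Group boxes into runs separated by empty gaps of at least min_gap on one axis."""
    lo, hi = (0, 2) if axis == "x" else (1, 3)
    ordered = sorted(boxes, key=lambda b: b[lo])
    groups = [[ordered[0]]]
    reach = ordered[0][hi]  # farthest extent covered so far on this axis
    for b in ordered[1:]:
        if b[lo] - reach >= min_gap:
            groups.append([b])      # whitespace gap found: start a new piece
        else:
            groups[-1].append(b)    # overlaps or is too close: same piece
        reach = max(reach, b[hi])
    return groups


def xy_cut(boxes: List[Box], min_gap: float = 10.0) -> Zone:
    """Recursively cut a set of boxes, trying horizontal cuts before vertical ones."""
    zone = Zone(boxes=list(boxes))
    if len(boxes) <= 1:
        return zone
    for axis in ("y", "x"):
        groups = split_by_gaps(boxes, axis, min_gap)
        if len(groups) > 1:
            zone.children = [xy_cut(g, min_gap) for g in groups]
            break
    return zone


def leaves(zone: Zone) -> List[Zone]:
    """Collect the leaf zones of the tree in reading order of the cuts."""
    if not zone.children:
        return [zone]
    return [leaf for child in zone.children for leaf in leaves(child)]


# A title line above two side-by-side columns.
page = [(0, 0, 100, 10), (0, 30, 45, 100), (55, 30, 100, 100)]
root = xy_cut(page, min_gap=10.0)
```

In this example the first cut separates the title from the body, and a second cut separates the two columns, so the resulting tree has three leaves. Trying the y axis before the x axis is a design choice of the sketch only.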

With regards to claim 3, which depends on claim 2, Duan et al discloses wherein grouping the identified visual elements into visual objects comprises grouping text elements and grouping image elements (Duan et al, Fig. 1: text sections such as 102-2 and 102-3 are separate from 102-4, which could be an image; paragraph 30: “Another content portion 102-4 may be provided that includes displayed/selectable image, video, audio, and/or certain interactive content/links, which may or may not be associated with article text portion 102-3” Note: The claim does not require that the image elements and text elements are grouped separately, just that they both are grouped).


With regards to claim 5, which depends on claim 2, Duan et al discloses wherein identifying the one or more graphical objects contained in the untagged …document is performed in the same manner by which the one or more graphical objects contained in the tagged… document are identified (Duan, paragraph 45: “In certain implementations, for example, one may employ a layout and Document Object Model (DOM) based process which may use a rule based process that starts with a single segmented portion including all DOM nodes, and divides segmented portions (portions) recursively into smaller segments until a desired size is reached. The desired size may be determined, for example, by rules based, at least in part, on HTML tags in a segmented portion and/or a size of a segment relative to the web page and the other segmented portions.” The segmentation of the document is performed using a rule based process instead of the machine learned process, thus is the same during and after training).
However, Duan et al does not disclose PDF documents or generating a cluster tree by performing a plurality of cuts which segment the … document into multiple visually separated pieces. Le et al teaches PDF documents… (Le et al, Introduction, paragraph 5: “we choose to convert them into HTML files by using an open source library, PDFTOHTML, [18] and then analyze the PDF-converted-HTML files”) and generating a cluster tree by performing a plurality of cuts which segment the … document into multiple visually separated pieces (Le et al, abstract: “The Web page content is modeled by a zone tree structure based primarily on the geometric layout of the Web page. For a given journal article, a zone tree is generated by combining DOM tree analysis and recursive X-Y cut algorithm.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al and Le et al such that the documents can include PDFs. This would have helped to standardize the input system since many documents are in PDF format (Le et al, Introduction, paragraph 5: “Many online journal articles are in PDF format. In order to standardize our input system and to minimize the number of modules for reading articles, we choose to convert them into HTML files”). It also would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Le such that the graphical objects were reorganized into a zone tree (cluster tree). This would have enabled the invention to better organize and later retrieve the information using the layout information (Le, Section 3, paragraph 9: “We believe that this zone tree model is better for organizing the related information within a document, and therefore better for information retrieval compared to the DOM tree model.”).


With regards to claim 6, which depends on claim 2, Duan et al discloses wherein a size of the visual bounding box is different than an extent of the visual object (Duan, paragraph 87: “whole displayed web page where all other segmented portions are considered as one segmented portion.” Also, Fig. 1: dotted line around the entire display is larger than individual visual objects within the page). Note: This claim does not specify that the visual bounding box and the visual object are a pair of corresponding box/object as mentioned in claim 2, so it can be interpreted with any bounding box and any visual object.

With regards to claim 7, which depends on claim 2, Duan et al does not disclose pruning the cluster tree by recombining a plurality of leaves of the cluster tree.
However, Le et al teaches pruning the cluster tree by recombining a plurality of leaves of the cluster tree (Le et al, 4.3.2 Collect leaf zones inside a zone: “If there are no overlapping children zones detected within a zone, we collect a set of leaf zones inside this zone at this level… There is no space between the consecutive inline nodes and they should be naturally merged together to form a leaf zone”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Le such that there is some adjustment to the leaf nodes of the cluster tree. This would have allowed Le to continue to apply the X-Y cut algorithm onto the merged zones (Le, 4.3.3: “After the leaf zones are collected, the traditional recursive X-Y cut algorithm is applied to build the sub-zone-tree for the zone.”).
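For illustration, the merging of consecutive inline nodes that have no space between them, as quoted from Le et al section 4.3.2 above, can be sketched as a simple interval merge. This is only a sketch under the assumption that each inline node is reduced to a (start, end) span along the line direction; the function name merge_adjacent and the max_gap parameter are hypothetical and do not come from Le et al or the claims.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end) of a node along the inline direction.
Span = Tuple[float, float]


def merge_adjacent(spans: List[Span], max_gap: float = 0.0) -> List[Span]:
    """Merge consecutive spans whose separation is at most max_gap into one leaf span."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] <= max_gap:
            # No (or negligible) space from the previous span: recombine them.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two touching inline nodes followed by one that is set apart.
leaf_spans = merge_adjacent([(0, 10), (10, 20), (25, 30)])
```

Here the first two spans touch and are recombined into a single leaf, while the third stays separate because of the gap before it.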

With regards to claim 8, which depends on claim 1, Duan et al discloses identifying groups of visual objects that are closely positioned visually in the document; identifying elements of the DOM structure that correspond to the identified groups of visual objects and associating the identified elements of the DOM structure with the corresponding groups of visual objects (Duan et al, paragraph 32: “As further illustrated in FIG. 2B, in certain example implementations two or more initially segmented portions having the same, similar and/or otherwise specified relationship, may be combined and/or otherwise associated together to form a single segmented portion that may be classified by common or resulting type(s)”); and identifying region segment features… storing indications of neighboring region segment features in the training records (Duan et al, paragraph 32: “two or more initially segmented portions having the same, similar and/or otherwise specified relationship, may be combined and/or otherwise associated together”).
However, Duan et al does not disclose PDF document… region segment features which are leaves on the cluster tree.
Le et al teaches PDF document (Le et al, Introduction, paragraph 5: “we choose to convert them into HTML files by using an open source library, PDFTOHTML, [18] and then analyze the PDF-converted-HTML files”) … region segment features which are leaves on the cluster tree (Le et al, abstract: “The Web page content is modeled by a zone tree structure based primarily on the geometric layout of the Web page. For a given journal article, a zone tree is generated by combining DOM tree analysis and recursive X-Y cut algorithm.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al and Le et al such that the documents can include PDFs. This would have helped to standardize the input system since many documents are in PDF format (Le et al, Introduction, paragraph 5: “Many online journal articles are in PDF format. In order to standardize our input system and to minimize the number of modules for reading articles, we choose to convert them into HTML files”). It also would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Le such that the graphical objects were reorganized into a zone tree (cluster tree). This would have enabled the invention to better organize and later retrieve the information using the layout information (Le, Section 3, paragraph 9: “We believe that this zone tree model is better for organizing the related information within a document, and therefore better for information retrieval compared to the DOM tree model.”).


Claims 11-14 recite substantially similar limitations to claims 1, 2, 5, and 8 respectively and are thus rejected along the same rationales.
Claims 16-19 recite substantially similar limitations to claims 1, 2, 5, and 8 respectively and are thus rejected along the same rationales.




Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Duan et al in view of Le et al and Rajan et al, and further in view of Wang (US 5588072 A; published 12/24/1996).


With regards to claim 4, which depends on claim 3, Duan et al discloses wherein the text elements comprise text characters and the text characters are grouped based on text size, font, position (Duan et al, paragraph 49: “For example, some content features may be identified from a rendered web page such as, e.g., a font size, weight, style, and/or color, as well as an image size if a segmented portion contains an image.” Duan et al, paragraph 49: “By way of example but not limitation, feature properties may include various layout features such as, e.g., measurements and/or relative measurements for segmented portions upon rendering a web page, absolute size and/or position of segmented portions, a size of a segmented portion relative to the web page and/or a relative position of a segmented portion with respect to the web page as well as a “visible fold”, and/or other like layout characteristics if interest”).
However, Duan et al does not disclose grouping text characters based on direction.
Yet, Wang teaches grouping text characters based on direction (Wang, abstract: “searching the document for visible and invisible lines along edges of the non-text components, forming irregularly-shaped text and non-text blocks using the identified text components and the visible and invisible lines, detecting the text orientation for each formed text block, extracting text lines from the text block based on the detected orientation, detecting the skew angle for the stored document based on the extracted lines, and modifying the formed text and non-text blocks based on the detected skew angle”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Wang such that the segmentation could account for the text orientation or direction for each text block. This would have enabled the invention to work with bi-directional languages such as Japanese, and to detect vertical words in English (Wang, paragraph 12: “While the above-described block selection technique may be appropriate for horizontal documents, (e.g., English-language documents) it is possible for a page to contain both horizontal and vertical text blocks (bi-directional) For example, a Japanese document may contain vertical Kanji characters in combination with horizontal characters such as tables and figure legends. Also, certain English documents include vertically-extending characters in order to highlight certain information or to provide some desired effect.”).



Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Duan et al in view of Le et al and Rajan et al, and further in view of Filimonova (US 20160307067 A1; filed 6/29/2016).

With regards to claim 9, which depends on claim 1, Duan et al does not disclose generating a plurality of training files, wherein each of the training files contains one or more of the generated training records, each training file corresponding to a distinct classifier of the machine learning model.
Filimonova teaches generating a plurality of training files, wherein each of the training files contains one or more of the generated training records, each training file corresponding to a distinct classifier of the machine learning model (Filimonova, paragraph 34: “In some implementations of the method, the first MLA classifier has been trained on a first set of training objects, the second MLA classifier has been trained on a second set of training objects, the third MLA classifier has been trained on a third set of training objects; and the fourth MLA has been trained on a fourth set of training objects.” Paragraph 134: “The first training object 302 includes a first training digital document 304;” Paragraph 135: “The second training object 308 includes a second training digital document 310”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Filimonova such that individual classifiers which each make certain types of classifications are trained using their own training sets. This would have allowed the invention to train each classifier independently such that each classifier can identify a particular type of object (Filimonova, paragraph 19: “Each of the MLA classifiers is trained to identify a particular document type using a particular set of document features”).


Claims 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Duan et al in view of Le et al and Rajan et al, and further in view of Filimonova and Miranda (US 20170278015 A1; filed 3/23/2017).

With regards to claim 15, which depends on claim 11, Duan et al does not disclose generating a plurality of training files, wherein each of the training files contains one or more of the generated training records, each training file corresponding to a distinct classifier of the machine learning model, wherein the classifiers include one or more of: a text separator classifier; a cluster cut classifier; a cluster join classifier; a layout features classifier; a table cluster join classifier; a complex table cell classifier; and a region segment.
Filimonova teaches generating a plurality of training files, wherein each of the training files contains one or more of the generated training records, each training file corresponding to a distinct classifier of the machine learning model (Filimonova, paragraph 34: “In some implementations of the method, the first MLA classifier has been trained on a first set of training objects, the second MLA classifier has been trained on a second set of training objects, the third MLA classifier has been trained on a third set of training objects; and the fourth MLA has been trained on a fourth set of training objects.” Paragraph 134: “The first training object 302 includes a first training digital document 304;” Paragraph 135: “The second training object 308 includes a second training digital document 310”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan and Filimonova such that individual classifiers which each make certain types of classifications are trained using their own training sets. This would have allowed the invention to train each classifier independently such that each classifier can identify a particular type of object (Filimonova, paragraph 19: “Each of the MLA classifiers is trained to identify a particular document type using a particular set of document features”).
Miranda et al teaches wherein the classifiers include one or more of: a text separator classifier; a cluster cut classifier; a cluster join classifier; a layout features classifier (Miranda et al, see all of paragraph 32, in particular: “the text classifier 125 may determine the datafield 118 based on the layout of the log description 114. For example, the text classifier 125 may determine that text at a particular position qualifies as the datafield 118.” Fig. 1B: Datafield Classification 128 working alongside a second classifier); a table cluster join classifier; a complex table cell classifier; and a region segment.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Duan et al, Filimonova, and Miranda et al such that the multiple classifiers include a classifier that uses the layout features to determine how to classify a portion of the document. This would have helped the invention to “meaningfully identify and classify the electronic data in a reasonable amount of time” (Miranda et al, paragraph 3).

Claim 20 recites substantially similar limitations to claim 15 and is thus rejected along the same rationales.



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Zunger (US 20110173527 A1): Teaches determining layout regions of a webpage and annotating the document to include attributes that include geometric parameters of semantic elements.
Xie et al (US 20060123042 A1): Teaches splitting up a webpage using the layout of the webpage, then splitting up the DOM tree according to the corresponding groups.
Mansfield (US 9959259): Teaches creating a tree from an unstructured document.
Ravikumar et al (US 20090248608 A1): Teaches training a machine learning system to segment a webpage into visual pieces, then creating a graph of the web page where the graph nodes are associated with DOM tree nodes.
Taylor et al (US 5848184 A): Analyzes a document using its graphical properties and identifies the background and foreground properties of segments.
Aggarwal et al (US 11003862 B2): Teaches taking digital documents and tagging structural features using machine learning. 
Feng et al (US 20100312728 A1): Teaches splitting up a document into visual boundary blocks using DOM tree paths.
Seabright (US 20160070677 A1): Teaches tagging a document to help visually impaired individuals.
Bahrami (US 20180260389 A1): Teaches finding relationships between elements in a PDF and grouping them into segments.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRODERICK C ANDERSON whose telephone number is (313)446-6566. The examiner can normally be reached Monday-Friday 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached at 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/B.C.A/Examiner, Art Unit 2178

/HOPE C SHEFFIELD/Primary Examiner, Art Unit 2178