DETAILED ACTION
This correspondence is responsive to the Application filed on April 14, 2020. Claims 1-25 are pending in the case, with claims 1, 11, 12, 20 and 21 in independent form.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 4-8, 14-18 and 24-25 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 9, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Marda et al. (Pub. No. US 2019/0019022 A1, published Jan. 17, 2019) hereinafter Marda in view of Watanabe et al. (Pub. No. US 2018/0349693 A1, published Dec. 6, 2018) hereinafter Watanabe.

Regarding claim 1, Marda teaches:
A method, comprising: receiving two documents (i.e., Marda teaches in Figure 4 and corresponding description that two documents are received, a physical document in the field of view and a corresponding electronic document. Marda, Fig. 4, paragraphs 41-47), one of the two documents having at least one table that includes the same information as a corresponding table in the other of the two documents (i.e., Marda teaches in Figures 2A-2D and corresponding description that the physical document and electronic document have at least one corresponding table with the same information. Marda, Figs 2A-2D, paragraphs 41-47), wherein (i) one of the two documents comprises the at least one table in an unstructured table representation (i.e., Marda teaches in Figure 2C and corresponding description that the physical document contains at least one table in an unstructured table image representation. Marda, Fig. 2C,  paragraphs 44-47) and (ii) the other of the two documents comprises the at least one table in a structured table representation (i.e., Marda teaches in Figure 2D and corresponding description, that the electronic document corresponding to the unstructured document contains a corresponding table in a structured table representation. Marda, Fig. 2D, paragraphs 44-47); 
identifying text elements within the at least one table in the unstructured table representation (i.e., Marda teaches in Figure 5 and corresponding description, that text elements within the table in the unstructured physical document image are identified. Marda Fig 5, paragraphs 109-110.);  
matching the identified text elements with table elements within the at least one table in the structured table representation (i.e., Marda teaches in Figure 5 and corresponding description, that a physical document is identified and the corresponding, matching table elements of the structured electronic document are identified. Marda, Figure 5, paragraphs 109-110); and
generating an annotated version of the at least one table in the structured table representation by annotating the at least one table in the structured table representation based upon matches between the table elements and the identified text elements (i.e., Marda teaches in Figures 2C-2D and corresponding description, that annotations entered into the physical document are processed and merged with the electronic document based upon locations identified by shapes within the document. Marda, Figures 2C-2D; paragraphs 44-47), wherein the annotating comprises adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element within the at least one table in the unstructured table representation.  
As discussed above, Marda teaches the matching corresponding text element within the at least one table in the unstructured table representation and the annotated version of the at least one table in the structured table representation. Marda does not specifically disclose wherein the annotations comprise adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element.
However, Watanabe teaches in the field related to computer, method, and system for identifying a document (paragraph 0002). Watanabe teaches a structured template table representation that is tagged with attributes and positional information and coordinates corresponding to the attribute position on the unstructured paper form image (adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element). Watanabe discloses that, The attribute 302 is a field for storing an identification name indicating a type of attribute included in the template. The positional information 303 is a field for storing information on a position on a paper surface of an attribute corresponding to the type of attribute. For example, the positional information 303 stores coordinates of the top right and top left of a rectangular region. The coordinates may be relative coordinates, or may be absolute coordinates. The positional information 303 may also store information for designating a plurality of positions. Watanabe, Figure 3, 8, paragraphs 57-60.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using tags added to the at least one table in the structured table representation that identify a location of the corresponding text element of Watanabe, with a reasonable expectation of success, in order to enable extracting text attributes from unstructured physical paper forms with high accuracy and improving efficiency of work and reducing cost. Watanabe, para 12-13.
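For illustration only, the matching and tagging limitations of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and does not characterize the actual implementation of Marda, Watanabe, or the present application. The names TextElement, TableCell, annotate, and the (x0, y0, x1, y1) bounding-box layout are hypothetical, and exact matching on normalized text is assumed for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class TextElement:
    """Text identified in the unstructured table (hypothetical type)."""
    text: str
    bbox: tuple  # assumed layout: (x0, y0, x1, y1) in image coordinates


@dataclass
class TableCell:
    """Table element of the structured table (hypothetical type)."""
    text: str
    tags: list = field(default_factory=list)  # added location annotations


def annotate(cells, elements):
    """Tag each structured cell with the location of a matching text element."""
    used = set()  # each identified text element is matched at most once
    for cell in cells:
        wanted = cell.text.strip().lower()
        for i, el in enumerate(elements):
            if i not in used and el.text.strip().lower() == wanted:
                cell.tags.append({"bbox": el.bbox})
                used.add(i)
                break
    return cells


cells = [TableCell("Revenue"), TableCell("100")]
elements = [
    TextElement("100", (120, 40, 150, 52)),
    TextElement("revenue ", (10, 40, 80, 52)),
]
annotated = annotate(cells, elements)
```

In this sketch each tag carries a coordinate location, which corresponds to the kind of positional information discussed with respect to Watanabe above. A cell with no matching text element is simply left untagged.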

Regarding claim 2, which depends from claim 1 and recites:
wherein the identifying comprises converting the unstructured table representation to machine text.
Marda in view of Watanabe teaches the method of claim 1, including identifying text elements within the at least one table in the unstructured table representation. Marda does not specifically disclose converting the unstructured representation to machine text.
However, Watanabe teaches that, First, the document examination module 211 executes the OCR processing on the document image data 701 (Step S201). The OCR processing is described in detail with reference to FIG. 11. By the OCR processing, the document image data 701 is converted into data on a character string group that can be handled by a computer (converting the unstructured representation to machine text). Watanabe, Fig 11, para 97, 127
It would have been obvious to one of ordinary skill in the art before the effective filing date of present application to implement the method of synching digital and physical documents of Marda using the feature for converting the unstructured representation to machine text of Watanabe, with a reasonable expectation of success, in order to enable extracting and handling text strings and attributes from unstructured physical paper forms. Watanabe, para 12-13, 97. 

Regarding claim 9, which depends from claim 1 and recites:
wherein the location comprises a coordinate location.  
Marda in view of Watanabe teaches the method of claim 1, including identifying text elements within the at least one table in the unstructured table representation. Marda does not specifically disclose that the location comprises a coordinate location.
However, Watanabe teaches that, The positional information 303 is a field for storing information on a position on a paper surface of an attribute corresponding to the type of attribute. For example, the positional information 303 stores coordinates of the top right and top left of a rectangular region (location comprises a coordinate location). The coordinates may be relative coordinates, or may be absolute coordinates. The positional information 303 may also store information for designating a plurality of positions. Watanabe, para 59.
It would have been obvious to one of ordinary skill in the art before the effective filing date of present application to implement the method of synching digital and physical documents of Marda using tags added to the at least one table in the structured table representation that identify a location of the corresponding text element, the location comprises a coordinate location of Watanabe, with a reasonable expectation of success, in order to enable extracting text attributes from unstructured physical paper forms with high accuracy and improving efficiency of work and reducing cost. Watanabe, para 12-13. 

Claim 11 recites an apparatus that substantially parallels the method of claim 1. Therefore, the analysis discussed above with respect to claim 1, also applies to claim 11. Accordingly, claim 11 is rejected under substantially the same rationale as set forth above with respect to claim 1. More specifically regarding An apparatus, comprising: at least one processor; and a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured (i.e., Marda, Figures 6, 7, paragraphs 123-125, 135-136, 143-144).

Claim 12 recites a computer program product that substantially parallels the method of claim 1. Therefore, the analysis discussed above with respect to claim 1, also applies to claim 12. Accordingly, claim 12 is rejected under substantially the same rationale as set forth above with respect to claim 1. More specifically regarding A computer program product, comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by a processor and comprising: computer readable program code configured (i.e., Marda, Figures 6, 7, paragraphs 123-125, 135-136, 143-144).


Claims 3, 10, 13 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Marda in view of Watanabe as applied to claim 1 above, and further in view of Duta (Pub. No. US 2019/0340240 A1, published Nov. 7, 2019).

Regarding claim 3, which depends from claim 1 and recites:
comprising sorting the identified text elements based upon a reading direction, wherein the reading direction comprises text elements from left to right and from top to bottom.  
Marda in view of Watanabe teaches the method of claim 1 from which claim 3 depends, including identifying text elements within the at least one table in the unstructured table representation. Marda teaches mapping the identified elements to the corresponding electronic document based on reading direction left to right and top to bottom. Figure 5, paragraphs 0109-0110. Marda does not specifically disclose sorting.
However, Duta teaches in the field related to text and image based electronic documents containing tables. Duta, abstract, para 1. Duta teaches that, The Table Extractor tracks the characters in each token, each token's page, physical position, and optionally the corresponding fonts, font sizes, and start/end indices in the original sorted character list. In addition, the Table Extractor sorts these tokens from top to bottom and left to right with respect to the corresponding page of the arbitrary document from which the tokens were generated (sorting the identified text elements based upon a reading direction, wherein the reading direction comprises text elements from left to right and from top to bottom). Duta, para 75, 95.
It would have been obvious to one of ordinary skill in the art before the effective filing date of present application to implement the method of synching digital and physical documents of Marda using tags added to the structured table representation that identify a location of the corresponding text element of Watanabe and sorting the identified text elements based upon a reading direction, wherein the reading direction comprises text elements from left to right and from top to bottom of Duta, with a reasonable expectation of success, in order to enable extracting text attributes from unstructured physical paper forms with high accuracy and improving efficiency of work and reducing cost and to facilitate natural language processing on semantic relationships between elements in tables. Watanabe, para 12-13. Duta, abstract, para 1, 75, 95.
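For illustration only, sorting identified text elements in a left-to-right, top-to-bottom reading direction can be sketched as follows. This is a minimal sketch and not Duta's Table Extractor. The function name sort_reading_order, the (text, x, y) tuple layout with y increasing downward, and the line_tolerance parameter are assumptions introduced for the example.

```python
def sort_reading_order(elements, line_tolerance=5):
    """Sort (text, x, y) tuples top to bottom, then left to right.

    Elements whose y values lie within line_tolerance of the first element
    of a line are treated as belonging to that line (an assumed heuristic).
    """
    ordered = sorted(elements, key=lambda e: (e[2], e[1]))
    lines = []
    for el in ordered:
        if lines and abs(el[2] - lines[-1][0][2]) <= line_tolerance:
            lines[-1].append(el)  # same visual line
        else:
            lines.append([el])    # start a new line further down the page
    # Within each line, order strictly left to right.
    return [el for line in lines for el in sorted(line, key=lambda e: e[1])]


elements = [("B", 100, 9), ("C", 10, 40), ("A", 10, 10), ("D", 100, 42)]
reading_order = [text for text, _, _ in sort_reading_order(elements)]
```

The tolerance is used because text on one visual line rarely shares an identical vertical coordinate after scanning. Sorting on raw (y, x) alone would place "B" before "A" in the example above.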

Regarding claim 10, which depends from claim 1 and recites:
comprising utilizing the annotated version within at least one of a testing dataset for a machine-learning model and a training dataset for a machine-learning model.  
Marda in view of Watanabe teaches the method of claim 1 from which claim 10 depends, including the annotated version. Marda does not specifically disclose utilizing the annotated version within at least one of a testing dataset for a machine-learning model and a training dataset for a machine-learning model.  
However, Duta teaches that, the original arbitrary documents received as input, in combination with some or all of the various information generated by the Table Extractor during the table identification process (e.g., character alignments and tokenization, token alignments, generation of table candidates, identification of row and/or column headers, generation of tuples, etc.) are provided as automatically generated labeled training examples for use with a variety of machine-learning processes (utilizing the annotated version within at least one of a testing dataset for a machine-learning model and a training dataset for a machine-learning model). In general, these labeled training examples, optionally in combination with one or more hand-authored training examples, are provided as input to various machine-learning processes to learn one or more table extraction models or networks. In various implementations, the Table Extractor then applies one or more of these machine-learned models or networks to automate some or all of the steps for delimiting and extracting tables from arbitrary documents, and for generating text-based relational functions (e.g., the aforementioned tuples) on those tables. These text-based relational functions are then optionally processed via NLP-based techniques to extract semantic information for answering queries on the extracted tables. Duta, para 9, 141.
It would have been obvious to one of ordinary skill in the art before the effective filing date of present application to implement the method of synching digital and physical documents of Marda using tags added to the structured table representation that identify a location of the corresponding text element of Watanabe and utilizing the annotated version within at least one of a testing dataset for a machine-learning model and a training dataset for a machine-learning model of Duta, with a reasonable expectation of success, in order to enable extracting text attributes from unstructured physical paper forms with high accuracy and improving efficiency of work and reducing cost and to facilitate natural language processing on semantic relationships between elements in tables. Watanabe, para 12-13. Duta, abstract, para 1, 75, 95.

Claims 13 and 19 recite computer program products that substantially parallel the methods of claims 3 and 10, respectively. Therefore, the analysis discussed above with respect to claims 3 and 10 also applies to claims 13 and 19, respectively. Accordingly, claims 13 and 19 are rejected under substantially the same rationale as set forth above with respect to claims 3 and 10, respectively.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Marda in view of Watanabe and Emmanuel et al. (Pub. No. US 2017/0091162 A1, published Mar. 30, 2017, issued patent cited on Applicant’s IDS dated 4/14/2020) hereinafter Emmanuel.

Regarding claim 20, Marda teaches:
A method, comprising: receiving two representations (i.e., Marda teaches in Figure 4 and corresponding description that a physical document representation in the field of view and a corresponding electronic document representation are received. Marda, Fig. 4, paragraphs 41-47) of at least one table, wherein one of the two representations comprises a non-native table representation (i.e., Marda teaches in Figure 2C and corresponding description that the image document contains an image with at least one table (non-native table representation). Marda, Fig. 2C, paragraphs 44-47) and wherein the other of two representations comprises a native table representation (i.e., Marda teaches in Figure 2D and corresponding description, that the electronic document corresponding to the unstructured document contains a corresponding table (native table representation). Marda, Fig. 2D, paragraphs 44-47), wherein the non-native table representation comprises a table in an unstructured format (i.e., Marda teaches in Figure 2C and corresponding description that the image document contains an image with at least one table (non-native unstructured format table representation). Marda, Fig. 2C, paragraphs 44-47) and wherein the native table representation comprises a table in a structured format (i.e., Marda teaches in Figure 2D and corresponding description, that the electronic document corresponding to the unstructured document contains a corresponding table (native structured format table representation). Marda, Fig. 2D, paragraphs 44-47); 
identifying text tokens comprising text within the non-native table representation;
Marda teaches identifying text elements comprising text within the non-native table representation, in that Marda illustrates and discloses in Figure 5 and corresponding description that text elements within the table in the physical document are identified. Marda, Fig 5, paragraphs 109-110. Marda does not specifically disclose text tokens.
However, Emmanuel teaches in the field related to annotating embedded tables for text analytics. Text analytics systems extract free text from a whole range of different document formats (e.g., plain text, Word, PDF). Emmanuel, paragraphs 1-2. FIG. 4 is an example of a matrix which implements tokenization and corresponding lines, in accordance with an embodiment of the present invention. Emmanuel, Fig 4, paragraph 9, 34. Algorithms are used by analytics module 115 to determine the readability of text amenable for human detection, positioning of text, tokenizing text (text and table tokens), and validation of text to properly annotate the text. Emmanuel, Fig 4, paragraphs 20, 30-34.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the text and table tokens of Emmanuel, with a reasonable expectation of success, in order to correctly and consistently interpret the source text data for further processing. Emmanuel, paragraph 14, 32.
 matching the identified elements with table tokens of the table within the native table representation, wherein the matching comprises identifying text tokens having text matching table tokens; and 
Marda teaches matching the identified text elements with table elements within the at least one table in the structured table representation, in that Marda discloses in Figure 5 and corresponding description that a physical document is identified and the table elements of a corresponding electronic document are identified. Marda, Figure 5, paragraphs 109-110. As discussed above, Marda in view of Emmanuel teaches identified text and table tokens.
adding tags to the table tokens within the native table representation, wherein a given tag identifies a location of a text token within the non-native table representation corresponding to the table token having the given tag. 
Marda teaches generating an annotated version of the at least one table in the structured table representation by annotating the at least one table in the structured table representation based upon matches between the table elements and the identified text elements, in that Marda discloses in Figures 2C-2D and corresponding description that annotations entered into the physical document are processed and merged with the electronic document based upon locations identified by shapes within the document. Marda, Figures 2C-2D; paragraphs 44-47. As discussed above, Marda in view of Emmanuel teaches identified text and table tokens.

Marda in view of Emmanuel does not specifically disclose adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element.
However, Watanabe teaches in the field related to computer, method, and system for identifying a document (paragraph 0002). Watanabe teaches a structured template table representation that is tagged with attributes and positional information and coordinates corresponding to the attribute position on the unstructured paper form image (adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element). Watanabe discloses that, The attribute 302 is a field for storing an identification name indicating a type of attribute included in the template. The positional information 303 is a field for storing information on a position on a paper surface of an attribute corresponding to the type of attribute. For example, the positional information 303 stores coordinates of the top right and top left of a rectangular region. The coordinates may be relative coordinates, or may be absolute coordinates. The positional information 303 may also store information for designating a plurality of positions. Watanabe, Figure 3, 8, paragraphs 57-60.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the text and table tokens of Emmanuel and adding tags to the at least one table in the structured table representation that identify a location of the corresponding text element of Watanabe, with a reasonable expectation of success, in order to correctly and consistently interpret the source text data for further processing and to enable extracting text attributes from unstructured physical paper forms with high accuracy and improving efficiency of work and reducing cost. Emmanuel, paragraph 14, 32. Watanabe, para 12-13.

Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Marda in view of Blanchflower et al. (Pub. No. US 2014/0006369 B1, published Jan. 2, 2014) hereinafter Blanchflower and Koduru (Pub. No. US 2016/0055376 A1, published Feb. 25, 2016).

Regarding claim 21, Marda teaches:
A computer-implemented method for use with a structured document A and an unstructured document B (i.e., Marda teaches in Figure 4 and corresponding description that a physical document (unstructured document B) in the field of view and a corresponding electronic document (structured document A) are received. Marda, Fig. 4,6, 7, paragraphs 41-47), the method comprising: 
(a) finding content in document B that is similar to a table T1 in document A, by identifying content that is common to both documents A and B, wherein the table T1 includes at least two cells (i.e., Marda teaches in Figures 2A-2D and corresponding description that the physical document (common table content in unstructured document B) and electronic document (common table T1 in structured document A) have at least two corresponding table cells with the same information. Marda, Figs 2A-2D, paragraphs 41-47); 
(b) inferring the presence of a table T2 in document B, in view of the common content (i.e., Marda teaches in Figure 5 and corresponding description, that text elements within the table in the physical document (table T2 in document B) are identified and corresponding electronic document table elements (table T1 in document A) are identified.  Marda identifies the common table cells content, infers the presence of the tables in the documents and that annotations entered into the physical document are to be processed and merged with the electronic document based upon locations identified by shapes within the document. Marda, Figures 2C-2D; paragraphs 44-47. Marda Fig 5, 2A-2D, paragraphs 109-110, 41-47); 
(c) for each cell in table T1, identifying content in table T2 that corresponds to content in that cell in table T1, by matching text snippets from tables T1 and T2 in view of a similarity threshold, thereby identifying one or more cells within T2; and
As discussed above, Marda teaches matching content in  tables T1 and T2 cells. Marda does not specifically disclose by matching text snippets in view of a similarity threshold.
However, Blanchflower teaches in the field related to data management systems including unstructured data and structured data. Blanchflower, paragraphs 1, 2. Blanchflower teaches that, Correlating structured data and unstructured data can refer to determining correlative patterns in the structured data and the unstructured data. Blanchflower, Fig 2, paragraphs 11, 19-20. In some implementations, the determination of correlative patterns (referred to as "correlating" or "correlation" in this discussion) includes finding a first pattern in the structured data and a second pattern in the unstructured data, and determining a degree of similarity between the first and second patterns. Generally, patterns found in different data collections may not match exactly. As a result, techniques or mechanisms according to some implementations determine degrees of similarity based on how close (conceptually) the patterns are to each other conceptually (matching text snippets in view of a similarity threshold). For example, consider the phrase "low-drag wing design expert" as compared to "high-efficiency aerofoil designer." These words do not match exactly, but they express similar ideas (matching text snippets in view of a similarity threshold). Techniques or mechanisms according to some implementations can thus determine conceptual distances between different patterns, such as the text strings above. Blanchflower, Fig 2, paragraphs 19, 20, 11.
It would have been obvious to one of ordinary skill in the art before the effective filing date of present application to implement the method of synching digital and physical documents of Marda using the matching text snippets in view of a similarity threshold feature of Blanchflower, with a reasonable expectation of success, in order to correlate structured data with unstructured data, which allows for access and analytics to be performed with respect to the structured and unstructured data in a more integrated manner. Blanchflower, abstract, Fig 2, paragraph 11. 
(d) determining the boundary of table T2, by aggregating the cells identified in T2 and then identifying the extrema of the aggregated cells; 
As discussed above, Marda in view of Blanchflower teaches the tables T1 and T2 and the identified cells. Marda in view of Blanchflower does not specifically disclose determining the boundary of table, by aggregating the cells and then identifying the extrema of the aggregated cells.
However, Koduru teaches in the field related to document management systems and methods and particularly relates to a method and system for extracting structured data in electronic documents using Optical Character Recognition (OCR). Koduru, paragraph 1. Koduru teaches that, The plurality of documents 101 is in the form of either one or more physical sheets of paper, or a digital file containing images of one or more sheets of paper. The digital file can be in one of many formats, such as PDF, TIFF, BMP, or JPEG. The system employs image processing techniques on the document to segment the document image and to isolate potential content areas. The documents 101 are then provided to an OCR engine 102 which produces a text output. Further the OCR recognized text is inputted to the text extraction module 103, which extracts text from scanned documents with location on page data. The extracted text is then passed to a data processing module 105 through a user interface 104. The data processing module 105 is adapted for identifying tables in a page using patterns in text placement in rows and columns, identifying the boundaries and edges of tables using pattern recognition methods (determining the boundary of table, by aggregating the cells and then identifying the extrema of the aggregated cells (aggregated rows and columns pattern and identifying extrema pattern boundaries)) and identifying table borders using page information on location and defines a data structure for extraction after table borders, rows and columns are identified. Koduru, paragraph 37.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the matching text snippets in view of a similarity threshold feature of Blanchflower and determining the table boundary by aggregating the cells and identifying their extrema of Koduru, with a reasonable expectation of success, in order to correlate structured data with unstructured data, which allows for access and analytics to be performed with respect to the structured and unstructured data in a more integrated manner and to enable identifying and extracting content from various data forms with minimal manual intervention. Blanchflower, abstract, Fig 2, paragraph 11. Koduru, paragraph 5.
and (e) repeating steps (a) through (d) for additional tables in document A, thereby identifying tables and corresponding cells in document B.  
The identifying, matching, and determining steps taught by the combination of Marda in view of Blanchflower and Koduru need no alteration to apply to additional tables in document A and to tables and corresponding cells in document B. Simply repeating the identifying, matching, and determining of tables and corresponding cells is an insignificant step and would be obvious to anyone of ordinary skill in the art. Performing repetitive calculations and processes is the most basic function of a computing device and the performance of repetitive processes, steps or calculations does not impose meaningful limits on recitations of the claims.
One of ordinary skill in the art before the effective filing date of the present application would be motivated to repetitively apply the identifying, matching, and determining of tables and corresponding cells of Marda in view of Blanchflower and Koduru to additional tables in document A and to tables and corresponding cells in document B in order to integrate additional corresponding tables in unstructured and structured documents.
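For illustration only, steps (c) and (d) of claim 21 can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the method of any cited reference. In particular, it substitutes character-level similarity from Python's standard difflib module for the conceptual similarity described in Blanchflower. The names match_cells and table_boundary, the 0.8 threshold, and the (x0, y0, x1, y1) box layout are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1] (a stand-in similarity measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_cells(t1_cells, snippets, threshold=0.8):
    """Step (c): for each T1 cell, keep the best document B snippet that
    meets the similarity threshold. snippets is a list of (text, box)."""
    matched = []
    for cell_text in t1_cells:
        best = max(snippets, key=lambda s: similarity(cell_text, s[0]),
                   default=None)
        if best is not None and similarity(cell_text, best[0]) >= threshold:
            matched.append(best)
    return matched


def table_boundary(matched):
    """Step (d): aggregate the matched cell boxes and take their extrema."""
    if not matched:
        return None
    boxes = [box for _, box in matched]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


t1_cells = ["Revenue", "2019", "1,250"]
snippets = [
    ("Annual report", (0, 0, 200, 20)),  # surrounding text, not in the table
    ("Revenu", (10, 50, 80, 62)),        # imperfect OCR of "Revenue"
    ("2019", (100, 50, 140, 62)),
    ("1,250", (100, 70, 140, 82)),
]
matched = match_cells(t1_cells, snippets)
boundary = table_boundary(matched)
```

Step (e) would amount to calling the same two functions once per additional table in document A, with no change to either function.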

Regarding claim 22, which depends from claim 21 and recites:
wherein optical character recognition (OCR) is employed to extract text from the unstructured document B.  
Marda in view of Blanchflower and Korduru teaches the method of claim 21 from which claim 22 depends, including the unstructured document B. Marda does not specifically disclose that optical character recognition (OCR) is employed to extract text from the unstructured document. 
However, Korduru teaches that the digital file can be in one of many formats, such as PDF, TIFF, BMP, or JPEG. The system employs image processing techniques on the document to segment the document image and to isolate potential content areas. The documents 101 are then provided to an OCR engine 102, which produces a text output. Further, the OCR-recognized text is input to the text extraction module 103, which extracts text from scanned documents (optical character recognition (OCR) is employed to extract text from the unstructured document) with location-on-page data. The extracted text is then passed to a data processing module 105 through a user interface 104. The data processing module 105 is adapted for identifying tables in a page using patterns in text placement in rows and columns, identifying the boundaries and edges of tables using pattern recognition methods, and identifying table borders using page information on location, and it defines a data structure for extraction after table borders, rows and columns are identified. Further, the data extraction module 106 enables the user interface 104 for data extraction and validation. Korduru, paragraph 37.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the matching text snippets in view of a similarity threshold feature of Blanchflower and determining the table boundary by aggregating the cells and identifying their extrema and using optical character recognition (OCR) to extract text from the unstructured document of Korduru, with a reasonable expectation of success, in order to correlate structured data with unstructured data, which allows for access and analytics to be performed with respect to the structured and unstructured data in a more integrated manner and to enable identifying and extracting content from various data forms with minimal manual intervention. Blanchflower, abstract, Fig 2, paragraph 11. Korduru, paragraph 5.
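For illustration only, the idea of identifying table rows from patterns in text placement, using OCR text with location-on-page data, can be sketched as below. This is a minimal sketch under stated assumptions and is not Korduru's implementation: it assumes an OCR step has already produced words with coordinates, and the word layout (text, x, y), the function name, and the 5-unit vertical tolerance are hypothetical.

```python
def group_into_rows(words, y_tolerance=5):
    """Group OCR words into rows by vertical position, then order each
    row left to right. The word layout is hypothetical: {"text", "x", "y"}."""
    rows = []
    for word in sorted(words, key=lambda w: (w["y"], w["x"])):
        # A word joins the current row if it sits close to the previous word's y.
        if rows and abs(word["y"] - rows[-1][-1]["y"]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w["text"] for w in sorted(row, key=lambda w: w["x"])] for row in rows]


# Hypothetical OCR output for a two-row, two-column table.
ocr_words = [
    {"text": "Item", "x": 10, "y": 100},
    {"text": "Price", "x": 80, "y": 102},
    {"text": "Pen", "x": 10, "y": 120},
    {"text": "2.50", "x": 80, "y": 119},
]
table_rows = group_into_rows(ocr_words)
```

Rows recovered this way could then be compared by cell count or column alignment to decide whether a region is tabular; that step is left out of the sketch.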

Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Marda in view of Blanchflower and Korduru as applied to claim 21 above, and further in view of LaComb et al. (Pub. No. US 2004/0193520 A1, published Sep. 30, 2004, and cited on Applicant’s IDS dated 4/14/2020) hereinafter LaComb.

Regarding claim 23, which depends from claim 21 and recites:
wherein document A has an html format, and document B has a PDF format.  
Marda in view of Blanchflower and Korduru teaches the method of claim 21 from which claim 23 depends, including the structured document A and the unstructured document B. Marda does not specifically disclose an html format and a PDF format. 
However, Korduru teaches that the digital file can be in one of many formats, such as PDF (document has a PDF format), TIFF, BMP, or JPEG. Korduru, paragraph 37.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the matching text snippets in view of a similarity threshold feature of Blanchflower and determining the table boundary by aggregating the cells and identifying their extrema and using optical character recognition (OCR) to extract text from the unstructured document and the document that has a PDF format of Korduru, with a reasonable expectation of success, in order to correlate structured data with unstructured data, which allows for access and analytics to be performed with respect to the structured and unstructured data in a more integrated manner and to enable identifying and extracting content from various data forms with minimal manual intervention and in order to process documents in many formats. Blanchflower, abstract, Fig 2, paragraph 11. Korduru, paragraph 5.
Thus, Marda in view of Blanchflower and Korduru teaches the structured document A and the unstructured document B, with document B having a PDF format. Marda in view of Blanchflower and Korduru does not specifically disclose an html format.
However, LaComb is directed to systems and methods for automatically processing electronic documents and, more specifically, to systems and methods that automatically understand and decompose unstructured tabular information from ASCII-formatted documents. LaComb, paragraph 2. LaComb teaches that, although many embodiments described therein relate to electronic ASCII-formatted financial documents, many other types and formats of documents could be utilized. For example, the tabular documents could be formatted as Microsoft Office documents and/or spreadsheets, PDF files, Postscript files, HTML documents (document has an html format), or the like. LaComb, paragraph 16.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the method of synching digital and physical documents of Marda using the matching text snippets in view of a similarity threshold feature of Blanchflower and determining the table boundary by aggregating the cells and identifying their extrema and using optical character recognition (OCR) to extract text from the unstructured document and the document that has a PDF format of Korduru and the document that has an html format of LaComb, with a reasonable expectation of success, in order to correlate structured data with unstructured data, which allows for access and analytics to be performed with respect to the structured and unstructured data in a more integrated manner and to enable identifying and extracting content from various data forms with minimal manual intervention and in order to process documents in many formats, including PDF and html formats. Blanchflower, abstract, Fig 2, paragraph 11. Korduru, paragraph 5. LaComb, paragraph 16.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BARBARA M LEVEL whose telephone number is (303)297-4748. The examiner can normally be reached Monday through Friday 8:00 AM - 5:00 PM MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott T Baderman, can be reached on (571) 272-3644. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BARBARA M LEVEL/Examiner, Art Unit 2144