DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 12/30/2021 have been fully considered but they are not persuasive.
Applicant states (p. 8) that the cited art does not teach the amended limitation “derive a score as to changes required to derive a value of one cell to the value of the other cell, wherein a score is given as to the changes.” Examiner respectfully disagrees.
Hino calculates similarity between every pair of corresponding cells in the same column of two rows, by comparing them in character similarity [0058] and character type similarity [0066]. Hino calculates character similarity between a pair of string cells based on the standard Levenshtein distance (i.e., score) in information theory, which is the number of steps (i.e., changes) required to produce (i.e., derive) the second string by inserting, replacing, deleting, or adding a character to the first string [0059].
In summary, Hino teaches the amended limitation of independent claims 1, 7 and 13.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:


Claims 1-5, 7-11, and 13-18 are rejected under 35 U.S.C. 103 as being unpatentable over Hino et al. US patent application 2009/0313205 [herein “Hino”], in view of Wang et al. Understanding Tables on the Web. International Conference on Conceptual Modeling. 2012, pp 141-155 [herein “Wang”], and Ferrucci et al. Building Watson: An Overview of the DeepQA Project. AI Magazine, 2010, pp. 1-26 [herein “Ferrucci”], and further in view of Raschka. About Feature Scaling and Normalization – and the effect of standardization for machine learning algorithms. https://sebastianraschka.com/Articles/2014_about_feature_scaling.html, 2014, pp. 1-20 [herein “Raschka”].
Claim 1 recites “A computer-implemented method implemented by a knowledge management system for detecting headers in a document, comprising: receives input questions; parses major features of the input questions that are applied to formulate queries to a corpus of data provided by a content creator; wherein the input questions are received by the content creator from information handling systems;”
Hino teaches a method of processing documents by analyzing the structure of table data [0002], but does not disclose how to apply it in a question answering system; however, IBM Watson QA system (i.e., content creator) constructs a corpus of data, takes natural language questions from users, and deeply analyzes the breadth of relevant content from the corpus to more precisely answer and justify answers to user’s questions (Ferrucci: pp. 12/26).
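Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Ferrucci. One having ordinary skill in the art would have found motivation to incorporate Hino’s method in Ferrucci’s system to enable deep analysis of tabular data in the relevant content from the corpus of documents.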
Claim 1 further recites “the input questions directed to: performing pair wise comparisons between cells in each orthogonal column or row for each row or column in a two dimensional table that is part of the corpus of data to derive a score as to changes required to derive a value of one cell to the value of the other cell, wherein a score is given as to the changes, for each pair wise comparison;”
Hino calculates similarity between every pair of corresponding cells (i.e., pairwise) in the same column of two rows (i.e., orthogonal column), by comparing them in character similarity [0058] and character type similarity [0066]. Hino calculates character similarity between a pair of string cells based on the standard Levenshtein distance (i.e., score) in information theory, which is the number of steps (i.e., changes) required to produce (i.e., derive) the second string by inserting, replacing, deleting, or adding a character to the first string [0059].
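For illustration only, the standard Levenshtein distance referenced above can be sketched as follows. This is a minimal sketch of the textbook algorithm, not code taken from Hino; the function name `levenshtein` is an assumption chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or replacements
    needed to turn string a into string b (textbook formulation)."""
    # prev[j] is the distance between the prefix of a handled so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace ca with cb, or match
        prev = curr
    return prev[-1]
```

Under this sketch, `levenshtein("kitten", "sitting")` returns 3, i.e., three changes are required to derive the second string from the first.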
Hino does not disclose if this comparison is performed between all pairs of rows; however, Wang compares data types between every pair of corresponding cells in the same column of any two rows, and identifies all rows above a threshold as candidate headers (Wang: sec. 3.2, para. 2).
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Wang. One having ordinary skill in the art would have found motivation to use Hino’s similarity calculation to determine similarity between corresponding cells of all pairs of rows in Wang.
Claim 1 further recites “summing the scores of the pair wise comparisons for each orthogonal column or row to derive a summed score;” In Hino, overall similarity between every pair of corresponding cells in the same column of two rows is a weighted sum of character similarity and character type similarity ([0073]).
Claim 1 further recites “summing the summed scores of the orthogonal columns or rows to derive a score for each row or column,” Hino calculates series similarity between two rows as a weighted average (i.e., weighted sum divided by the length of rows) of overall similarity between all pairs of corresponding cells in the two rows ([0079]).
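The two summing steps mapped above (a weighted sum per cell pair, then a weighted average across the row) can be sketched as follows. This is an illustrative sketch only: the function names, the equal default weights `w_char` and `w_type`, and the unit per-column weights are assumptions and are not taken from Hino.

```python
from typing import Optional, Sequence

def overall_similarity(char_sim: float, type_sim: float,
                       w_char: float = 0.5, w_type: float = 0.5) -> float:
    """Weighted sum of character similarity and character type similarity
    for one pair of corresponding cells (weights are hypothetical)."""
    return w_char * char_sim + w_type * type_sim

def series_similarity(cell_sims: Sequence[float],
                      weights: Optional[Sequence[float]] = None) -> float:
    """Weighted sum of the per-cell overall similarities of two rows,
    divided by the row length."""
    if not cell_sims:
        raise ValueError("rows must contain at least one cell")
    if weights is None:
        weights = [1.0] * len(cell_sims)  # assumption: equal column weights
    return sum(w * s for w, s in zip(weights, cell_sims)) / len(cell_sims)
```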
Claim 1 further recites “wherein the scores are scaled using a min-max scaler to normalized values between 0 and 1; and”.
Hino calculates similarity between a pair of rows based on the standard Levenshtein distance in information theory [0059]. Hino does not disclose normalizing Levenshtein distance; however, feature scaling such as min-max scaling is a simple and standard data processing method to normalize data features to a fixed range of 0 to 1 (Raschka: pp. 3/20). One having ordinary skill in the art would have found motivation to utilize min-max scaling to normalize Hino’s similarity scores into metric values between 0 and 1.
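For illustration, min-max scaling maps each value x to (x - min) / (max - min). A minimal sketch follows; the function name `min_max_scale` and the handling of a constant input are assumptions made for this example, not details drawn from Raschka or Hino.

```python
from typing import List, Sequence

def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when all values are equal the range is zero,
        # so every value is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Under this sketch, `min_max_scale([2, 4, 6])` returns `[0.0, 0.5, 1.0]`.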
Claim 1 further recites “comparing normalized values of each row or column to determine the likelihood of headers.” Hino determines that a pair of rows contain the boundary between a header row and a data row if its series similarity is below a threshold – the two rows are significantly different ([0078]).
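The threshold test described above can be sketched as follows. This is illustrative only: the function name `find_header_boundaries` and the default threshold of 0.5 are hypothetical and do not appear in Hino.

```python
from typing import List, Sequence

def find_header_boundaries(series_sims: Sequence[float],
                           threshold: float = 0.5) -> List[int]:
    """series_sims[i] is the series similarity between row i and row i + 1.
    Return each index i whose similarity falls below the threshold, i.e.,
    where a header/data boundary is taken to lie between rows i and i + 1."""
    return [i for i, sim in enumerate(series_sims) if sim < threshold]
```

For example, with similarities `[0.2, 0.9, 0.8]` between consecutive rows, only the first pair falls below the threshold, so the sketch reports a boundary between row 0 and row 1.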
Claim 13 is analogous to claim 1, and is similarly rejected.

Claim 2 recites “The method of claim 1, wherein: the pair wise comparison is between a common cell in a row or column and an orthogonal column or row, and each of the other cells in the orthogonal column or row.”
Hino calculates similarity between every pair of corresponding cells (i.e., pairwise) in the same column of two rows (i.e., orthogonal column), by comparing them in character similarity ([0058]) and character type similarity ([0066]). Hino does not disclose if this comparison is performed between all pairs of rows; however, Wang compares data types between every pair of corresponding cells in the same column of any two rows, and identifies all rows above a threshold as candidate headers (Wang: sec. 3.2, para. 2).
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Wang. One having ordinary skill in the art would have found motivation to use Hino’s similarity calculation to determine similarity between corresponding cells of all pairs of rows in Wang.
Claim 17 is analogous to claim 2, and is similarly rejected.

Claim 3 recites “The method of claim 1, wherein: the pair wise comparison is a global distance comparison.” Hino’s character similarity between a pair of cells is computed using Levenshtein distance (i.e., a global distance comparison) ([0058], [0066]).

Claim 4 recites “The method of claim 1, wherein: the pair wise comparison uses global Levenshtein distance comparison.” Hino’s character similarity between a pair of cells is computed using Levenshtein distance ([0058], [0066]).
Claim 16 is analogous to claim 4, and is similarly rejected.

Claim 5 recites “The method of claim 1, wherein: the pair wise comparison is based on a feature chosen from a list that includes: data types, fonts styles, text similarity (i.e., edit distance), cell alignments, text indentation, font sizes, font colors, number of characters, percent symbolic characters, percent numeric characters, and percent date/time/year/address/area/money/percentage cells.” Hino’s character similarity (i.e., text similarity) between a pair of cells is computed using edit distance ([0058]).
Claims 11 and 15 are analogous to claim 5, and are similarly rejected.

Claim 7 recites “A knowledge manager system comprising: a processor; a data bus coupled to the processor; and a computer-usable medium embodying computer program code, the computer-usable medium being coupled to the data bus, the computer program code used for header detection and comprising instructions executable by the processor and configured for a knowledge manager device to perform the steps of: receiving input questions; parsing major features of the input questions that are applied to formulate queries to a corpus of data provided by a content creator; wherein the input questions are received by the content creator from information handling systems;”
Hino teaches a method of processing documents by analyzing the structure of table data [0002], but does not disclose how to apply it in a question answering system; however, as referenced in the instant specification [0029]-[0031], IBM Watson QA system (i.e., content creator) constructs a corpus of data, takes natural language questions from users, and deeply analyzes the breadth of relevant content from the corpus to more precisely answer and justify answers to user’s questions (Ferrucci: pp. 12/26).
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Ferrucci. One having ordinary skill in the art would have found motivation to incorporate Hino’s method in Ferrucci’s system to enable deep analysis of tabular data in the relevant content.
Claim 7 further recites “the input questions directed to: calculating pair wise comparisons in a two dimensional table, between cells in each row or column with cells in columns or rows that are orthogonal to each row or column in a two dimensional table that is part of the corpus of data to derive a score as to changes required to derive a value of one cell to the value of the other cell, wherein a score is given as to the changes, for each pair wise comparison;”
Hino calculates similarity between every pair of corresponding cells (i.e., pairwise) in the same column of two rows (i.e., orthogonal column), by comparing them in character similarity [0058] and character type similarity [0066]. Hino calculates character similarity between a pair of string cells based on the standard Levenshtein distance (i.e., score) in information theory, which is the number of steps (i.e., changes) required to produce (i.e., derive) the second string by inserting, replacing, deleting, or adding a character to the first string [0059].
Hino does not disclose if this comparison is performed between all pairs of rows; however, Wang compares data types between every pair of corresponding cells in the same column of any two rows, and identifies all rows above a threshold as candidate headers (Wang: sec. 3.2, para. 2).
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Wang. One having ordinary skill in the art would have found motivation to use Hino’s similarity calculation to determine similarity between corresponding cells of all pairs of rows in Wang.
Claim 7 further recites “summing for each row or column, pairwise comparison scores of each of the orthogonal columns or rows to derive a summed column or row score;” In Hino, overall similarity between every pair of corresponding cells in the same column of two rows is a weighted sum of character similarity and character type similarity ([0073]).
Claim 7 further recites “adding summed column or rows of the orthogonal columns or rows to derive a value for each row or column;” Hino calculates series similarity between two rows as a weighted average (i.e., weighted sum divided by the length of rows) of overall similarity between all pairs of corresponding cells in the two rows ([0079]).
Claim 7 further recites “wherein the values are scaled using a min-max scaler to normalized values between 0 and 1; and”.
Hino calculates similarity between a pair of rows based on the standard Levenshtein distance in information theory [0059]. Hino does not disclose normalizing Levenshtein distance; however, feature scaling such as min-max scaling is a simple and standard data processing method to normalize data features to a fixed range of 0 to 1 (Raschka: pp. 3/20). One having ordinary skill in the art would have found motivation to utilize min-max scaling to normalize Hino’s similarity scores into metric values between 0 and 1.
Claim 7 further recites “performing a relative comparison of the normalized values of each of the rows or columns to determine likelihood of headers in each row or column.” Hino determines that a pair of rows contain the boundary between a header row and a data row if its series similarity is below a threshold – the two rows are significantly different ([0078]).

Claim 8 recites “The system of claim 7, wherein: the two dimensional table is derived from a PDF or HTML document.” Hino acquires table data from HTML documents ([0050]).
Claim 14 is analogous to claim 8, and is similarly rejected.

Claim 9 recites “The system of claim 7, wherein: the two dimensional table includes spanning rows or columns.” Hino's method applies to two rows that share data items (i.e., spanning rows or columns) (fig. 3-4; [0081] and [0084]).
Claim 18 is analogous to claim 9, and is similarly rejected.

Claim 10 recites “The system of claim 7, wherein: the pair wise comparison is a Boolean comparison between cells.” Hino does not disclose this claim; however, Wang's similarity score of a pair of cells includes a fact (i.e., Boolean value) whether the two cells have different data types (Wang: sec. 3.2, para. 2).
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Wang. One having ordinary skill in the art would have found motivation to incorporate Wang’s fact on data type difference into Hino’s similarity calculation between pairs of cells.
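As an illustration of a Boolean comparison on data types, a minimal sketch follows. The type categories ("empty", "number", "text") and the function names `infer_type` and `types_differ` are assumptions made for this example; Wang's actual type inference is not reproduced here.

```python
def infer_type(cell: str) -> str:
    """Classify a cell as 'empty', 'number', or 'text' (hypothetical categories)."""
    text = cell.strip()
    if not text:
        return "empty"
    try:
        float(text.replace(",", ""))  # tolerate thousands separators
        return "number"
    except ValueError:
        return "text"

def types_differ(a: str, b: str) -> bool:
    """Boolean comparison: True when the two cells have different data types."""
    return infer_type(a) != infer_type(b)
```

Under this sketch, a header cell such as "Price" compared with a data cell such as "12.50" yields True, while two numeric data cells yield False.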

Claims 6 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Hino and Wang as applied to claims 1 and 7 above respectively, and further in view of US patent application 2014/0369602 Meier et al. [herein “Meier”].
Claim 6 recites “The method of claim 1, wherein: performing the pair wise calculation for the same cell pairs implements dynamic programming call up previously determined cell pair calculation.” Hino does not disclose this claim; however, Meier uses dynamic programming in processing tabular data to find the best match string format based on two scores: edit distance and OCR error matrix.
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Meier. One having ordinary skill in the art would have found motivation to utilize Meier’s dynamic programming in computing Hino’s character similarity between a pair of cells.
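For illustration, calling up a previously determined cell pair calculation can be sketched with a memoized edit distance. This is a generic top-down dynamic programming sketch using Python's `functools.lru_cache`, not Meier's method; the function name `edit_distance` is an assumption, and the recursion is suitable only for short cell strings.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def edit_distance(a: str, b: str) -> int:
    """Edit distance computed top-down; results for each (a, b) pair,
    including intermediate prefix pairs, are cached and reused."""
    if not a:
        return len(b)  # insert every character of b
    if not b:
        return len(a)  # delete every character of a
    cost = 0 if a[-1] == b[-1] else 1
    return min(edit_distance(a[:-1], b) + 1,            # delete last char of a
               edit_distance(a, b[:-1]) + 1,            # insert last char of b
               edit_distance(a[:-1], b[:-1]) + cost)    # replace or match
```

A second call with the same cell pair is answered from the cache rather than recomputed, which can be observed through `edit_distance.cache_info()`.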
Claim 12 is analogous to claim 6, and is similarly rejected.

Claims 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hino and Wang as applied to claim 13 above, and further in view of US patent application 2019/0370540 Freed et al. [herein “Freed”].
Claim 19 recites “The non-transitory, computer-readable storage medium of claim 13, wherein the computer executable instructions are deployable to a client system from a server system at a remote location.” Hino does not disclose this claim; however, in Freed, the program code may be executed (i.e., deployed) on the user's computer (i.e., client system) connected to a remote computer (i.e., server system) [0020].
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Freed. One having ordinary skill in the art would have found motivation to implement Hino’s table structure analyzing method in a client-server system taught by Freed.

Claim 20 recites “The non-transitory, computer-readable storage medium of claim 13, wherein the computer executable instructions are provided by a service provider to a user on an on-demand basis.” Hino does not disclose this claim; however, in Freed, the program code may be executed entirely on a remote computer, connected to the user's computer through the Internet using an Internet Service Provider [0020].
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Hino with Freed. One having ordinary skill in the art would have found motivation to offer Hino’s table structure analyzing method as a service provided to users on-demand as taught by Freed.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHELLY X. QIAN whose telephone number is (408)918-7599. The examiner can normally be reached Monday - Friday 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi can be reached on (571)272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SHELLY X QIAN/Examiner, Art Unit 2163



/ALEX GOFMAN/Primary Examiner, Art Unit 2163