DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
Applicant’s Information Disclosure Statement, filed 12/09/2020, has been received, entered into the record, and considered.  See attached form PTO-1449.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-9, 11-15, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Chan; Pak On et al. (“Chan”) US 20210406266 A1 in view of GUGGILLA; Chinnappa et al. (“GUGGILLA”) US 20200073882 A1 and Burdick; Douglas Ronald (“Burdick”) US 20200042785 A1.
Regarding claim 1, Chan teaches a computer-implemented method (CIM) comprising:
receive, from a corpus of text documents, a plurality of text fragments, with the plurality of text fragments being text strings that are organized in a staggered manner as Data sources 104a and 104b through 104n may comprise data sources and/or data systems, which are configured to make data available to any of the various constituents of operating environment 100, or system 200 described in connection to FIG. 2 ([0040]).
Example system 200 includes the table recognition component 203. The table recognition component 203 is generally responsible for detecting or identifying a table (e.g., an EXCEL sheet table or relational database table) from a document (e.g., an EXCEL sheet or spreadsheet) ([0043 and 0047]);
The preprocessing component 204 is generally responsible for formatting, tagging, and/or structuring the data in the detected table in some particular way … In these embodiments, some preprocessing rules may specify to classify or predict a specific element to be sensitive or displayed only if another element in the same row has a particular header cell value. For example, if the goal is to determine whether a piece of information is sensitive, and it was only determined that birthplaces within the United States is sensitive (and not birthplaces outside of the United States), then the preprocessing component 204 can replace the birthplace of places outside of the United States with the string NA, indicating that this data will not be analyzed [0046].
identify, from the plurality of text fragments, a first arrangement of the plurality of text fragments as the element feature representation component 205 is generally responsible for extracting one or more features from one or more elements (e.g., cells or group of cells) of the table and representing the magnitude or value of those features in a particular format that is readable by computers. For example, in some embodiments, the element feature representation component 205 analyzes each word or other character sequence of a given cell and uses Natural Language Processing (NLP) to tag the character sequence with a particular Part-of-Speech (POS) tag (e.g., [“John”, proper noun]) (e.g., a first feature) and tags each character of the character sequence with a character type (e.g., for the word “John” [letter, letter, letter, letter]) (e.g., a second feature) [0050]; 
For example, a first cell of a first row may have the value “John” (e.g., corresponding to a name ID) included in a first vector and a second cell immediately adjacent to the first cell of the same row may have the value “Jake” (e.g., corresponding to John's physician) included in a second vector [0055].
responsive to the identification of the first arrangement, extracting a first set of text fragments to obtain an extracted set of text fragments as For some or each of these features, some embodiments generate or derive a feature vector that computers are configured to analyze. For example, using the illustration above, if the word “John” is the only information contained in a cell, “John” can be converted into a first feature vector via vector encoding (e.g., one hot encoding). For instance, the word “John” may be converted into the vector [1,0,0,0,0,0,0,0,0,0] [0051]; 
Each of the values “John” and “Jake” can be aggregated into a single vector [0055].
generating a relation model based upon the extracted set of text fragments, with the relation model including information indicative of a semantic relatedness value defining the extracted set of text fragments as the element feature representation component 205 aggregates each feature value of a vector based on performing a contextual linear function or otherwise combining the output (e.g., a dot product or a softmax function) where the output is a feature vector or vector space embedding. The feature vector may thus be indicative of the actual coordinates that a feature vector will be embedded in feature space. For example, using the illustration above, the encoded “John” feature vector [1,0,0,0,0,0,0,0,0,0] can be converted or encoded to an output layer vector [1,2], which is the 2-dimensional plotting coordinates in feature space [0052]; 

 Each of the values “John” and “Jake” can be aggregated into a single vector. This aggregation of entries (as well as any aggregation of other entries, such as columns) may be yet another feature vector referred to herein as “contextualized vectors.” In this way, each row and the features of a row are modeled [0055].

The decision statistic component 215 (i.e., relation model) is generally responsible for generating a decision statistic (e.g., classification prediction, regression prediction, or clustering prediction) based at least in part on functionality performed by the element feature representation component 205, the row feature aggregation component 207, the column feature aggregation component 209, the column-row feature aggregation component 211, the row-column feature aggregation component 213, and/or the concatenation component 214 … the decision statistic component 215 may classify “John” as being a “patient” (whose sensitive information is included in the table), as opposed to classifying John as the “boss” or some other person related to the patient. For instance, the same table may include the name “Jane” in the same record for “John.” However, after embodiments analyze the entire table, it may be determined that the column header of “Jane” is “supervisor,” which indicates that Jane is a supervisor of John rather than a patient [0070].
Cosine similarity is a measure of similarity between two non-zero feature vectors of an inner product space that measures the cosine of the angle between the two non-zero feature vectors [0071].
clustering the extracted set of text fragments into a first group, based, at least in part, upon the relation model as When the data includes noisy data, the preprocessing component 204 may include data binning, clustering, employing a machine learning model, and/or manual removal. For example, substantially continuous data (e.g., data from table rows) from the data can be grouped together into a smaller number of “bins” (e.g., if raw training data includes every age from 0-100, the ages may be “binned” into groups of ages at five year intervals). As another example, similar data may be grouped together (e.g., into the same cluster or class) [0048];
combining the clustered set of text fragments from the first group into composite text objects as For example, a first cell of a first row may have the value “John” (e.g., corresponding to a name ID) included in a first vector and a second cell immediately adjacent to the first cell of the same row may have the value “Jake” (e.g., corresponding to John's physician) included in a second vector. Each of the values “John” and “Jake” can be aggregated into a single vector. This aggregation of entries (as well as any aggregation of other entries, such as columns) may be yet another feature vector referred to herein as “contextualized vectors” [0055].
GUGGILLA is cited for additional support of the limitation:
generating a relation model based upon the extracted set of text fragments, with the relation model including information indicative of a semantic relatedness value defining the extracted set of text fragments as The entity and relation annotator 102 may generate, based on the aforementioned annotated training documents, the entity and relation (i.e., relation model) annotation model 106 by transforming each annotated training document of the annotated training documents into a vector representation, and generating, based on vector representations of the annotated training documents, the entity and relation annotation model. That is, the entity and relation annotator 102 may transform the entity-annotated corpus into vector representations, and set different parameters for a word2vec process and entity-embeddings … For example, once the invoice document as shown in FIG. 5A is annotated with the specified set of entities, vocabulary in the annotated invoice may be replaced by the corresponding entity name. For example, in FIG. 5A, each line or vendor segment, ‘Bill To’ and ‘ShipTo’ may be considered as chunks. All of the chunks in the historical annotated invoices may then be considered as a training corpus, and fed into the word2vec learning process where invoice ‘ABC Repair Inc.’ may be replaced by ‘VendorName’, ‘Total’ may be replaced by the entity ‘InvoiceTotal’, and ‘$’ may be replaced by the ‘Currency’ entity name … Syntactic and semantic variations of these entity names denoted entity embeddings may be learned through the semantic distribution process denoted word2vec … In this regard, the word2vec process may generate a weighted (i.e., semantic relatedness value) representation for each entity by learning contexts based on the assumption that words that occur together tend to be similar semantically ([0043, 0048, and 0049]).
clustering the extracted set of text fragments into a first group, based, at least in part, upon the relation model as The document categorizer 110 may implement a deep learning based LSTM technique to classify and/or cluster documents (e.g., invoices) into various groups and/or categories [0089].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because GUGGILLA’s teaching would have allowed Chan’s system to improve the accuracy of clustering/classification of the text fragments by assigning a score that represents the semantic distance between the terms.

Chan and GUGGILLA do not explicitly teach the step of:
optimizing an alignment of the composite text objects to obtain an organized text fragment.
Burdick, however, teaches optimizing an alignment of the composite text objects to obtain an organized text fragment as As noted above, Step 2 includes discretizing the canvas and detecting alignment. In connection with Step 2.1, finding blocks of aligned text objects, some of the text objects can be identified as left-aligned (top-aligned), right-aligned (bottom-aligned), or center-aligned with each other, either horizontally or vertically [0052].
For example, step 100 (identifying composite text objects) can include grouping text and/or characters in a PDF document into words (also referred to herein as tokens), grouping tokens into phrases, and grouping phrases into paragraphs. Additionally, step 102 (discretizing a canvas and detecting alignment(s)) can include discretizing one or more contiguous areas of the document, snapping and/or merging ruling lines (for example, ruling lines that are near each other), and detecting all aligned groups of phrase-units within each contiguous area ([0018 and 0015]).
combining the clustered set of text fragments from the first group into composite text objects as Identifying composite text objects can proceed in a bottom-up fashion, wherein all characters in the PDF document are grouped into disjoint tokens, then tokens are grouped horizontally into disjoint phrase-units, and then phrase-units are grouped vertically and horizontally into disjoint paragraphs [0136].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Burdick’s teaching would have allowed the Chan-GUGGILLA combination to improve organization of tables in an unstructured document by identifying composite text objects and discretizing a canvas and detecting alignments.

Regarding claims 2, 8, and 14, Chan further teaches wherein the semantic relatedness value defining the extracted set of text fragments is determined from a As another example, similar data may be grouped together (e.g., into the same cluster or class) [0048].
The group of feature vectors 504 (e.g., a group of entries that are each the same or similar to the entry 408) represents a row or column of an object or table. Accordingly, the input is a sequence of feature vectors and the output is a sequence of contextualized vectors 508 [0097].
The distance between any two contextualized vectors (or any feature vector described herein) or class of vectors is measured according to any suitable method. For example, in some embodiments, automated cosine (or Euclidean) distance similarity is used to compute distance. Cosine similarity is a measure of similarity between two non-zero feature vectors of an inner product space that measures the cosine of the angle between the two non-zero feature vectors. In these embodiments, no similarity is expressed as a 90 degree angle, while total similarity (i.e., the same word) of 1 is a 0 degree angle. For example, a 0.98 distance between two contextual vectors reflects a very high similarity while a 0.003 distance reflects little similarity [0071].
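
For illustration only, the cosine similarity measure quoted from Chan [0071] can be sketched as follows. This is a minimal Python sketch and is not code from Chan; the function name cosine_similarity and the sample vectors are hypothetical and chosen only to show the 0 degree and 90 degree cases mentioned in the quotation.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two non-zero vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Hypothetical sample vectors (not taken from any cited reference).
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # 0 degree angle
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # 90 degree angle
```

Under these assumptions, vectors pointing in the same direction score approximately 1 and orthogonal vectors score 0.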

Regarding claims 3, 9, and 15, Chan further teaches wherein a vector for the composite text objects is a function of the clustered set of text fragments as The row feature aggregation component 207 is generally responsible for aggregating feature vectors for some (e.g., within a threshold) or each entry in a same row or record as a particular element. Accordingly, the row feature aggregation component 207 summarizes adjacent entries on the same row of the table. This is represented by $r_{ij} = f(e_{i1}, e_{i2}, \ldots, e_{iW})_j$ (where f is some aggregation function or contextual linear operation (e.g., a BiLSTM)) [0054].
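
For illustration only, the row aggregation quoted from Chan [0054], in which a single row vector is computed as a function f of the feature vectors of the entries in that row, can be sketched as follows. This is a minimal Python sketch and is not code from Chan. Chan names a BiLSTM as one example of f; the element-wise mean used here is a simpler stand-in assumed purely for readability, and the function name aggregate_row and the sample data are hypothetical.

```python
def aggregate_row(entry_vectors):
    """Combine the feature vectors of all entries in one row into a single vector.

    Assumption: f is an element-wise mean, a stand-in for the aggregation
    function (e.g., a BiLSTM) mentioned in the quoted passage.
    """
    if not entry_vectors:
        raise ValueError("row has no entries")
    width = len(entry_vectors[0])
    if any(len(vec) != width for vec in entry_vectors):
        raise ValueError("all entry vectors must have the same length")
    count = len(entry_vectors)
    return [sum(vec[k] for vec in entry_vectors) / count for k in range(width)]


# Hypothetical row with three entries, each a 2-dimensional feature vector.
row = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
row_vector = aggregate_row(row)
```

The resulting row_vector has the same dimensionality as each entry vector and depends on every entry in the row, which is the property the rejection relies on.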

Regarding claims 5, 11, and 17, Chan does not explicitly teach wherein the combination of the clustered set of text fragments from the first group into composite text objects includes determining a physical proximity between multiple text fragments within the clustered set of text fragments from the first group.
GUGGILLA teaches a physical proximity between multiple text fragments as Syntactic and semantic variations of these entity names denoted entity embeddings may be learned through the semantic distribution process denoted word2vec … In this regard, the word2vec process may generate a weighted (i.e., semantic relatedness value) representation for each entity by learning contexts based on the assumption that words that occur together tend to be similar semantically ([0043, 0048, and 0049]).

Burdick, however, teaches wherein the combination of the clustered set of text fragments from the first group into composite text objects includes determining a physical proximity between multiple text fragments within the clustered set of text fragments from the first group as Composite text objects can be identified via one or more heuristic rules. For example, phrases can be identified from tokens (words) based on proximity: if two tokens are close to one another, they are put into the same phrase [0049].
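
For illustration only, the proximity heuristic quoted from Burdick [0049], in which tokens that are close to one another are put into the same phrase, can be sketched as follows. This is a minimal Python sketch and is not code from Burdick. The token representation (text, x_start, x_end), the function name group_tokens_into_phrases, and the max_gap threshold are all hypothetical assumptions.

```python
def group_tokens_into_phrases(tokens, max_gap=10.0):
    """Group tokens on one text line into phrases by horizontal proximity.

    tokens: list of (text, x_start, x_end) tuples (hypothetical layout).
    max_gap: assumed largest horizontal gap that still joins two tokens.
    """
    phrases = []
    current = []
    prev_end = None
    for text, x_start, x_end in sorted(tokens, key=lambda t: t[1]):
        # A gap wider than max_gap closes the current phrase.
        if prev_end is not None and x_start - prev_end > max_gap:
            phrases.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = x_end
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hypothetical tokens: two close together, one far to the right.
tokens = [("Unit", 0.0, 20.0), ("Price", 24.0, 50.0), ("Quantity", 200.0, 240.0)]
phrases = group_tokens_into_phrases(tokens)
```

Under these assumptions, "Unit" and "Price" are joined into one phrase and "Quantity" forms its own phrase.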

Regarding claims 6, 12, and 18, Chan and GUGGILLA do not explicitly teach wherein the combination of the clustered set of text fragments from the first group is done by considering the vertical or horizontal alignment of the text fragments within the clustered set of text fragments from the first group.
Burdick, however, teaches wherein the combination of the clustered set of text fragments from the first group is done by considering the vertical or horizontal alignment of the text fragments within the clustered set of text fragments from the first group as Paragraphs can be identified based on a number of features. For example, in at least one embodiment of the invention, it is expected that paragraph text-lines share the same font characteristics, are vertically close, and are left-co-aligned except for the first line, which may have an indentation [0050].
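
For illustration only, the left-co-alignment feature quoted from Burdick [0050] can be sketched as grouping text lines whose left edges agree within a tolerance. This is a minimal Python sketch and is not code from Burdick. It covers only the left-alignment feature, not font characteristics, vertical closeness, or first-line indentation. The (text, left_x) representation, the function name group_left_aligned, and the tolerance value are hypothetical assumptions.

```python
def group_left_aligned(lines, tolerance=2.0):
    """Group text lines whose left edges lie within tolerance of a group's first line.

    lines: list of (text, left_x) tuples (hypothetical layout).
    tolerance: assumed maximum left-edge difference for co-alignment.
    """
    groups = []  # each item is (anchor_x, [texts])
    for text, left_x in lines:
        for anchor_x, members in groups:
            if abs(left_x - anchor_x) <= tolerance:
                members.append(text)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((left_x, [text]))
    return [members for _, members in groups]


# Hypothetical lines forming two left-aligned columns.
lines = [("Name", 50.0), ("Alice", 50.5), ("Age", 120.0), ("30", 119.2)]
aligned_groups = group_left_aligned(lines)
```

Under these assumptions, the four lines fall into two left-aligned groups, one per column.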

Regarding claim 7, the claim recites a computer program product with similar limitations as claim 1. As such, claim 7 is rejected under the same rationale as noted above for claim 1.
Chan also teaches A machine readable storage device as the methods may also be embodied as computer-useable instructions stored on computer storage media [0021];
Computer code stored on the machine readable storage device…to perform operations as Embodiments of the disclosure may be described in the general context of computer code or machine-useable instructions, including computer-useable or computer-executable instructions, such as program modules, being executed by a computer or other machine, such as a personal data assistant, a smartphone, a tablet PC, or other handheld device. Generally, program modules, including routines, programs, objects, components, data structures, and the like, refer to code that performs particular tasks or implements particular abstract data types [0143].


Regarding claim 13, the claim recites a computer system with similar limitations as claim 1. As such, claim 13 is rejected under the same rationale as noted above for claim 1.
Chan also teaches a processor as For instance, various functions may be carried out by a processor executing instructions stored in memory [0021]; and
A machine readable storage device as the methods may also be embodied as computer-useable instructions stored on computer storage media [0021].

Claims 4, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chan; Pak On et al. (“Chan”) US 20210406266 A1 in view of GUGGILLA; Chinnappa et al. (“GUGGILLA”) US 20200073882 A1 and Burdick; Douglas Ronald (“Burdick”) US 20200042785 A1 as applied to claims 1, 7, and 13 further in view of Dong; Lijun et al. (“Dong”) US 20160191295 A1.
Regarding claims 4, 10, and 16, Chan further teaches wherein the semantic relatedness value for the composite text objects is determined through the use of a semantic As another example, similar data may be grouped together (e.g., into the same cluster or class) ([0048 and 0097]).
Cosine similarity is a measure of similarity (i.e., semantic relatedness value) between two non-zero feature vectors of an inner product space that measures the cosine of the angle between the two non-zero feature vectors [0071].
Chan, GUGGILLA, and Burdick do not explicitly teach a semantic web graph.
Dong, however, teaches wherein the semantic relatedness value … semantic web graph as The Semantics Web is discussed below. The Semantics Web uses a combination of a schema language and an ontology language to provide the capabilities of ontologies. An ontology uses a predefined, reserved vocabulary to define classes and the relationships between them for a specific area of interest, or more ([0008 and 0014]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Dong’s teaching would have allowed the Chan-GUGGILLA-Burdick combination to improve clustering of related data by determining similarity measurements between the two fragments/documents utilizing the semantics web technique.

Conclusion
The prior art made of record and not relied upon in form PTO-892 is considered pertinent to applicant's disclosure.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESLIE WONG whose telephone number is (571) 272-4120. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish K. Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESLIE WONG/Primary Examiner, Art Unit 2164