DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/13/2020 is being considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 5-7, 12-14, and 18-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claims 5, 12, and 18, each claim ends with “or a combination thereof”, leading one of ordinary skill in the art to conclude that options were previously recited in the alternative. However, while three different actions are recited, they are not worded in the alternative, which creates confusion as to how the claims should be interpreted. In addition, the first option recited is that the document manager is to dynamically produce structure given a table boundary update. This is confusing because it is unclear what structure is being referred to and what exactly a table boundary update is. It is also unclear how a label correction as currently claimed would lead to structure being produced, or what the structure is being produced for (e.g., a document in the clusters, a new document, etc.). The second option recited is for the document manager to intelligently split one or more cells along a text line and cell bounding box of a neighboring cell. This is also confusing because the cells have not been previously recited, so it is unclear what they are, and it is unclear what the text line refers to. As with the first option, it is unclear whether this is intended for a previous document in the clusters, a new document, etc., and how it is related to the label correction. The final option is for the document manager to dynamically apply a character correction to one or more similar characters. As with the other two options, it is unclear what the character correction is applied to and how it is related to the label correction.

Regarding claims 6, 13, and 19, the first clause of each claim recites “the ANN to generate output data classifying interpretation of the extracted one or more structures and location”. It is unclear what a “classification” of an interpretation is and what an interpretation of extracted structures and location would be. Furthermore, the second clause recites “the cluster manager to leverage the classification for the selective document evaluation, including order documents for intra-cluster and inter-cluster evaluation”. However, it is unclear what “order documents” are; how “inter-cluster evaluation” could be performed if, for instance, only one cluster were chosen by the evaluator, which is within the scope of the independent claims; and what the full extent of an “intra-cluster evaluation” would be (e.g., is it how similar the documents are to each other, and if so, how would “label correction” be used in this evaluation?). Dependent claims 7, 14, and 20 compound these issues by referring to “the ordering”, which as stated above is unclear. In addition, each of these claims recites “inter-cluster classifications” for use in determining a heterogeneous collection of documents; however, it is unclear what such classifications are and how they would lead to a heterogeneous collection of documents being chosen.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 8, 9, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Eshghi et al., U.S. Publication No. 2022/0108065 in view of Cohen et al., U.S. Patent No. 10,803,399.

Regarding claim 1, Eshghi teaches a computer system comprising: 

a processor operatively coupled to memory (see Eshghi Figure 3, message processor 312 and storage devices 330 connected over communication layer 320); 

an artificial intelligence (AI) platform, in communication with the processor, having one or more machine learning (ML) tools (see Figure 3, document template learner 314 with various analyzers), the tools comprising: 

a document manager (see Figure 3, feature analyzer 362) configured to cause a machine learning model (MLM) (see paragraph [0032]) to subject a document collection (see Figure 1B, content objects) to table region identification within one or more discretized contiguous areas (see Figure 1B, step 4; paragraph [0026] indicates that the features can be tables, and the discretized contiguous area would be within each document); 

a cluster manager configured to subject the document collection to clustering (see Figure 3, feature analyzer 362 and document analyzer 364 with document clusters 326), including: 

leverage the MLM (see paragraph [0032]) to extract one or more structures and location of the one or more structures in the document collection (see Figure 1B, step 4 and paragraph [0044]); and 

assign documents within the document collection to one or more clusters responsive to the leveraged MLM, wherein each cluster includes one or more documents having a content characteristic (see Figure 1B, steps 5 and 6).

Eshghi does not expressly teach wherein 

the MLM is an artificial neural network (ANN); and 

wherein the tools comprise an evaluator configured to selectively evaluate a selection of documents from the one or more clusters, and apply one or more label corrections to the ANN; and the ANN configured to generate an updated document collection incorporating the applied one or more label corrections.

However, Cohen, in a similar invention in the same field of endeavor, teaches a computer system comprising: a processor operatively coupled to memory (see Cohen Figure 1, data management system 116 connected to repository 114) and an AI platform, in communication with the processor, having one or more ML tools using an MLM (see Figure 1, machine learning system 104 with various modules) to cluster documents into one or more clusters (see Figure 2, steps 200 and 202) as taught in Eshghi, wherein the tools comprise 

an evaluator (see Figure 1, supervised tuning interface 124) configured to selectively evaluate a selection of documents from the one or more clusters (see Figure 2, step 204), and apply one or more label corrections to the MLM (see Figure 2, steps 206 and 208 and column 13, lines 57-63); and 

the MLM configured to generate an updated document collection incorporating the applied one or more label corrections (see column 11, lines 53-55).

One of ordinary skill in the art before the effective filing date of the invention would have found it obvious to combine the teaching of fine-tuning the clustering of documents by an MLM via an evaluator, as taught in Cohen, with the system taught in Eshghi, the motivation being to allow online corrections to be made to the system, thereby increasing the clustering accuracy while still performing live clustering of documents. 

Eshghi in view of Cohen does not expressly teach wherein the MLM is an ANN. However, one of ordinary skill in the art before the effective filing date of the invention would have found it obvious as a matter of simple substitution to replace the MLM taught in Eshghi in view of Cohen with an ANN as claimed to yield the predictable results of successfully leveraging such a network for document analysis. 

Method claim 15 recites similar limitations as claim 1, and is rejected under similar rationale.

Independent claim 8 recites a computer program product to utilize machine learning to facilitate document processing, the computer program product comprising: a tangible computer readable storage medium having program code embedded therewith, the program code executable by a processor to perform the method of claim 15, which Eshghi in view of Cohen further teaches (see Eshghi claim 19).

Regarding claim 2, Eshghi in view of Cohen teaches all the limitations of claim 1, and further teaches wherein the clustering is based on content comprising visual or textual, or a combination thereof (see Eshghi paragraph [0044]). 

Claim 9 recites similar limitations as claim 2, and is rejected under similar rationale.

Allowable Subject Matter
Claims 3, 4, 10, 11, 16, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Although no prior art is used against claims 5-7, 12-14, and 18-20, this is not an indication that they are allowable. See MPEP 2173.06, section II, second paragraph. The 112 issues cause a great deal of confusion and uncertainty as to the proper interpretation of the limitations of the claims. It is therefore difficult for the Examiner to properly search for prior art for the invention.

NOTE: Claims 8-14 were not rejected under 35 U.S.C. 101 because paragraph [0081] of the published application explicitly excludes ineligible embodiments of the computer readable storage media to which the claims are directed.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CASEY L KRETZER whose telephone number is (571)272-5639. The examiner can normally be reached M-F 10:00-7:00 PM PDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DAVID C PAYNE, can be reached at (571)272-3024. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CASEY L KRETZER/Examiner, Art Unit 2637