DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Responsive to the communication dated 9/24/2019.
Claims 1 – 10 are presented for examination.

Priority
ADS dated 9/14/2019 claims domestic priority to provisional 62735449 dated 2018-09-24.

Information Disclosure Statement
The Application does not provide any information disclosure statements.

Drawings
The drawings dated 9/24/2019 have been reviewed. They are accepted.

Specification

The abstract dated 9/24/2019 has 76 words and no legal phraseology. It is accepted.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 – 10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception without significantly more. The claims are analyzed below.

Claim 1:
STEP 1: YES. The claim recites “a method”

STEP 2A PRONG ONE: YES.

The claim recites: “comprising: calculating a map of state change probabilities between a plurality of page types, the state change probabilities indicating at least the probability that a first page type will precede a second page type; determining, using a classifier, page type probability vectors for each of the plurality of page items; and calculating predicted page types for each of the plurality of page items based on the respective page type probability vector and the map of state change probabilities” which is a mathematical concept known as conditional probability/Probability Urn, which is a mathematical exercise in which real items of interest are represented as items selected from a container (i.e., set) and the probability of selection is calculated.

NOTE: The Examiner notes that the prior art of Orloff_2017 (page 4) explicitly teaches that calculating conditional probabilities (i.e., urn probabilities) is a “mental” exercise. Therefore; the claim may also properly be considered to be a mental process of evaluating the state change probability.
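
For illustration only, and not as a characterization of the applicant's disclosure or of any cited reference, the calculation recited in the quoted limitations can be sketched as below. The sketch assumes the "map of state change probabilities" is a table of transition probabilities between page types and that the classifier output is one probability vector per page. It combines the two with Viterbi decoding, which is a technique chosen here for the example. The page types, the numbers, and the name predict_page_types are hypothetical.

```python
import math

# Hypothetical page types; every number below is illustrative only.
PAGE_TYPES = ["cover", "contents", "text"]

# TRANSITIONS[a][b]: assumed probability that page type a immediately precedes page type b.
TRANSITIONS = {
    "cover":    {"cover": 0.05, "contents": 0.80, "text": 0.15},
    "contents": {"cover": 0.05, "contents": 0.25, "text": 0.70},
    "text":     {"cover": 0.05, "contents": 0.05, "text": 0.90},
}


def _log(p):
    """Natural log that maps a zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def predict_page_types(prob_vectors, transitions=TRANSITIONS, page_types=PAGE_TYPES):
    """Return the most likely sequence of page types (Viterbi decoding in log space).

    prob_vectors: one dict per page, mapping page type -> classifier probability.
    """
    # best[t] = (log score of the best chain ending in type t, that chain)
    best = {t: (_log(prob_vectors[0][t]), [t]) for t in page_types}
    for vec in prob_vectors[1:]:
        new_best = {}
        for cur in page_types:
            # Pick the best predecessor for the current page type.
            score, path = max(
                (best[prev][0] + _log(transitions[prev][cur]), best[prev][1])
                for prev in page_types
            )
            new_best[cur] = (score + _log(vec[cur]), path + [cur])
        best = new_best
    return max(best.values())[1]


# Hypothetical classifier output for a three-page package.
vectors = [
    {"cover": 0.90, "contents": 0.05, "text": 0.05},
    {"cover": 0.10, "contents": 0.40, "text": 0.50},  # classifier alone favors "text"
    {"cover": 0.05, "contents": 0.15, "text": 0.80},
]
print(predict_page_types(vectors))  # → ['cover', 'contents', 'text']
```

In this hypothetical input the second page would be labeled "text" from its probability vector alone. The assumed transition table makes "contents" the more likely successor to "cover", so the combined prediction differs.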

STEP 2A PRONG TWO: NO. 
The Office finds that:
While the claim recites calculating the probability of “page types”; generally linking the use of the judicial exception to a particular technological environment (i.e., page types/documents) has been found by the Court NOT to be indicative of a practical application. See MPEP 2106.05(h).
While the claim recites “receiving a document package comprising a plurality of page items” this is simply insignificant extra solution activity. The Courts have found insignificant extra solution activity to NOT be indicative of a practical application. See MPEP 2106.05(g).
While the claim recites “using a classifier”; this is merely a mathematical calculation.

Therefore; the Office concludes that there are no additional elements in the claim that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception.

STEP 2B: NO.

As outlined above the claim merely results in a calculation result. While the claim recites “receiving a document package comprising a plurality of page items” and this was found to be insignificant extra solution activity; this is recited at a high degree of generality and does not include anything other than what is well-understood, routine, conventional activity in the field. Therefore; merely receiving a set of documents (i.e., page/document package) is not found to be an inventive concept which is significantly more than the abstract idea itself.

Therefore; the claim is not eligible subject matter under 35 USC 101.

Claim 6:
STEP 1: YES. The claim recites “a system”

STEP 2A PRONG ONE: YES.
The claim recites: “comprising: calculate a map of state change probabilities between a plurality of page types, the state change probabilities indicating at least the probability that a first page type will precede a second page type; receive a document package comprising a plurality of page items; determine, using a classifier, page type probability vectors for each of the plurality of page items; and calculate predicted page types for each of the plurality of page items based on the respective page type probability vectors and the map of state change probabilities” which is a mathematical concept known as conditional probability/Probability Urn, which is a mathematical exercise in which real items of interest are represented as items selected from a container (i.e., set) and the probability of selection is calculated.

NOTE: The Examiner notes that the prior art of Orloff_2017 (page 4) explicitly teaches that calculating conditional probabilities (i.e., urn probabilities) is a “mental” exercise. Therefore; the claim may also properly be considered to be a mental process of evaluating the state change probability.

STEP 2A PRONG TWO: NO. 
The Office finds that:
While the claim recites “at least one processor, and memory including instructions that, when executed by the at least one processor, cause the system to” this is merely an instruction to implement the abstract idea on a computer, or to use the computer as a tool. Such elements are NOT indicative of a practical application. See MPEP 2106.05(f).
While the claim recites calculating the probability of “page types”; generally linking the use of the judicial exception to a particular technological environment (i.e., page types/documents) has been found by the Court NOT to be indicative of a practical application. See MPEP 2106.05(h).
While the claim recites “receiving a document package comprising a plurality of page items” this is simply insignificant extra solution activity. The Courts have found insignificant extra solution activity to NOT be indicative of a practical application. See MPEP 2106.05(g).
While the claim recites “using a classifier”; this is merely a mathematical calculation.

Therefore; the Office concludes that there are no additional elements in the claim that apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception.

STEP 2B: NO.

As outlined above the claim merely results in a calculation result. While the claim recites “receiving a document package comprising a plurality of page items” and this was found to be insignificant extra solution activity; this is recited at a high degree of generality and does not include anything other than what is well-understood, routine, conventional activity in the field. Therefore; merely receiving a set of documents (i.e., page/document package) is not found to be an inventive concept which is significantly more than the abstract idea itself.

Therefore; the claim is not eligible subject matter under 35 USC 101.





Claims 2, 7 recite: “determining a first chain of predicted page types for at least a subset of the plurality of page items; calculating a first score for the first chain based on the page type probability vectors and the map of state change probabilities; determining a second chain of predicted page types for the at least a subset of the plurality of page items; calculating a second score for the second chain based on the page type probability vectors and the map of state change probabilities; and determine that the first chain of predicted page type is more likely than the second chain based on the first score and the second score” which merely further characterizes the mathematical calculation. Such limitations are not found to be a practical application nor are they found to be significantly more than the abstract idea. Therefore, the claims are not eligible subject matter under 35 USC 101.

Claims 3, 8 recite: “identifying one or more document type field regions of a particular page item of the plurality of page items based on a respective predicted page type for the particular page item; obtaining data from the one or more document type field regions; validating the data based on at least one validation rule for the respective predicted page type; and storing the data in a database” which merely gathers and saves data. Gathering and saving data are merely extra solution activities. These elements are recited at a high degree of generality. Obtaining data and storing data in a database are conventional activities. Therefore, such limitations are not found to be a practical application nor are they found to be significantly more than the abstract idea. Therefore, the claims are not eligible subject matter under 35 USC 101.

Claims 4, 9 recite: “wherein the classifier calculates page type probability vectors using a convolutional neural network and/or optical character recognition of the respective page items” which merely further characterizes the mathematical calculation. These elements do not rely on or utilize the abstract idea. Rather they are utilized merely for performing the calculations. The convolutional neural network and optical character recognition are merely recited as a way to perform the calculation. These elements are also recited at a high degree of generality and convolutional neural networks and/or optical character recognition are conventional activities. Therefore, such limitations are not found to be a practical application nor are they found to be significantly more than the abstract idea. Therefore, the claims are not eligible subject matter under 35 USC 101.

Claims 5, 10 recite: “wherein the map of state change probabilities includes the probability that a third page type will follow the second page type” which merely further characterizes the mathematical calculation. Such limitations are not found to be a practical application nor are they found to be significantly more than the abstract idea. Therefore, the claims are not eligible subject matter under 35 USC 101.



Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


(1) Claims 1 – 10 are rejected under 35 U.S.C. 103 as being unpatentable over Behm_2008 (WO 2008/028018 A1) in view of Wiedemann_2018 (Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features, Oct 2017, Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018)) in view of Orloff_2017 (Conditional Probability, Independence and Bayes’ Theorem, Class 3, 18.05, Spring 2017).

Claim 1. Behm_2008 makes obvious “A method comprising (abstract: “a system and method are disclosed for automatically classifying images of pages of a source, such as a book into classifications such as front cover, copyright page, table of contents, text, index, etc...”): calculating a map of probabilities between a plurality of page types, the probabilities indicating at least the probability that a first page type will precede a second page type (page 9: lines 3 – 12: “... previous page is a dynamic feature which includes aggregate information. In one embodiment, the classification for a page image may be determined based on the classification of an image as a previous page. For example, a page image with a text classification most likely follows another page image with the same classification. In another embodiment, a table of observed probabilities may be constructed to provide the probability that a page image has a certain classification if it follows another page image with the same or different classification. Such a table may indicate that, for example, a page image with the classification of table of contents follows a page image with the classification of front matter 25% of the time, and a page image with the classification of front cover follows any other page image zero percent of the time...” NOTE: the table of probabilities is the map of probabilities.); receiving a document package comprising a plurality of page items (Fig. 6 block 622 “scanner/input device” receiving a document package 100 “pages”; page 1 lines 14 – 20: “... scanners equipped with automatic document feeders or scanning robots are now available that obtain digital images of pages of printed content and translate the images into computer-readable text using character recognition techniques. These “page images” may then be stored in a computing device and disseminated to users. 
Page images may also be provided from other sources, such as electronic files, including electronic files in .pdf format (Portable Document Format)...”); determining, using a classifier (Fig. 2 block 202 and 206 “classifier”; Fig 4 block 206 “classifier”; Fig. 6 block 612: “classifier”; Fig. 7 block 700: “classifier”), page type probability (page 5 lines 1 – 2: “... Bayesian classifier, which is well known in the art as a probability based method for classifying the outcome of an experiment...”; page 9 lines 7 – 9: “... probabilities may be constructed to provide the probability that a page image has a certain classification if it follows another page image...”; page 11 line 31: “... the probability that the respective classification criterion 702 correctly identifies the page image being classified by the classifier 700 as having the page image classification...”; page 4 lines 10 – 12: “... the classification may be repeated on the same page image if the probability that the page image has the determined classification falls short of a desired probability threshold...”) for each of the plurality of page items (Fig. 4 block 102 “page images”; Fig. 6 block 11 “pages”; page 3 lines 13 – 17: “... cover, copyright page... table of contents... text... index...”); and calculating predicted page types for each of the plurality of page items based on the respective page type probability and the map of probabilities” (Fig. 4 final page image classification 210; fig. 6 page image classification data 614; Fig. 7 page classification 706; Fig. 8 multi-page classification subroutine 1000; page 9: lines 3 – 12: “... previous page is a dynamic feature which includes aggregate information. In one embodiment, the classification for a page image may be determined based on the classification of an image as a previous page. For example, a page image with a text classification most likely follows another page image with the same classification. 
In another embodiment, a table of observed probabilities may be constructed to provide the probability that a page image has a certain classification if it follows another page image with the same or different classification. Such a table may indicate that, for example, a page image with the classification of table of contents follows a page image with the classification of front matter 25% of the time, and a page image with the classification of front cover follows any other page image zero percent of the time...”)
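
For illustration only, a "table of observed probabilities" of the kind quoted above could be estimated by counting adjacent page classifications in previously labeled page sequences. The sketch below is an assumption about one way to build such a table and is not taken from Behm_2008. The function name estimate_transition_table and the sample sequences are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transition_table(labeled_sequences):
    """Estimate P(current page type | previous page type) from labeled sequences.

    labeled_sequences: iterable of lists, each list holding the page types of one
    document in page order.
    """
    pair_counts = defaultdict(Counter)
    for sequence in labeled_sequences:
        # Count each (previous type, current type) pair of adjacent pages.
        for prev, cur in zip(sequence, sequence[1:]):
            pair_counts[prev][cur] += 1

    table = {}
    for prev, counts in pair_counts.items():
        total = sum(counts.values())
        # Normalize counts so each row sums to 1.
        table[prev] = {cur: n / total for cur, n in counts.items()}
    return table


# Hypothetical labeled documents (illustrative data only).
documents = [
    ["front cover", "front matter", "table of contents", "text", "text"],
    ["front cover", "front matter", "text", "text", "index"],
]
table = estimate_transition_table(documents)
print(table["front matter"])
```

A page type that is never observed as a successor, such as "front cover" in this sample, simply has no entry in any row, which corresponds to an observed probability of zero.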

While Behm_2008 teaches to calculate the probability of the next page given the classification of the current page and while this may properly be found to make obvious to one of ordinary skill in the art the limitation of “state change”, because this teaches the change in the state of the probability given the knowledge of the previous page, Behm_2008 does not explicitly recite “state change” probability.

Further; while Behm_2008 teaches to use a “linear combinator classifier or other classifier, such as a Bayesian classifier” (page 14); and while one of ordinary skill in the art may properly be found to infer from this teaching the limitation of a probability “vector” because linear classifiers are known to utilize vectors, Behm_2008 does not explicitly recite “vector.”

Wiedemann_2018; however, does teach “(linear) support vector machines (SVM)” (page 3675) and to “use linear text classification” where “we rely on SVM with a linear kernel” (page 3677) and to classify using previous page information with an SVM classification (page 3678) for page type classification, and that probability distributions are used as feature vectors comprising latent semantics of the modeled documents (page 3677). Therefore; Wiedemann_2018 teaches to use support vector machines which classify pages into page types using probability vectors.

Behm_2008 and Wiedemann_2018 are analogous art because they are from the same field of endeavor, namely calculating probabilities. Before the effective filing date it would have been obvious to a person of ordinary skill in the art to combine Behm_2008 and Wiedemann_2018. The rationale for doing so would have been that Behm_2008 teaches to calculate the probability of a second page given the occurrence of a first/previous page and to do this using a classifier. Wiedemann_2018 teaches to use classifiers that use vectors. Therefore, it would have been obvious to combine Behm_2008 and Wiedemann_2018 for the benefit of using vectors that incorporate the probability distributions of the semantics in the documents to obtain the invention as specified in the claims.

While Behm_2008 and Wiedemann_2018 both teach the notion of the change in probability when the type of the previous page is known (i.e., conditional probability) and while this implies a state change to one of ordinary skill in the art, Behm_2008 and Wiedemann_2018 do not explicitly recite “state change.”

Orloff_2017; however, makes obvious “state change” of conditional probabilities (page 1: “... conditional probabilities answer the question ‘how does the probability of an event change if we have extra information’... P(A|B)...”; pages 5 – 6, section 5, illustrate a conditional probability tree which explicitly illustrates the change in the state of probability. NOTE: the trees clearly illustrate the change in probability going from R1 to R2 or R1 to G2, for example. This is interpreted in view of FIG. 2 of the instant application. Page 2 illustrates a conditional probability state change map. Pages 10 and 11 also illustrate state change tables/maps.)
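
For illustration only, the kind of conditional probability tree discussed above can be reproduced with a short calculation. The sketch assumes a plain urn drawn twice without replacement. It is not asserted to be the specific example in Orloff_2017. The urn contents and the function name two_draw_tree are hypothetical, and the R1/G1/R2/G2 labels merely follow the naming used in the NOTE above.

```python
from fractions import Fraction


def two_draw_tree(red, green):
    """Return the probability of each two-draw chain, drawing without replacement.

    Keys are chain labels such as "R1G2" (red first, green second).
    Assumes red + green >= 2.
    """
    total = red + green
    first = {"R": Fraction(red, total), "G": Fraction(green, total)}
    chains = {}
    for c1, p1 in first.items():
        # Urn contents after the first ball is removed.
        r = red - (1 if c1 == "R" else 0)
        g = green - (1 if c1 == "G" else 0)
        second = {"R": Fraction(r, total - 1), "G": Fraction(g, total - 1)}
        for c2, p2 in second.items():
            # Multiplication rule: P(c1 then c2) = P(c1) * P(c2 | c1).
            chains[f"{c1}1{c2}2"] = p1 * p2
    return chains


# Hypothetical urn with 3 red and 2 green balls.
for chain, p in two_draw_tree(3, 2).items():
    print(chain, p)
```

Each leaf of the tree is the product of the branch probabilities leading to it, and the four leaves sum to 1. The probability of the second draw changes depending on the outcome of the first, which is the change the rejection refers to.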

Behm_2008 and Wiedemann_2018 and Orloff_2017 are analogous art because they are from the same field of endeavor, namely calculating conditional probabilities. Before the effective filing date it would have been obvious to a person of ordinary skill in the art to combine Behm_2008 and Orloff_2017. The rationale for doing so would have been that Behm_2008 teaches calculating conditional probabilities (page 9 lines 3 – 13) and Orloff_2017 teaches how to perform conditional probability calculations. Additionally, Orloff_2017 teaches that “urn problems” are to be applied to “objects of real interest” and Behm_2008 teaches that scanned document pages are such real objects of interest.
Therefore, it would have been obvious to combine Behm_2008 and Orloff_2017 for the benefit of calculating conditional probabilities to obtain the invention as specified in the claims.

Claim 6. The limitations of claim 6 are substantially the same as those of claim 1. Therefore, claim 6 is rejected due to the same reasons as outlined above for claim 1. Additionally, Behm_2008 makes obvious the further limitations of “A system, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the system to” (Fig. 6: “processor” 602, “memory” 620, “operating system”, “OCR/Image processing application” 610, “classifier” 612).

Claims 2, 7. Behm_2008 and Wiedemann_2018 and Orloff_2017 make obvious all the limitations of claims 1 and 6 as outlined above. Orloff_2017 makes obvious “determining a first chain of predicted page types for at least a subset of the plurality of page items; calculating a first score for the first chain based on the page type probability vectors and the map of state change probabilities; determining a second chain of predicted page types for the at least a subset of the plurality of page items; calculating a second score for the second chain based on the page type probability vectors and the map of state change probabilities; and determine that the first chain of predicted page type is more likely than the second chain based on the first score and the second score” (pages 5 – 6, section 5: the ends of the “trees” demonstrate the probabilities of occurrence. For example, in section 5.1, G1 followed by G2 is the least likely chain of events while R1 followed by R2 is the most likely chain of events.)
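
For illustration only, the comparison of two candidate chains recited in the quoted limitation can be sketched as a product of per-page probabilities and transition probabilities. This scoring rule is an assumption made for the example and is not drawn from the claims, the specification, or any cited reference. The names chain_score, transitions, and vectors, and all numbers, are hypothetical.

```python
def chain_score(chain, prob_vectors, transitions):
    """Score a candidate chain of page types.

    The score multiplies each page's classifier probability for its assigned type
    by the transition probability from the preceding page's assigned type.
    """
    score = prob_vectors[0][chain[0]]
    for i in range(1, len(chain)):
        score *= transitions[chain[i - 1]][chain[i]] * prob_vectors[i][chain[i]]
    return score


# Hypothetical two-type transition map and classifier output for two pages.
transitions = {
    "cover": {"cover": 0.10, "text": 0.90},
    "text":  {"cover": 0.05, "text": 0.95},
}
vectors = [
    {"cover": 0.8, "text": 0.2},
    {"cover": 0.4, "text": 0.6},
]

first_chain = ["cover", "text"]
second_chain = ["cover", "cover"]
first_score = chain_score(first_chain, vectors, transitions)    # 0.8 * 0.9 * 0.6
second_score = chain_score(second_chain, vectors, transitions)  # 0.8 * 0.1 * 0.4
print(first_score > second_score)  # → True
```

Under these assumed numbers the first chain scores about 0.432 and the second about 0.032, so the first chain is the more likely of the two.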

Claims 3, 8. Behm_2008 and Wiedemann_2018 and Orloff_2017 make obvious all the limitations of claims 1 and 6 as outlined above. Behm_2008 also makes obvious “identifying one or more document type field regions of a particular page item of the plurality of page items based on a respective predicted page type for the particular page item; obtaining data from the one or more document type field regions (page 6 – 7: “... keywords are pre-determined keywords such as “contents”, “index”... which indicated a possible classification for the page image in which they are found... contents found in a page image increases the likelihood that the image is of a page including a table of contents... keywords... a priori or deductive knowledge. For example ISBN is a known identifier for published books... therefore, if the ISBN keyword and number appear in a page image, then the page image may be classified as the copyright page...” NOTE: the location of the keywords is a field region); validating the data based on at least one validation rule for the respective predicted page type; and storing the data in a database” (Fig. 2 “verifier” “criteria” ; Fig. 5 “verifier” “criteria”; Fig. 8 “verify”).

Claims 4 and 9. Behm_2008 and Wiedemann_2018 and Orloff_2017 make obvious all the limitations of claims 1 and 6 as outlined above. Wiedemann_2018 also makes obvious “wherein the classifier calculates page type probability vectors using a convolutional neural network and/or optical character recognition of the respective page items” (Table 1: SVM CNN+MLP; page 3677: “... combination of CNN and MLP...”; page 3678: “... we first create two separate convolutional neural networks (CNN) for binary classification of pages into either SD or ND...”; Figure 3: “CNN”).

Claims 5 and 10. Behm_2008 and Wiedemann_2018 and Orloff_2017 make obvious all the limitations of claims 1 and 6 as outlined above. Behm_2008 makes obvious “wherein the map of state change probabilities includes the probability that a third page type will follow the second page type” (abstract: “... classification for the page image based on multiple-pages and/or global criteria...”).
Wiedemann_2018 also makes obvious “wherein the map of state change probabilities includes the probability that a third page type will follow the second page type” (page 3678: “... predecessor pages... previous page...”; Figure 1: “... first page...”; Figure 2: “... subsequent pages...”).
Orloff_2017 also makes obvious “wherein the map of state change probabilities includes the probability that a third page type will follow the second page type” (pages 1 and 2 illustrate conditional probability with a first, second, and third selection. This makes obvious a third type following a second type.). Further; MPEP 2144.04 indicates that duplication of parts is not non-obvious. Merely having additional page types (i.e., third, fourth, etc.) is not non-obvious, particularly in light of Orloff_2017, which clearly teaches the fundamental principles of conditional probability, which are clearly expandable to any number of selections that are conditioned upon any and all previous selections.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN S COOK whose telephone number is (571)272-4276. The examiner can normally be reached 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamini S. Shah can be reached on 571-272-2279. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRIAN S COOK/Primary Examiner, Art Unit 2146