Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Objections
Applicant is advised that should claims 1-10 be found allowable, claims 11-20 will be objected to under 37 CFR 1.75 as being substantial duplicates thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).

The only difference between the claim sets is in independent claim 11, which recites:

scaling the digital image larger or smaller using the estimated character pixel height and a preferred character pixel height
instead of:
scaling the digital image using the estimated character pixel height and a preferred character pixel height

The examiner notes that scaling implies changing the size, which is necessarily either larger or smaller. As such, this element adds no meaningful difference to the claims, and the claims cover the same subject matter. The additional features of claims 2-10 and 12-20, respectively, are identical.
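For illustration only, the relationship between the two recited heights can be sketched as a single ratio. This is a minimal sketch, assuming the scale factor is the preferred character pixel height divided by the estimated character pixel height; the function names are hypothetical and are not taken from the claims or from any cited reference.

```python
def scale_factor(estimated_height, preferred_height):
    """Ratio by which the image would be resized (hypothetical helper).

    A factor above 1 enlarges the image; a factor below 1 reduces it.
    """
    if estimated_height <= 0 or preferred_height <= 0:
        raise ValueError("character pixel heights must be positive")
    return preferred_height / estimated_height


def scaled_size(width, height, factor):
    """New (width, height) in pixels after applying the factor."""
    return max(1, round(width * factor)), max(1, round(height * factor))
```

Under this assumption the same expression yields an enlargement when the estimated height is below the preferred height and a reduction when it is above.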



Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.


Claims 1-3, 5, 6, 10, 11-13, 15, 16 and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 8, 9 and 10 of U.S. Patent No. 11,176,410 in view of Guillou US 2010/0054585.

Re claim 1, claim 8 discloses a text extraction computing method, comprising: calculating an estimated character pixel height of text from a digital source image having lines of text (see claim 8 “scaling the image using an estimated character pixel height of the image text;” note that an estimated height must necessarily be calculated before it may be used); scaling the digital image using the estimated character pixel height (see claim 8 “scaling the image using an estimated character pixel height of the image text;”); binarizing the digital image; and extracting characters from the digital image using an optical character recognizer (see claim 8 “binarizing the image; extracting text from the image using an optical character recognizer;”).

Claim 8 does not expressly disclose calculating an estimated character pixel height of text from a digital source image having lines of text; scaling the digital image using the estimated character pixel height and a preferred character pixel height. Guillou discloses calculating an estimated character pixel height of text from a digital source image having lines of text and scaling the digital image using the estimated character pixel height and a preferred character pixel height (see paragraph 82: if the height is less than 75 pixels (i.e., the preferred height), it is scaled up). The motivation to combine is that “Most OCR software can only recognize text with large enough resolution. So if the height of the text is less than about 75 pixels (currently), scaling up may be needed.” Therefore, it would have been obvious to one of ordinary skill in the art to combine Guillou's scaling according to an estimated height and a preferred height with the scaling method of claim 8 to reach the aforementioned advantage.

Re claim 2, claim 8 discloses where calculating the estimated character pixel height includes: summarizing the rows of pixels into a horizontal projection, and determining a line-repetition period from the horizontal projection, and quantifying the portion of the line-repetition period that corresponds to the text as the estimated character pixel height (see claim 8 “where the estimated character pixel height is calculated by: summarizing the rows of pixels into a horizontal projection, and determining a line-repetition period from the horizontal projection, and quantifying the portion of the line-repetition period that corresponds to the image text as the estimated character pixel height;”).
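For illustration only, the first and third recited steps can be sketched as follows. This is a minimal sketch, assuming a binary image stored as a list of rows in which 1 marks ink and 0 marks background, and assuming the line-repetition period is already known; the function names and the "any ink in the row" criterion are hypothetical choices and are not taken from the claims.

```python
def horizontal_projection(image):
    """Summarize each row of pixels as the count of ink pixels in that row."""
    return [sum(row) for row in image]


def text_rows_per_period(projection, period):
    """Average number of inked rows within one line-repetition period.

    Assumption: a row belongs to the text if it contains any ink at all.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    inked_rows = sum(1 for value in projection if value > 0)
    periods = len(projection) / period
    return inked_rows / periods
```

For a page whose lines are each 3 rows of ink followed by 2 blank rows, the period is 5 rows and the portion corresponding to text is 3 rows under these assumptions.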


Re claim 3, claim 9 discloses where determining the line-repetition period uses signal autocorrelation (see claim 9).
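For illustration only, one way a period could be read from a horizontal projection by autocorrelation is sketched below. This is a minimal sketch, assuming an unnormalized autocorrelation of the mean-centered projection and taking the lag with the highest value; the function names, the `min_lag` parameter, and the search range are hypothetical and are not taken from the claims.

```python
def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of the mean-centered signal at one lag."""
    mean = sum(signal) / len(signal)
    centered = [value - mean for value in signal]
    return sum(centered[i] * centered[i + lag]
               for i in range(len(centered) - lag))


def line_repetition_period(projection, min_lag=2):
    """Lag, in rows, at which the projection best matches a shifted copy of itself."""
    max_lag = len(projection) // 2
    if max_lag < min_lag:
        raise ValueError("projection is too short to estimate a period")
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(projection, lag))
```

Because the sum is unnormalized, shorter lags that match the line spacing outweigh their multiples, which have fewer overlapping rows.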

Re claim 5, claim 8 discloses further comprising the steps of: de-skewing the digital source image by rotating the digital source image; removing distortion from the digital source image (see claim 8 “de-skewing a digital image containing image text by rotating the image; scaling the image using an estimated character pixel height of the image text; removing distortion from the image;”); and postprocessing the extracted characters (see claim 8 “postprocessing the extracted characters”).

Re claim 6, claim 10 discloses where removing distortion uses a neural network trained by a cycle generative adversarial network on a set of source text images and a set of clean text images, where the set of source text images and the set of clean text images are unpaired, and where the source text images are distorted images of text (see claim 10 “where removing the distortion from the images uses a preprocessing neural network, and where the preprocessing neural network is developed with a cycle generative adversarial network, where the cycle generative adversarial network is trained on a set of source text images and a set of clean text images, where the set of source text images and the set of clean text images are unpaired.”).

Re claim 10, claim 8 of the patent discloses where postprocessing includes a Levenshtein automaton model and a deep learning language model (see claim 8 “and where the postprocessing includes a levenshtein automaton model and a deep learning language model.”).

Re claims 11-13, 15, 16 and 20, these claims cover substantially the same subject matter as claims 1-3, 5, 6 and 10, as discussed above.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1 and 11 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Guillou US 2010/0054585.

Re claim 1, Guillou discloses a text extraction computing method, comprising: calculating an estimated character pixel height of text from a digital source image having lines of text (see paragraph 82: if the height of the character is less than 75 pixels, it is scaled up); scaling the digital image using the estimated character pixel height and a preferred character pixel height (see paragraph 82: if the height is less than 75 pixels (i.e., the preferred character height), it is scaled up); binarizing the digital image (see paragraph 83: binarization is applied to the image); and extracting characters from the digital image using an optical character recognizer (see abstract; note that OCR is applied to the binarized text).


Re claim 11, Guillou discloses a text extraction computing method, comprising: calculating an estimated character pixel height of text from a digital source image having lines of text (see paragraph 82: if the height of the character is less than 75 pixels, it is scaled up); scaling the digital image larger or smaller using the estimated character pixel height and a preferred character pixel height (see paragraph 82: if the height is less than 75 pixels, it is scaled up); binarizing the digital image (see paragraph 83: binarization is applied to the image); and extracting characters from the digital image using an optical character recognizer (see abstract; note that OCR is applied to the binarized text).
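For illustration only, the threshold-based scaling and the binarization relied upon above can be sketched as follows. This is a minimal sketch, assuming an integer nearest-neighbour enlargement and a fixed global threshold on 0-255 grayscale pixels; neither choice is stated to be Guillou's technique, and every name and default other than the 75 pixel figure is hypothetical.

```python
import math

PREFERRED_HEIGHT = 75  # pixels; the figure cited from paragraph 82


def upscale_factor(estimated_height, preferred_height=PREFERRED_HEIGHT):
    """Smallest integer factor that brings the text up to the preferred height.

    Returns 1 (no scaling) when the text is already tall enough.
    """
    if estimated_height <= 0:
        raise ValueError("estimated character pixel height must be positive")
    if estimated_height >= preferred_height:
        return 1
    return math.ceil(preferred_height / estimated_height)


def upscale(image, factor):
    """Nearest-neighbour enlargement: repeat every pixel and every row."""
    return [[pixel for pixel in row for _ in range(factor)]
            for row in image for _ in range(factor)]


def binarize(image, threshold=128):
    """Map dark pixels to 1 (ink) and light pixels to 0 (background)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]
```

In this sketch the binarized output of the enlarged image is what would be handed to an optical character recognizer.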



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Guillou in view of Hodge US 9,558,523.

Re claim 5, Guillou discloses all the elements of claim 1. Guillou does not expressly disclose further comprising the steps of: de-skewing the digital source image by rotating the digital source image; removing distortion from the digital source image; and postprocessing the extracted characters. Hodge discloses further comprising the steps of: de-skewing the digital source image by rotating the digital source image (see column 13, line 60 through column 14, line 5); removing distortion from the digital source image (see column 14, lines 5-15); and postprocessing the extracted characters (see column 14, lines 25-40). The motivation to combine is to improve the OCR accuracy rate (see column 13, lines 60-65) and to detect particular words in the text to trigger actions (see column 14, lines 25-40). Therefore, it would have been obvious before the effective filing date of the invention to combine the preprocessing of Guillou with the preprocessing of Hodge to reach the aforementioned advantage.

Re claim 15, Guillou discloses all the elements of claim 11. Guillou does not expressly disclose further comprising the steps of: de-skewing the digital source image by rotating the digital source image; removing distortion from the digital source image; and postprocessing the extracted characters. Hodge discloses further comprising the steps of: de-skewing the digital source image by rotating the digital source image (see column 13, line 60 through column 14, line 5); removing distortion from the digital source image (see column 14, lines 5-15); and postprocessing the extracted characters (see column 14, lines 25-40). The motivation to combine is to improve the OCR accuracy rate (see column 13, lines 60-65) and to detect particular words in the text to trigger actions (see column 14, lines 25-40). Therefore, it would have been obvious before the effective filing date of the invention to combine the preprocessing of Guillou with the preprocessing of Hodge to reach the aforementioned advantage.




Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Guillou in view of Hodge US 9,558,523 and further in view of Mokhtar et al., “OCR Error Correction: State-of-the-Art vs an NMT-based Approach,” 2018 13th IAPR International Workshop on Document Analysis Systems (DAS).

Re claim 10, Guillou and Hodge disclose all of the elements of claim 5. They do not disclose where postprocessing includes a Levenshtein automaton model and a deep learning language model. Mokhtar discloses where postprocessing includes a Levenshtein automaton model (see section C; note that Levenshtein distance is used to perform spelling correction suggestions) and a deep learning language model (see abstract; note that deep learning is used). The motivation to combine is to perform OCR error correction (see abstract/title). It would have been obvious before the effective filing date of the invention to combine Guillou and Hodge with Mokhtar to reach the aforementioned advantage of processing OCR results to correct errors.

Re claim 20, Guillou and Hodge disclose all of the elements of claim 15. They do not disclose where postprocessing includes a Levenshtein automaton model and a deep learning language model. Mokhtar discloses where postprocessing includes a Levenshtein automaton model (see section C; note that Levenshtein distance is used to perform spelling correction suggestions) and a deep learning language model (see abstract; note that deep learning is used). The motivation to combine is to perform OCR error correction (see abstract/title). It would have been obvious before the effective filing date of the invention to combine Guillou and Hodge with Mokhtar to reach the aforementioned advantage of processing OCR results to correct errors.
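For illustration only, spelling correction suggestions based on Levenshtein distance can be sketched as follows. This is a minimal sketch using the standard dynamic-programming edit distance rather than a Levenshtein automaton, and it is not presented as Mokhtar's implementation; the function names, the vocabulary, and the `max_distance` default are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance counting single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=1):
    """Vocabulary entries within max_distance edits of word, closest first."""
    scored = [(levenshtein(word, candidate), candidate) for candidate in vocabulary]
    return [candidate for distance, candidate in sorted(scored)
            if distance <= max_distance]
```

An OCR output such as "he1lo", in which a digit was recognized in place of a letter, would be matched to "hello" at distance 1 under this sketch.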


Allowable Subject Matter
Claims 4, 7-9, 14 and 17-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN T MOTSINGER whose telephone number is (571)270-1237. The examiner can normally be reached 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at (571)272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SEAN T MOTSINGER/Primary Examiner, Art Unit 2669