DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-7, 10-13 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of USPNs 10,616,443 to Lund and 10,810,473 to Joseph et al.

	With regard to claim 1, Lund discloses a computer-implemented method for correcting an orientation of an image data object (col. 4, lines 3-6: "FIG. 1 depicts a flow chart illustrating an example of AI-augmented method 100 for document orientation detection and auto-rotation that can be performed by an auto-rotation module on a user device."), the computer-implemented method comprising: 
	receiving, by one or more processors, an original image data object (col. 5, lines 12-27: "The on-device ML approach will now be described with reference to FIGS. 3A-4. FIG. 3A depicts a diagrammatic representation of an original paper document. As illustrated in FIG. 4, user device 410 in networking computing environment 400 may include camera 401, application 420, and on-device auto-rotation module 430 with single-layer neural network 440. A user of user device 410 can use camera 401 (directly or through application 420) to take a picture of the original paper document shown in FIG. 3A. With the proliferation of smart mobile devices such as smartphones, tablets, etc., user device 410 can be any suitable Internet-enabled mobile device with a built-in camera. Camera 401 can store the picture of the original paper document on user device 410 and/or provide the picture of the original paper document to on-device auto-rotation module 430 (directly or through application 420) as a color image.");
	generating, by the one or more processors applying an optical character recognition (OCR) process, initial machine readable text for the original image data object (col. 5, lines 50-65: "In some embodiments, the monochrome image is segmented to produce one bounding box (e.g., a square or a rectangle) for each connected segment of black pixels. In the example of FIG. 3C, the word “Text” is segmented into a bounding box for “T,” a bounding box for “e,” and a bounding box for “xt,” with the “x” and the “t” connected. The bounding boxes can and often will overlap, but their corresponding segments do not “touch” one another. Image segmentation can be done using block detection software (“block detector”). Any suitable block detector can be used, so long as the block detector can generate image segments from a black-and-white image. If the entire black-and-white image consists of nothing but text (i.e., no photographs, lines, or other non-textual elements), each bounding box would contain one to a few letters. In practice, however, it is often the case that a captured image will contain non-textual information. Accordingly, in some embodiments, auto-rotation module 430 is operable to convert an image captured by camera 401 into a monochrome image, call a [...] bounding boxes that contain non-textual information."); 
generating, by the one or more processors and using one or more machine learning models, an initial quality score for the initial machine readable text; 
determining whether the initial quality score satisfies one or more quality criteria (col. 6, lines 49-51: "Further, on-device single-layer neural network 440 is trained on text snippets to identify the orientation of the text (e.g., what text is or is not right side up)." col. 7, lines 60-67: "Once the input images (each of which corresponds to a textual snippet) are prepared, they are provided as input to on-device single-layer neural network 440. In turn, on-device single-layer neural network 440 processes each snippet and outputs a set of four values (weights that add up to 1). Each value represents a probability that the snippet must be rotated 0°, 90°, 180°, or 270° in order to be correctly oriented.");
responsive to determining that the initial quality score does not satisfy the one or more quality criteria, generating a plurality of rotated image data objects each corresponding to a different rotational position, wherein each of the plurality of rotated image data objects comprise the original image data object rotated to a corresponding rotational position; 
	generating, by the one or more processors, a rotated machine readable text data object for each of the plurality of rotated image data objects, wherein each of the rotated machine readable text data objects are stored in association with corresponding rotated image data objects (Lund, col. 9, lines 15-41: "In some embodiments, auto-rotation module 430 is operable to examine the results provided by on-device single-layer neural network 440. This examination can determine what results should be counted in determining an overall orientation of a document image. In some embodiments, this can be done by comparing each output value (which represents a probability associated with a particular orientation) with a pre-determined [...]"); 
generating, by the one or more processors and using one or more machine learning models, a rotated quality score for each of the rotated machine readable text data objects; 
determining that a first rotated quality score of the rotated quality scores satisfies the one or more quality criteria, wherein the first rotated quality score corresponds to a first rotated machine readable text data object of the rotated machine readable text data objects (In the example of Table 1, the fact that a few of the results for the input image of “e” are less than 0.1 percent and one is almost 0.9 indicates that on-device single-layer neural network 440 is quite confident that the input image needs to be rotated to the right (one turn, by 90°). However, this level of confidence may not be enough. The sensitivity of confidence can be configured by setting the threshold accordingly. Suppose the threshold is 0.9 for all letters, even though on-device single-layer neural network 440 is quite confident that the input image needs to be rotated to the right, the output value associated with that orientation is 0.89, which is less than the given threshold. Accordingly, output values associated with this snippet “e” will be discarded and not used by auto-rotation module 430 to make a decision on the overall orientation of the document).
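
The thresholding described in the passage of Lund quoted above can be illustrated with a minimal sketch. It assumes that each snippet yields four probabilities (for rotations of 0°, 90°, 180° and 270°) that sum to 1, and that a snippet counts toward the overall orientation only if its highest probability is at least the threshold. The function names, the majority-vote aggregation and the sample values are assumptions made for illustration and are not taken from Lund.

```python
from collections import Counter

ANGLES = (0, 90, 180, 270)  # rotation needed to make a snippet upright


def confident_rotation(probs, threshold=0.9):
    """Return the rotation for one snippet, or None if the snippet is discarded."""
    best = max(range(len(ANGLES)), key=lambda i: probs[i])
    # A snippet whose best probability falls below the threshold is not counted.
    return ANGLES[best] if probs[best] >= threshold else None


def overall_rotation(snippet_probs, threshold=0.9):
    """Majority vote over the snippets that pass the threshold (assumed rule)."""
    votes = Counter()
    for probs in snippet_probs:
        angle = confident_rotation(probs, threshold)
        if angle is not None:
            votes[angle] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Under these assumptions, a snippet scored 0.89 for a 90° rotation is discarded at a threshold of 0.9, which mirrors the "e" example in the quoted passage.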
	Lund does not explicitly disclose the step of providing the first rotated machine readable text data object to a natural language processing (NLP) engine.
Natural language processing is a well-known technique in the art for supplementing OCR in identifying the correct recognition. Joseph teaches a text orientation determination similar to that of Lund and further teaches that natural language processing (NLP) techniques are used to convert a test script description into a sequence of lexical streams containing natural words, phrases, syntactic markers, etc., for better understanding of its contents (column 9, lines 55-67). Therefore it would have been obvious to one of ordinary skill in the art before the [...]

With regard to claim 4, Lund discloses wherein generating a plurality of rotated image data objects comprises: 
generating a first rotated image data object comprising the original image data object rotated to a first rotational position; 
generating a second rotated image data object comprising the original image data object rotated to a second rotational position; 
generating a third rotated image data object comprising the original image data object rotated to a third rotational position; and 
storing each of the first rotated image data object, the second rotated image data object, and the third rotated image data object in association with the original image data object (column 7, lines 60-67; angles of 0°, 90°, 180° and 270° are examined for performing OCR for each orientation).
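
The rotation and storage steps recited in claim 4 can be sketched as follows. This is an illustrative sketch only: it assumes the image is a row-major list of pixel rows and that the three rotational positions are 90°, 180° and 270°. The function names and the dictionary layout used to associate the rotated copies with the original are hypothetical and appear in neither the claims nor Lund.

```python
def rotate_90_clockwise(pixels):
    """Rotate a row-major 2D grid of pixel values by 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise rotation.
    return [list(row) for row in zip(*pixels[::-1])]


def rotated_copies(original):
    """Store three rotated copies in association with the original image."""
    rotations = {}
    current = original
    for angle in (90, 180, 270):
        current = rotate_90_clockwise(current)
        rotations[angle] = current
    return {"original": original, "rotations": rotations}
```

For example, `rotated_copies([[1, 2], [3, 4]])["rotations"][90]` evaluates to `[[3, 1], [4, 2]]`.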

With regard to claim 5, Lund discloses wherein generating an initial quality score for the initial machine readable text comprises: 
generating text metadata comprising text summarization metrics for the initial machine readable text; 
processing the text metadata using one or more machine learning models to generate the initial quality score and associating the initial quality score with the initial machine readable text (column 8, lines 18-54, Lund discloses that scores are calculated for the OCR recognition processing and that the neural network is used to compute the scores).

With regard to claim 6, Lund discloses wherein the text summarization metrics comprise one or more of: 
a count of words not evaluated within the initial machine readable text; 
a count of words evaluated within the initial machine readable text; 
a count of words within the initial machine readable text not found in a dictionary;
a count of words within the initial machine readable text found in the dictionary; 
a count of words within the initial machine readable text; or 
a count of space characters within the initial machine readable text (column 8, line 55-column 9, line 67; Lund discloses that confidence probabilities for different letters are determined for each orientation and that the orientation is determined based on the text scores).
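
Several of the text summarization metrics listed above can be illustrated with a short sketch. It is not drawn from Lund. It assumes that words are separated by whitespace, that surrounding punctuation is ignored, and that the dictionary is a set of lowercase words. The function name and the returned keys are hypothetical, and the "evaluated" and "not evaluated" counts are omitted because the source does not define them.

```python
import string


def text_summary_metrics(text, dictionary):
    """Compute simple counts over OCR output; `dictionary` is a set of lowercase words."""
    # Split on whitespace and strip surrounding punctuation from each token.
    words = [w.strip(string.punctuation) for w in text.split()]
    words = [w for w in words if w]
    found = sum(1 for w in words if w.lower() in dictionary)
    return {
        "word_count": len(words),
        "words_in_dictionary": found,
        "words_not_in_dictionary": len(words) - found,
        "space_count": text.count(" "),
    }
```

Such counts could serve as the text metadata that a machine learning model consumes to produce a quality score, although the sketch stops at computing the counts.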

With regard to claim 7, the discussion of claim 1 applies.

With regard to claims 10-12, the discussion of claims 4-6 applies, respectively.

With regard to claim 13, the discussion of claim 1 applies.

With regard to claims 16-18, the discussion of claims 4-6 applies, respectively.




Claims 2, 8 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of USPNs 10,616,443 to Lund and 10,810,473 to Joseph et al. and further in view of 2011/02858195 to Welling et al.

With regard to claim 2, Welling teaches wherein generating an initial quality score comprises: identifying one or more words within the initial machine readable text based at least in part on a machine-learning model for identifying spaces between words; 
comparing each of the one or more words identified within the initial machine readable text against words within a dictionary retrieved for checking spelling within the initial machine readable text; 
generating a spelling error detection rate for the initial machine readable text; determining the initial quality score based at least in part on the spelling error detection rate for the initial machine readable text (paragraphs [0144], [0162]-[0164] and [0170]). Welling teaches that spelling is determined by matching words to a dictionary and that the words that match entries in the dictionary are passed to the SVM classification system. It would have been obvious to one of ordinary skill in the art before the time of filing to use a dictionary to check the spelling of OCR-recognized text in order to judge the effectiveness of the recognition, as taught by Welling, in combination with the text recognition of Lund and the NLP recognition of Joseph.
With regard to claims 8 and 14, the discussion of claim 2 applies.
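
The dictionary-based spelling check discussed for claim 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the words have already been identified (the claimed machine-learning model for finding spaces between words is not modeled), the quality score is taken as one minus the spelling error rate, and the 0.8 cutoff is arbitrary. None of the names or values come from Welling, Lund or Joseph.

```python
def spelling_error_rate(words, dictionary):
    """Fraction of words that are not found in the dictionary."""
    if not words:
        return 1.0  # assumption: no recognized words is treated as the worst case
    misses = sum(1 for w in words if w.lower() not in dictionary)
    return misses / len(words)


def quality_score(words, dictionary):
    """Assumed scoring rule: a lower error rate gives a higher score."""
    return 1.0 - spelling_error_rate(words, dictionary)


def meets_quality(words, dictionary, min_score=0.8):
    """Check the score against a hypothetical quality criterion."""
    return quality_score(words, dictionary) >= min_score
```

With this rule, OCR output in which one of four words is missing from the dictionary scores 0.75 and would fail a 0.8 criterion, which is the condition under which the claimed method goes on to generate rotated images.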


Claims 3, 9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of USPNs 10,616,443 to Lund and 10,810,473 to Joseph et al. and further in view of 2011/02858195 to Welling et al. and 8,224,131 to Maekawa.

With regard to claim 3, Maekawa teaches identifying, within metadata associated with the original image data object, a language associated with the original image data object; and retrieving the dictionary based at least in part on the language associated with the original image data object (Fig. 8, step S201 and column 4, lines 11-51). Maekawa determines which language is present and accordingly uses the corresponding dictionary in performing character recognition. Therefore it would have been obvious to one of ordinary skill in the art before the time of filing to use the appropriate language dictionary for performing character recognition, as taught by Maekawa, in order to most efficiently perform OCR in the systems of Lund, Joseph and Welling.

With regard to claims 9 and 15, the discussion of claim 3 applies.
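
The language-dependent dictionary retrieval discussed for claim 3 can be illustrated with a short sketch. It assumes that the image metadata is a mapping with an optional "language" key and that dictionaries are held in memory as sets of words. The language codes, the word lists, the fallback to a default language and the function name are all hypothetical and are not taken from Maekawa.

```python
# Hypothetical in-memory dictionaries keyed by language code.
DICTIONARIES = {
    "en": {"the", "quick", "brown", "fox"},
    "de": {"der", "schnelle", "braune", "fuchs"},
}


def dictionary_for(metadata, default_language="en"):
    """Pick the dictionary that matches the language named in the image metadata."""
    language = metadata.get("language", default_language)
    # Fall back to the default dictionary when the language is unknown (assumed behavior).
    return DICTIONARIES.get(language, DICTIONARIES[default_language])
```

The selected dictionary would then be the one used for the spelling comparison, so that text in one language is not scored against the vocabulary of another.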



Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WESLEY J TUCKER whose telephone number is (571)272-7427. The examiner can normally be reached 9AM-5PM Monday-Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAN PARK, can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


WESLEY J. TUCKER
Primary Examiner
Art Unit 2669



/WESLEY J TUCKER/Primary Examiner, Art Unit 2669