Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
References Cited in Prior Art Rejections 
The following references are cited in the prior art rejections set forth below and are referred to as noted:
Welinder et al., US 20180024974 A1, published on 2018-01-25, hereinafter Welinder.
Rashid et al., "A discriminative learning approach for orientation detection of urdu document images." In 2009 IEEE 13th International Multitopic Conference, pp. 1-5. IEEE, 2009, hereinafter Rashid.  
Gordo Soldevila et al., US 20170177965 A1, published on 2017-06-22, hereinafter Soldevila.
Fu et al., US 20190095730 A1, published on 2019-03-28, hereinafter Fu.  
Comay et al., US 20110166934 A1, published on 2011-07-07, hereinafter Comay.  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 6-7, 9-12 and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Welinder, in view of Rashid, and further in view of Soldevila.
Regarding claim 1, Welinder discloses a method, (Welinder: Abstract) comprising: 
processing a plurality of digital images utilizing a document detection neural network to identify a digital image comprising a depiction of a document; (Welinder: [0037, 0052]. “[0052] … As shown in FIG. 2, mobile computing device 106 and/or online content management system application 108 performs an act 202 of detecting a displayed document in an image frame received from a live image feed. For instance, in one or more embodiments, document enhancement system 100 analyzes the received image frame to identify the displayed document. Document enhancement system 100 can utilize a trained neural network to identify the displayed document in the image frame.”)
detecting an orientation of the document within the digital image; (Welinder: [0059-0060, 0063, 0066]. “[0059] Furthermore, as part of act 302 document enhancement system 100 also rectifies the displayed document. For example, the displayed document may be skewed due to the camera angle when the original image frame was captured (e.g., the edges of the displayed document may not be square or rectangular because the camera was not parallel to the document, or the document was not on a flat surface, etc.). Thus, document enhancement system 100 rectifies the displayed document utilizing geometric transformations to correct any skew or orientation abnormality in the displayed document.” “[0066] … document enhancement system 100 can bring the angle of the corner to ninety degrees using an affine transformation, and aligning the displayed document's edges with the vertical and horizontal directions in the rectified displayed document.” Detecting the orientation of the document is implied, because document enhancement system 100 must determine that orientation in order to rectify the displayed document by correcting it.) and 
generating computer searchable text for the depiction of the document. (Welinder: “[0085] … document enhancement system 100 can optionally perform additional procedures in combination with the enhanced document image (e.g., optical character recognition, text searching, etc.).” The “text searching” implies generation of the computer searchable text, since otherwise the system cannot perform text searching.)
Welinder does not disclose explicitly but Rashid teaches, in an analogous art, utilizing an orientation neural network to detect an orientation of the document within the digital image. (Rashid: section II describes “the proposed method for orientation detection of Urdu documents” (first paragraph in section II titled “Method Description”) by exploiting “the properties of convolutional neural network as discriminative learning model for orientation detection of scanned Urdu document images” (first paragraph in subsection II.B titled “CNN Architecture and Training Criteria”))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Welinder’s disclosure with Rashid’s teachings by combining the document enhancement method (from Welinder) with the technique of detecting document orientation using a neural network (from Rashid) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the document enhancement method would still work in the way according to Welinder and the technique of detecting document orientation using a neural network would continue to function as taught by Rashid. In fact, the inclusion of Rashid's technique of detecting document orientation using a neural network would provide a practical and/or alternative implementation of the document enhancement method and as a result would enable a better and more flexible document enhancement method due to the alternative implementation provided by Rashid.
The combination of Welinder and Rashid, or Welinder {modified by Rashid}, does not disclose explicitly but Soldevila teaches, in an analogous art, utilizing a text prediction neural network to generate computer searchable text for the depiction of the document. (Soldevila: Abstract, Figs. 1 and 4, [0057, 0074]. The “license plate transcription” is interpreted as the claimed “searchable text”. The image 210 in Fig. 4 shows an image comprising a depiction of a document.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Welinder {modified by Rashid}’s disclosure with Soldevila’s teachings by combining the document enhancement method (from Welinder {modified by Rashid}) with the technique of utilizing a text prediction neural network to generate computer searchable text (from Soldevila) to yield no more than predictable use of prior art elements according to their established functions, since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner. In particular, the document enhancement method would still work as taught by Welinder {modified by Rashid}, and the technique of utilizing a text prediction neural network to generate computer searchable text would continue to function as taught by Soldevila. In fact, the inclusion of Soldevila's technique of utilizing a text prediction neural network to generate computer searchable text would provide a practical and/or alternative implementation of the document enhancement method to “perform additional procedures” (Welinder: [0085]) and as a result would broaden the application of the document enhancement method due to the technique provided by Soldevila.
Therefore, it would have been obvious to combine Welinder with Rashid and Soldevila to obtain the invention as specified in claim 1. 
Regarding claim 2, Welinder {modified by Rashid and Soldevila} discloses the method recited in claim 1, wherein the document detection neural network comprises a convolutional neural network trained to classify digital images portraying one or more documents. (Welinder: Figs. 6-7, [0037, 0097-0098])
Regarding claim 3, Welinder {modified by Rashid and Soldevila} discloses the method recited in claim 1, further comprising processing the digital image to generate an enhanced digital image by: determining boundaries and corners of the document depicted in the digital image; and cropping the digital image utilizing the boundaries and corners of the document. (Welinder: Abstract, Figs. 3-4, [0030, 0057-0058])
Regarding claim 4, Welinder {modified by Rashid and Soldevila} discloses the method recited in claim 3, wherein the orientation neural network comprises a convolutional neural network and utilizing the orientation neural network to detect the orientation of the document within the digital image further comprises processing the enhanced digital image utilizing the convolutional neural network to classify the document from the digital image into an orientation category. (Rashid: “The output layer consists of 4 units corresponding to 4 orientation classes.” (See second paragraph in subsection II.B). Fig. 3 shows examples of 4 orientation classes.)
Regarding claim 6, Welinder {modified by Rashid and Soldevila} discloses the method recited in claim 4, wherein utilizing the text prediction neural network to generate the computer searchable text comprises: rotating the enhanced digital image according to the orientation category; extracting word images from the rotated, enhanced digital image; and processing the word images utilizing the text prediction neural network to generate the computer searchable text. (Welinder: Figs. 3-4, [0057, 0060, 0085]) (Rashid: subsection II.B and Fig. 3) (Soldevila: Abstract, Figs. 1 and 4, [0057, 0074]) (See the further discussion regarding claim 1.)
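The rotating step recited in claim 6 can be pictured with a minimal sketch, offered only as an aid to reading the limitation. It assumes, hypothetically, that the four orientation classes of Rashid correspond to clockwise rotations of 0, 90, 180 and 270 degrees and that an image is a row-major grid of pixel values; neither the class encoding nor the function names come from the references.

```python
# Assumption: category i means the document appears rotated clockwise
# by ORIENTATION_DEGREES[i]. This encoding is hypothetical.
ORIENTATION_DEGREES = (0, 90, 180, 270)


def rotate_ccw_90(image):
    """Rotate a row-major 2D grid 90 degrees counterclockwise."""
    # Transpose the grid, then reverse the order of the rows.
    return [list(row) for row in zip(*image)][::-1]


def correct_orientation(image, category):
    """Undo the clockwise rotation implied by the orientation category."""
    turns = ORIENTATION_DEGREES[category] // 90
    for _ in range(turns):
        image = rotate_ccw_90(image)
    return image
```

Under these assumptions, the output of correct_orientation would be the upright image from which word images are then extracted.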
Regarding claim 7, Welinder {modified by Rashid and Soldevila} discloses the method recited in claim 1, further comprising indexing the digital image by associating a token with the digital image, the token comprising the computer searchable text. (Soldevila: Figs. 1 and 4, [0056-0057, 0075]. The token is interpreted as the recognized text (such as ABC123 in Fig. 4) in a license plate transcription which provides indexing to the license plate image 210 in Fig. 4.) 
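The indexing limitation of claim 7, as interpreted above, can be sketched as a simple token-to-image lookup. The class name, the whitespace tokenization, the case folding and the identifier "image_210" are all hypothetical and are used only to mirror the ABC123 example of Soldevila Fig. 4; this is not Soldevila's implementation.

```python
from collections import defaultdict


class ImageIndex:
    """Associates tokens of computer searchable text with image identifiers."""

    def __init__(self):
        self._by_token = defaultdict(set)

    def add(self, image_id, searchable_text):
        # Assumption: tokens are whitespace-separated and case-insensitive.
        for token in searchable_text.lower().split():
            self._by_token[token].add(image_id)

    def search(self, query):
        # Return the identifiers of all images indexed under the query token.
        return sorted(self._by_token.get(query.lower(), set()))


index = ImageIndex()
index.add("image_210", "ABC123")  # hypothetical identifier for image 210
```

A later call such as index.search("abc123") would then retrieve the image by its recognized text.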
Claims 9-12 are the apparatus (Welinder: Fig. 12, [0119, 0136, 0141-0142, 0145-0147]) claims, respectively, corresponding to the method claims 1, 3-4 and 6. Therefore, since claims 9-12 are similar in scope to claims 1, 3-4 and 6, claims 9-12 are rejected on the same grounds as claims 1, 3-4 and 6.
Claims 15-19 are the computer readable storage medium (Welinder: Fig. 12, [0119, 0136, 0141-0142, 0145-0147]) claims, respectively, corresponding to the method claims 1-4 and 6. Therefore, since claims 15-19 are similar in scope to claims 1-4 and 6, claims 15-19 are rejected on the same grounds as claims 1-4 and 6.

Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Welinder {modified by Rashid and Soldevila} as applied to claims 4 and 9 discussed above, and further in view of Fu.
Regarding claim 5, which depends on claim 4, Welinder {modified by Rashid and Soldevila} discloses the text prediction neural network but does not disclose explicitly a neural network comprising a stack of convolutional layers, a stack of bidirectional long short term memory layers, and a connectionist temporal classification layer. However, Fu teaches, in an analogous art, a neural network comprising a stack of convolutional layers, a stack of bidirectional long short term memory layers, and a connectionist temporal classification layer. (Fu: [0152])
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Welinder {modified by Rashid and Soldevila}’s disclosure with Fu’s teachings by combining the document processing method (from Welinder {modified by Rashid and Soldevila}) with the technique of implementing a neural network with convolutional layers, bidirectional long short term memory layers, and a connectionist temporal classification layer (from Fu) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the document processing method would still work in the way according to Welinder {modified by Rashid and Soldevila} and the technique of implementing a neural network with convolutional layers, bidirectional long short term memory layers, and a connectionist temporal classification layer would continue to function as taught by Fu. In fact, the inclusion of Fu's technique of implementing a neural network with convolutional layers, bidirectional long short term memory layers, and a connectionist temporal classification layer would provide a practical and/or alternative implementation of the document processing method and as a result would enable a better and more flexible document processing method due to the alternative implementation provided by Fu.
Therefore, it would have been obvious to combine Welinder {modified by Rashid and Soldevila} with Fu to obtain the invention as specified in claim 5. 
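For context on the connectionist temporal classification layer recited in claim 5, the following minimal sketch shows one common way the per-frame output of such a layer is turned into text: greedy (best-path) decoding, which takes the best label for each frame, merges consecutive repeats, and drops the blank label. It assumes the convolutional and bidirectional long short term memory layers have already produced per-frame scores; the blank symbol, the alphabet and the function name are hypothetical, and the sketch is not taken from Fu.

```python
BLANK = "-"  # assumed symbol for the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet):
    """Pick the best label per frame, merge repeats, then drop blanks."""
    best = [
        alphabet[max(range(len(scores)), key=scores.__getitem__)]
        for scores in frame_scores
    ]
    decoded = []
    previous = None
    for label in best:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank; a blank between repeats keeps both copies.
        if label != previous and label != BLANK:
            decoded.append(label)
        previous = label
    return "".join(decoded)
```

In this scheme a blank frame between two identical labels is what allows doubled characters to survive decoding.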
Claim 13 is rejected on the same grounds as claim 5 discussed above.
Claims 8, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Welinder {modified by Rashid and Soldevila} as applied to claims 1, 9 and 15 discussed above, and further in view of Comay.
Regarding claim 8, which depends on claim 1, Welinder {modified by Rashid and Soldevila} does not disclose explicitly but Comay teaches, in an analogous art, utilizing the computer searchable text to identify a document category corresponding to the digital image comprising the depiction of the document; and providing the digital image to a client device associated with the document category. (Comay: [0053, 0056, 0077]. “[0053] … Image recognition operations may also include recognizing a type of document, and identifying particular data within the document. For example, a document recognition operation may include identifying a title of the document and classifying the document on the basis of the text content of the identified title. Identification of particular data may include identifying text on the basis of its position within the document, or on its proximity to a key word.” “[0077] … For example, a user may subscribe to a general document archive and access service. A general document archive and access module may analyze a received image of documents and extract data from the document. For example, a user device may transmit an image of a variety of documents to a processing unit that includes a general document archive and access module. The general document archive and access module may identify content in the document image. Documents may be categorized based on content found in the documents. The documents may be archived and later searched and retrieved base on text found in the documents.” The claimed “computer searchable text” is interpreted as the disclosed “text content of the identified title” or “text”. Searching and retrieving categorized documents from the archive implies providing the digital image to a client device associated with the document category.) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Welinder {modified by Rashid and Soldevila}’s disclosure with Comay’s teachings by combining the document processing method (from Welinder {modified by Rashid and Soldevila}) with the technique of using computer searchable text to categorize documents to be searched and retrieved (from Comay) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the document processing method would still work in the way according to Welinder {modified by Rashid and Soldevila} and the technique of using computer searchable text to categorize documents to be searched and retrieved would continue to function as taught by Comay. In fact, the inclusion of Comay's technique of using computer searchable text to categorize documents to be searched and retrieved would provide a practical and/or alternative implementation of the document processing method and as a result would enable a better and more flexible document processing method due to the alternative implementation provided by Comay.
Therefore, it would have been obvious to combine Welinder {modified by Rashid and Soldevila} with Comay to obtain the invention as specified in claim 8. 
Claims 14 and 20 are rejected on the same grounds as claim 8 discussed above.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 C.F.R. § 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 C.F.R. § 3.73(b).
Claims 1 and 15 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 2 and 12 of U.S. Patent No. US 10783400 B2.  Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims of the instant application are anticipated by the claims of the ‘400 patent.
With respect to claim 1 of the instant application, claim 2, which depends on claim 1, of the ‘400 patent stipulates a method, comprising: processing a plurality of digital images utilizing a document detection neural network to identify a digital image comprising a depiction of a document; (col. 40, lines 48-49 and 60-65) utilizing an orientation neural network to detect an orientation of the document within the digital image; (col. 40, lines 50-51) and utilizing a text prediction neural network to generate computer searchable text for the depiction of the document based on the detected orientation of the document, (col. 40, lines 55-59) so that the invention defined by claim 1 of the instant application is fully anticipated by claim 2 of the '400 patent.  
Similarly, claim 15 of the instant application is anticipated by claim 12 of the ‘400 patent.

Claim 9 is rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claim 17 of the ‘400 patent in view of Welinder. 
Regarding claim 9, claim 17 of the ‘400 patent discloses all elements recited in claim 9 of the instant application except utilizing a document detection neural network to identify the digital image, which is taught by Welinder as discussed in the prior art rejection of claim 1. (Welinder: [0037, 0052].) 
One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify the system of claim 17 of the ‘400 patent with the technique from Welinder, since Welinder’s technique provides a practical and/or alternative implementation of the system recited in claim 17 of the ‘400 patent and the combination would therefore make that system better and more flexible. Furthermore, such a combination would not change the function of either the system of claim 17 of the ‘400 patent or Welinder’s technique and would yield no more than predictable use of elements from both the system and the technique according to their established functions. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine claim 17 of the ‘400 patent with Welinder to obtain the invention as specified in claim 9.
The dependent claims 2-8, 10-14 and 16-20 are rejected as being obvious over the claims 1, 11 and 17 of the ‘400 patent in view of the art of record relied upon in the rejections above, as applied to the claims above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Li (US 20170372169 A1): The present disclosure provides a method performed at a computing device for recognizing an image's content. The method includes: extracting one or more features from an image to be recognized; comparing the features of the image with a set of image classifiers and obtaining a probability value for each image classifier; selecting, from the set of image classifiers, at least one image classifier as a target image classifier according to the probability value of the image classifier; determining a degree of similarity between each image and the image to be recognized, and selecting, from the target image classifier, multiple images as target images when their respective degrees of similarity with the image to be recognized exceed a predefined threshold; and labeling, the image to be recognized according to a class label corresponding to the target image classifier and entity labels corresponding to the target images in the target image classifier. (Abstract)

Turkelson et al. (US 20090067729 A1): An automatic document classification system is described that uses lexical and physical features to assign a class c.sub.i.epsilon.C{c.sub.1, c.sub.2, . . . , c.sub.i} to a document d. The primary lexical features are the result of a feature selection method known as Orthogonal Centroid Feature Selection (OCFS). Additional information may be gathered on character type frequencies (digits, letters, and symbols) within d. Physical information is assembled through image analysis to yield physical attributes such as document dimensionality, text alignment, and color distribution. The resulting lexical and physical information is combined into an input vector X and is used to train a supervised neural network to perform the classification. (Abstract)
Braun et al. (US 20180336466 A1): A sensor transformation attention network (STAN) model including sensors configured to collect input signals, attention modules configured to calculate attention scores of feature vectors corresponding to the input signals, a merge module configured to calculate attention values of the attention scores, and generate a merged transformation vector based on the attention values and the feature vectors, and a task-specific module configured to classify the merged transformation vector is provided. (Abstract)
“[0013] The attention modules may include any at least one of a fully-connected neural network (FCNN), a convolutional neural network (CNN), and a recurrent neural network (RNN).”
“[0017] The task-specific module may include two layers of bidirectional GRUs and a long short term memory (LSTM).”
“[0111] Connected digit sequences may consider a sequence-to-sequence mapping task. In order to automatically learn alignments between speech frames and to label sequences, a connectionist temporal classification (CTC) objective may be adopted. All models may be trained with an ADAM optimizer for a maximum of 100 epochs, with early stopping to prevent overfitting.”
Comay et al. (US 20110052075 A1): “[0056] A household management module in accordance with embodiments of the present invention, may maintain a database of products and prices at various stores. For example, information from a received receipt image may be added to the database as the information is acquired. A received receipt image may also be used to query the database. For example, a query to the database based on information from the receipt image may be used to compare prices. For example, the household management module may send to a user a listing or sum of what the same products would have cost if purchased at another store. The query may be limited to a particular geographical area. For example, the query may be limited to stores in a limited geographical area near the store that issued the receipt. Such a query may help the user select a store for future purchases. On the other hand, the query may include stores in a wide geographical region so as to enable regional comparisons of prices. The results of the query, such as a price comparison, may be sent to the user or a user device in the form of a message.”
Simske et al. (US 20060218110 A1): “[0019] Once in a digital format, document 12 is applied to OCR engine 18, if necessary, to convert any text in document 12 that is represented in image format into recognizable text. After any image data in the document is converted to searchable text, document 12 is applied to classifier engine 20, which predicts an appropriate classification for document 12. Thus, an association is drawn between document 12 (to be subsequently indexed) and one of the existing classifications. Further, classifier engine 20 may generate a list of classifications that is ordered according to the likelihood that the new document appropriately falls within each classification. For example, the more likely document 12 is properly classified in a given classification, the higher the priority assigned to the classification in the list. Initially, document 12 is classified as belonging to the highest priority classification on the list. As known by a person skilled in the art, classifier engine 20 may employ winnowing algorithms, predefined rules (e.g., assigning all documents entered by a billing clerk to one particular classification), and other techniques to predict an appropriate classification for the new document 12.”

Lefebvre (US 10126825 B2): “(23) When, in step 209, a correspondence is detected between the FVS sequence obtained in step 207 and an element from the knowledge base containing the examples of input end FVS sequences, the CFVS list contains the FVS sequences corresponding to a word input by the user. This list is then used in step 211 to perform the recognition of the input word. According to one particular embodiment, the terminal uses a BLSTM (for Bidirectional Long Short-Term Memory in English) neural network with a connectionist temporal classification. A neural network of this type is described in the document entitled “Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks”, Alex Graves and Santiago Fernandez and Faustino Gomez, ICML 2006.” (col. 8, lines 10-23)
Cullen et al. (US 5995978 A): An interactive database organization and searching system employs text search and image feature extraction to automatically group documents together by appearance. The system automatically determines visual characteristics of document images and collect documents together according to the relative similarity of their document images. (Abstract)
Wang et al. (US 10891476 B2): A method, system, and neural network for identifying direction of a document where the method comprises: extracting a text line in the document; calculating a first normal direction result indicative of the text line probably being in a normal direction and a first upside-down direction result indicative of the text line probably being in a direction upside-down with respect to the normal direction; calculating a second normal direction result indicative of the text line after being rotated by 180 degrees probably being in the normal direction and a second upside-down direction result indicative of the text line after being rotated by 180 degrees probably being in the direction upside-down with respect to the normal direction; and determining the direction of the document according to the first normal direction result, the first upside-down direction result, the second normal direction result and the second upside-down direction result.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG NIU whose telephone number is (571)272-9592.  The examiner can normally be reached on Monday - Friday, 8am-5pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached on (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FENG NIU/Primary Examiner, Art Unit 2669