DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2 and 12-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “specified dynamic denominator” in claims 2 and 12 renders the claims indefinite because it is not clear what applicant intends to cover. The scope and bounds of this limitation are not clear. The term “specified dynamic denominator” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes, the term has been interpreted to be filter convolution.
Claims 13-20 are also rejected under 35 U.S.C. 112(b) as being dependent upon a rejected base claim.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1).
	-Regarding claim 1, Zhou discloses a computer implemented method of performing single pass optical character recognition (OCR) (Abstract: “text detection … predict words or text lines”; Figures 1-6 (in Figure 2(e), an input image is processed once through a fully convolutional neural network to determine the words within the image)) including at least one fully convolutional neural network (FCN) engine (Page 5551, 2nd Col., 3rd paragraph, “utilizes a fully convolutional network (FCN)”; Figure 2(e); Figure 3; Figures 5-6); extracting image features from the input image using a plurality of convolutional layers included in the FCN engine (Figures 1-2), wherein image features of the input image are extracted at each convolutional layer in the FCN engine (Figure 3, feature extractor stem (PVANet)); aggregating the image features extracted at each convolutional layer of the FCN engine (Figure 3, feature-merging branch); determining at least one optical character recognition (OCR) feature from the aggregated extracted image features (Figure 3, output layer); building word boxes using the determined at least one optical character recognition feature (Figure 3, text box, RBOX, QUAD; Figure 4); determining each character within each word box based on character predictions (Abstract; Figure 2(e); Page 5551, 2nd Col., 3rd paragraph; Page 5553, 1st Col., 1st paragraph, Section 3.1, 3rd paragraph; Page 5558, Section 5; and Figures 5-6, 2).
	Zhou does not teach including at least one processor and at least one memory, the at least one memory including instructions that, when executed by the at least one processor, cause the processor to perform the above steps. Zhou also does not teach pre-processing the input image or transmitting the recognition result for display.
	In the same field of endeavor, Flament teaches a system and method for image recognition and optical character recognition (OCR) associated with a fully convolutional neural network (Flament: Abstract; FIGS 1-9). Flament further teaches at least one fully convolutional neural network (FCN) engine (Flament: FIG. 1, neural network 124; Abstract, “fully convolutional neural network”; [0043]; FIG. 6) including at least one processor and at least one memory (Flament: FIG. 1, processor 122, memory 126; FIG. 2), the at least one memory including instructions that, when executed by the at least one processor, cause the processor to perform character recognition steps (Flament: Abstract; [0074]-[0075]; FIGS 1-4). Although Zhou does not teach pre-processing the input image, pre-processing is a basic step for any image processing such as image recognition or OCR. Flament also teaches pre-processing the input image (Flament: FIG. 4, step 404; [0036]; [0043]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Zhou with the teaching of Flament by using a similar system implementation in order to implement a feasible text detector.
Zhou in view of Flament does not teach transmitting the recognition result for display.
However, Odate is an analogous art pertinent to the problem to be solved in this application and discloses a system and method for transmitting an image to a cloud server. A recognition result integration unit receives a recognition result of the character recognition processing for the image from the cloud server. Odate further discloses transmitting the recognition result for display (Odate: FIG. 1, Character recognition unit 104, Recognition result integration unit 106; [0089], “display”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament with the teaching of Odate by using a recognition result integration unit and system implementation method for transmitting for display each word box including its corresponding determined characters in order to improve recognition efficiency and user experience.
-Regarding claim 6, the modification further discloses wherein the step of determining at least one OCR feature from the aggregated extracted image features includes calculating character predictions (Zhou: Abstract, “predicts words”; Page 5552, 1st Col., 1st paragraph).
-Regarding claim 7, the modification further discloses wherein the step of building word boxes includes determining boundaries between words (Zhou: Page 5554, Section 3.3.2, “boundaries of text box”; Page 5556, 1st Col., Section 4.1, 2nd paragraph, “word region … bounding box”).
Claims 2-3, 12, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Such et al (U.S PG-PUB NO. 20180137350 A1).
-Regarding claim 2, Zhou in view of Flament, and further in view of Odate discloses the method of claim 1.
Zhou in view of Flament, and further in view of Odate does not teach preprocessing the input image by padding the image with zero values to a specified dynamic denominator.
However, Such is an analogous art pertinent to the problem to be solved in this application and discloses a system and method of character recognition using fully convolutional neural networks with attention (Such: Abstract; FIGS. 1-22). Such further teaches preprocessing the input image by padding the image with zero values to a specified dynamic denominator (Such: [0081], “preprocessing”; [0097]; [0099], “padding with 0”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Such by padding the image with zeros in order to facilitate performing convolution with convolution filters.
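For illustration only, the stated rationale (zero padding makes convolution with a filter convenient) can be sketched as follows. This is a minimal sketch under stated assumptions: a square k x k filter with odd k and stride 1, and the helper names pad_zeros and filter_valid are hypothetical and are not taken from Such, Zhou, or the application. The sketch does not construe the term “specified dynamic denominator”.

```python
def pad_zeros(image, k):
    """Surround a 2-D image (list of rows) with k // 2 rings of zeros.

    Assumption: k is the odd side length of a square filter, stride 1.
    """
    p = k // 2
    width = len(image[0]) + 2 * p
    padded = [[0.0] * width for _ in range(p)]            # top zero rows
    for row in image:
        padded.append([0.0] * p + [float(v) for v in row] + [0.0] * p)
    padded.extend([0.0] * width for _ in range(p))        # bottom zero rows
    return padded


def filter_valid(image, kernel):
    """Slide the kernel over every position where it fits entirely
    inside the image (cross-correlation, as used in convolutional layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]
kernel = [[1.0] * 3 for _ in range(3)]    # 3 x 3 summing filter

padded = pad_zeros(image, 3)              # 5 x 6 after padding
filtered = filter_valid(padded, kernel)   # 3 x 4, same size as the input
```

Without the padding, the same 3 x 3 filter would fit at only 1 x 2 positions of the 3 x 4 input; with the zero border, the filtered output has the same height and width as the input.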
-Regarding claim 3, Zhou in view of Flament, and further in view of Odate discloses the method of claim 1.
Zhou in view of Flament, and further in view of Odate does not teach wherein the step of extracting image features from the input image includes filtering the input image at each convolutional layer in the FCN engine thereby extracting features from the image at each convolutional layer.
However, Such is an analogous art pertinent to the problem to be solved in this application and discloses a system and method of character recognition using fully convolutional neural networks with attention (Such: Abstract; FIGS. 1-22). Such further teaches wherein the step of extracting image features from the input image includes filtering the input image at each convolutional layer in the FCN engine thereby extracting features from the image at each convolutional layer (Such: [0081]; [0097]; [0099]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Such by filtering the input image at each convolutional layer in the FCN engine, thereby extracting features from the image at each convolutional layer, in order to help extract specific features from the input image.
-Regarding claim 12, Zhou discloses a system for performing single pass optical character recognition (OCR), the system comprising (Abstract: “text detection … predict words or text lines”; Figures 1-6 (in Figure 2(e), an input image is processed once through a fully convolutional neural network to determine the words within the image)): Figures 5-6); Figures 1-2), wherein image features of the input image are extracted at each convolutional layer of the FCN engine (Figure 3, feature extractor stem (PVANet)); aggregate the image features extracted at each convolutional layer of the FCN engine (Figure 3, feature-merging branch); determine at least one optical character recognition (OCR) feature from the aggregated extracted image features (Figure 3, output layer); build word boxes using the determined at least one optical character recognition feature (Figure 3, text box, RBOX, QUAD; Figure 4); determine each character within each word box based on character predictions (Abstract; Figure 2(e); Page 5551, 2nd Col., 3rd paragraph; Page 5553, 1st Col., 1st paragraph, Section 3.1, 3rd paragraph; Page 5558, Section 5; and Figures 5-6, 2).
Zhou does not teach a network interface configured to receive an input image, and a memory configured to store electronic program guide data and computer executable instructions; an FCN engine including at least one processor configured to execute the computer executable instructions to: preprocess the input image by padding the image with zero values to a specified dynamic denominator. Zhou also does not teach transmitting the recognition result for display. However, pre-processing is a basic step for any image processing such as image recognition or OCR.
	In the same field of endeavor, Flament teaches a network interface configured to receive an input image (Flament: FIG.1, input/output 101; FIG. 2, I/O 228, 258, network 214), and a memory configured to store electronic program guide data and computer executable instructions (FIG. 1, memory 126; FIG. 2; [0025]; [0032]; [0079]); an FCN engine (Flament: FIG. 1, neural network 124; Abstract, “fully convolutional neural network”; [0043]; FIG. 6) including at least one processor configured to execute the computer executable instructions to (Flament: Abstract; [0074]-[0075]; FIGS 1-4; [0025]): preprocess the input image (Flament: FIG. 4, step 404; [0036]; [0043]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Zhou with the teaching of Flament by using a similar system implementation in order to implement a feasible text detector.
Zhou in view of Flament does not teach transmitting the recognition result for display.
However, Odate is an analogous art pertinent to the problem to be solved in this application and discloses a system and method for transmitting an image to a cloud server. A recognition result integration unit receives a recognition result of the character recognition processing for the image from the cloud server. Odate further discloses transmitting the recognition result for display (Odate: FIG. 1, Character recognition unit 104, Recognition result integration unit 106; [0089], “display”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament with the teaching of Odate by using a recognition result integration unit and system implementation method for transmitting for display each word box including its corresponding determined characters in order to improve recognition efficiency and user experience.
Zhou in view of Flament, and further in view of Odate does not teach preprocessing the input image by padding the image with zero values to a specified dynamic denominator.
However, Such is an analogous art pertinent to the problem to be solved in this application and discloses a system and method of character recognition using fully convolutional neural networks with attention (Such: Abstract; FIGS. 1-22). Such further teaches preprocessing the input image by padding the image with zero values to a specified dynamic denominator (Such: [0081], “preprocessing”; [0097]; [0099], “padding with 0”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Such by padding the image with zeros in order to facilitate performing convolution with convolution filters.
-Regarding claim 15, Zhou in view of Flament, and further in view of Odate, in view of Such discloses the system of claim 12. The modification further discloses wherein the determined at least one OCR feature is calculated character predictions (Zhou: Abstract, “predicts words”; Page 5552, 1st Col., 1st paragraph).
-Regarding claim 16, Zhou in view of Flament, and further in view of Odate, in view of Such discloses the system of claim 12. The modification further discloses wherein building word boxes includes determining boundaries between words (Zhou: Page 5554, Section 3.3.2, “boundaries of text box”; Page 5556, 1st Col., Section 4.1, 2nd paragraph, “word region … bounding box”).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Cao et al (CN 110032998 A).
-Regarding claim 4, Zhou in view of Flament, and further in view of Odate discloses the method of claim 1.
Zhou in view of Flament, and further in view of Odate does not teach wherein the step of determining at least one OCR feature from the aggregated extracted image features includes calculating a wordiness score.
However, Cao is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of determining at least one OCR feature from the aggregated extracted image features includes calculating a wordiness score (Cao: Page 3, 2nd paragraph; Page 7, 2nd paragraph, “each pixel point represents the score on the map of the corresponding pixel point on the picture belongs to the word probability”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Cao by calculating a wordiness score when determining the at least one OCR feature from the aggregated extracted image features, in order to improve the precision rate of word detection.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Reisswig et al (U.S PATENT NO. 10540579 B2).
-Regarding claim 8, Zhou in view of Flament, and further in view of Odate discloses the method of claim 1.
Zhou in view of Flament, and further in view of Odate does not teach wherein the step of building word boxes includes determining centers of words.
However, Reisswig is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of building word boxes includes determining centers of words (Reisswig: FIG. 4; Col. 5, lines 44-56, “center location”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Reisswig by determining centers of words in order to improve the precision rate of word detection.
Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Hoehne et al (U.S PG-PUB NO. 20200082218 A1).
-Regarding claim 9, Zhou in view of Flament, and further in view of Odate discloses the method of claim 1.
Zhou in view of Flament, and further in view of Odate does not teach wherein the step of determining each character within each word box based on character predictions includes determining character boxes for each character in the word box by delineating each character in the word box by character gaps.
However, Hoehne is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of determining each character within each word box based on character predictions includes determining character boxes for each character in the word box by delineating each character in the word box by character gaps (Hoehne: FIGS. 1, 2H; [0032], “bounding boxes may represent delineations between groups of characters”; [0050], “bounding boxes … identify words … space and/or gaps between groups of letters”; [0055]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Hoehne by delineating each character in the word box by character gaps in order to improve the precision rate of word detection.
-Regarding claim 10, Zhou in view of Flament, and further in view of Odate, in view of Hoehne discloses the method of claim 9.
Zhou in view of Flament, and further in view of Odate does not teach wherein a number of character predictions are made for each character box and the number of character predictions are aggregated.
However, Hoehne is an analogous art pertinent to the problem to be solved in this application and discloses wherein a number of character predictions are made for each character box and the number of character predictions are aggregated (Hoehne: FIGS. 1, 3, step 320; [0073], “combine … bounding box to generate … recognized test”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate with the teaching of Hoehne by making a number of character predictions for each character box and aggregating the character predictions in order to improve the precision rate of word detection.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Such et al (U.S PG-PUB NO. 20180137350 A1), in view of Cao et al (CN 110032998 A).
-Regarding claim 13, Zhou in view of Flament, and further in view of Odate, in view of Such discloses the system of claim 12.
Zhou in view of Flament, and further in view of Odate, in view of Such does not teach wherein the step of determining at least one OCR feature from the aggregated extracted image features includes calculating a wordiness score.
However, Cao is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of determining at least one OCR feature from the aggregated extracted image features includes calculating a wordiness score (Cao: Page 3, 2nd paragraph; Page 7, 2nd paragraph, “each pixel point represents the score on the map of the corresponding pixel point on the picture belongs to the word probability”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate, in view of Such with the teaching of Cao by calculating a wordiness score when determining the at least one OCR feature from the aggregated extracted image features, in order to improve the precision rate of word detection.
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Such et al (U.S PG-PUB NO. 20180137350 A1), in view of Reisswig et al (U.S PATENT NO. 10540579 B2).
-Regarding claim 17, Zhou in view of Flament, and further in view of Odate, in view of Such discloses the system of claim 12.
Zhou in view of Flament, and further in view of Odate, in view of Such does not teach wherein the step of building word boxes includes determining centers of words.
However, Reisswig is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of building word boxes includes determining centers of words (Reisswig: FIG. 4; Col. 5, lines 44-56, “center location”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate, in view of Such with the teaching of Reisswig by determining centers of words in order to improve the precision rate of word detection.
Claims 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al (2017 CVPR, pp. 5551-5560) in view of Flament et al (U.S PG-PUB NO. 20190019020 A1), and further in view of Odate et al (U.S PG-PUB NO. 20200387733 A1), in view of Such et al (U.S PG-PUB NO. 20180137350 A1), in view of Hoehne et al (U.S PG-PUB NO. 20200082218 A1).
-Regarding claim 18, Zhou in view of Flament, and further in view of Odate, in view of Such does not teach wherein the step of determining each character within each word box based on character predictions includes determining character boxes for each character in the word box by delineating each character in the word box by character gaps.
However, Hoehne is an analogous art pertinent to the problem to be solved in this application and discloses wherein the step of determining each character within each word box based on character predictions includes determining character boxes for each character in the word box by delineating each character in the word box by character gaps (Hoehne: FIGS. 1, 2H; [0032], “bounding boxes may represent delineations between groups of characters”; [0050], “bounding boxes … identify words … space and/or gaps between groups of letters”; [0055]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate, in view of Such with the teaching of Hoehne by delineating each character in the word box by character gaps in order to improve the precision rate of word detection.
-Regarding claim 19, Zhou in view of Flament, and further in view of Odate, in view of Such does not teach wherein a number of character predictions are made for each character box and the number of character predictions are aggregated.
However, Hoehne is an analogous art pertinent to the problem to be solved in this application and discloses wherein a number of character predictions are made for each character box and the number of character predictions are aggregated (Hoehne: FIGS. 1, 3, step 320; [0073], “combine … bounding box to generate … recognized test”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Zhou in view of Flament, and further in view of Odate, in view of Such with the teaching of Hoehne by making a number of character predictions for each character box and aggregating the character predictions in order to improve the precision rate of word detection.
Allowable Subject Matter
Claims 5, 11, 14, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if amended to overcome the 35 U.S.C. 112(b) rejections set forth in the section “Claim Rejections - 35 USC § 112” above.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAO LIU whose telephone number is (571)272-4539. The examiner can normally be reached Monday-Thursday and Alternate Fridays 8:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571) 272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XIAO LIU/ Examiner, Art Unit 2664
/NANCY BITAR/ Primary Examiner, Art Unit 2664