Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 2/10/2021 has been entered.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 7-24, 26-43, 45-57, and 74-76 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication 20170098138 (hereinafter referred to as Wang) in view of US Patent 10032072 (hereinafter referred to as Tran).
Consider claims 1, 39, Wang discloses a computing device (see at least ¶ [0032], Fig. 1, “…a computing device 102…”) implemented method comprising: 
receiving an image that includes textual content in at least one font (see at least ¶ [0024], “…text localization techniques are described in which a digital medium environment is configured to localize text in an image for an arbitrary font…” and see at least ¶ [0056], Fig. 3, “…The collection includes a plurality of training images having text rendered using a corresponding font (block 302). A model is trained by the machine learning module 212 to predict a bounding box for text in an image. The model is trained using machine learning as applied to the plurality of training images having text rendered using the corresponding font (block 304) …”); and 
identifying the at least one font represented in the received image using a machine learning system (see at least ¶ [0025], “…text localization techniques described herein train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention…” and further see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Wang discloses the subject matter of the claimed invention as set forth above. However, Wang does not particularly disclose wherein a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery.  In an analogous field of endeavor, attention is directed to Tran, which teaches wherein a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery (see Tran, at least Col. 7, Lines 56-67, and Col. 8, Lines 1-26, “…generating synthetic text data can include, for example, first, generating a background layer, where background patches of random sizes can be drawn randomly from a database of images which includes clean simple patterns, smooth transition images, regular textures, natural texture and images. Images from a training set of text can be used or other training data. The background layer can be used as-is or undergo several iterations of blending with each other to get more a diversified mix. These patches are then resized to match the target output size. Second, the foreground text can be generated in the following manner. First a font can be randomly selected from 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Wang so that a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery, as taught by Tran, thereby providing for identifying text represented in image data as well as determining a location or region of the image data that includes the text represented in the image data, as discussed by Tran (see Col. 1, Lines 61-67).
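For illustration only, the kind of training-image generation described in the Tran passage quoted above (a randomly drawn background patch with synthetic foreground text placed over it) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Tran, Wang, or the present application: the glyph bitmaps, the function names (render_text, random_patch, composite), and the pixel values are hypothetical, and a grayscale image is assumed to be a list of rows of integers.

```python
import random

# Hypothetical 3x3 glyph bitmaps standing in for glyphs rendered from a font.
GLYPHS = {
    "A": ["010", "111", "101"],
    "B": ["110", "111", "110"],
}


def render_text(text):
    """Render text as a binary mask by placing glyphs side by side."""
    rows = ["", "", ""]
    for ch in text:
        glyph = GLYPHS[ch]
        for i in range(3):
            rows[i] += glyph[i] + "0"  # one blank column after each glyph
    return [[int(c) for c in row] for row in rows]


def random_patch(image, height, width, rng):
    """Crop a random height x width patch from a captured background image."""
    top = rng.randrange(len(image) - height + 1)
    left = rng.randrange(len(image[0]) - width + 1)
    return [row[left:left + width] for row in image[top:top + height]]


def composite(background, mask, text_value, rng):
    """Overlay the text mask at a random position over the background."""
    out = [row[:] for row in background]
    top = rng.randrange(len(out) - len(mask) + 1)
    left = rng.randrange(len(out[0]) - len(mask[0]) + 1)
    for r, mask_row in enumerate(mask):
        for c, on in enumerate(mask_row):
            if on:
                out[top + r][left + c] = text_value
    return out, (top, left)


rng = random.Random(0)
# Stand-in for captured background imagery; all values are below 200.
captured = [[(r * 7 + c * 3) % 200 for c in range(32)] for r in range(16)]
patch = random_patch(captured, 8, 16, rng)
mask = render_text("AB")
sample, (top, left) = composite(patch, mask, 255, rng)
```

Here the foreground text is written with value 255 so that it remains distinguishable from the background patch, and the returned (top, left) offset records where the text was placed.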

Consider claim 20, Wang discloses a computing device (see at least ¶ [0032], Fig. 1, “…a computing device 102…”) comprising: 
a memory configured to store instructions (see at least ¶ [0034], Fig. 1, “…The processing system 106 is representative of functionality to perform operations through execution of instructions stored in the memory 108…”); and 
a processor to execute the instructions to perform operations (see at least ¶ [0121], “…processor-executable instructions may be electronically-executable instructions…”) comprising: 
receiving an image that includes textual content in at least one font (see at least ¶ [0024], “…text localization techniques are described in which a digital medium environment is configured to localize text in an image for an arbitrary font…” and see at least ¶ [0056], Fig. 3, “…The collection includes a plurality of training images having text rendered using a corresponding font (block 302). A model is trained by the machine learning module 212 to predict a bounding box for text in an image. The model is trained using machine learning as applied to the plurality of training images having text rendered using the corresponding font (block 304) …”); and 
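identifying the at least one font represented in the received image using a machine learning system (see at least ¶ [0025], “…text localization techniques described herein train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention…” and further see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).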
Wang discloses the subject matter of the claimed invention as set forth above. However, Wang does not particularly disclose wherein a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery.  In an analogous field of endeavor, attention is directed to Tran, which teaches wherein a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery (see Tran, at least Col. 7, Lines 56-67, and Col. 8, Lines 1-26, “…generating synthetic text data can include, for example, first, generating a background layer, where background patches of random sizes can be drawn randomly from a database of images which includes clean simple patterns, smooth transition images, regular textures, natural texture and images. Images from a training set of text can be used or other training data. The background layer can be used as-is or undergo several iterations of blending with each other to get more a diversified mix. These patches are then resized to match the target output size. Second, the foreground 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Wang so that a portion of the training images include synthetic text located in the foreground and being positioned over captured background imagery, as taught by Tran, thereby providing for identifying text represented in image data as well as determining a location or region of the image data that includes the text represented in the image data, as discussed by Tran (see Col. 1, Lines 61-67).
Consider claims 2, 21, 40, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches the text located in the foreground is synthetically augmented (see at least ¶ [0053], “…a training set generation module 208 is utilized to generate a training image and font collection 210 that includes training images that are to serve as a basis for training the text localization model 206. The training image and font collection 210 …” and see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).
Consider claims 3, 22, 41, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches synthetic augmentation is provided in a two-step process (see at least ¶ [0098], “…the machine learning module 1108 employs a neural network 110 have at least two machine learning subnets 1112, 1114 configured as a Siamese network to learn the weight function and compare fonts. The two machine learning subnets 1112, 1114 of the Siamese network are identical in this example. The end of each machines learning subnet 1112, 1114 includes a weight prediction layer that is used to predict a scalar values …”).

Consider claims 4, 23, 42, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches the text is synthetically augmented based upon one or more predefined conditions (see at least ¶ [0098], “…the machine learning module 1108 employs a neural network 110 have at least two machine learning subnets 1112, 1114 configured as a Siamese network to learn the weight function and compare fonts…, …The end of each machines learning subnet 1112, 1114 includes a weight prediction layer that is used to predict a scalar values. An additional layer illustrated as the classifier 1116 positioned "on top" of the two identical machine learning subnets 1112, 1114 takes the two scalars and forms a binary classifier…”).
Consider claims 5, 24, 43, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches the text located in the foreground is undistorted (see at least ¶ [0053], “…a training set generation module 208 is utilized to generate a training image and font collection 210 that includes training images that are to serve as a basis for training the text localization model 206. The training image and font collection 210 …” and see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).
Consider claims 7, 26, 45, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:

Consider claims 8, 27, 46, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches the text located in the foreground is randomly positioned in the portion of training images (see at least ¶ [0054], “…in order to make the training set more diversified and more robust to noises, random perturbations may be added by the training set generation module 208 during and/or after rendering of text…”).
Consider claims 9, 28, 47, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches prior to the text being located in the foreground, a portion of the text is removed (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).
Consider claims 10, 29, 48, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches the captured background imagery is distorted when captured (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).

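Consider claims 11, 30, 49, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches font similarity is used to identify the at least one font (see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).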
Consider claims 12, 31, 50, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches similarity of fonts in multiple image segments is used to identify the at least one font (see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Consider claims 13, 32, 51, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
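Wang teaches the machine learning system is trained by using transfer learning (see at least ¶ [0087], “…a deep convolutional neural network is trained to recognize and find similar fonts. …, the font attribute system 126 employs metadata 816 in order to improve accuracy and efficiency of these techniques. For instance, within a font family, different fonts 130 have different weights; fonts may have a notation of relative and italic; may come as pairs; may have classification information such as Serif, San Serif; and so forth. Accordingly, these attributes may also be leveraged to recognize a font having the described attributes as well as locate similar fonts having similar attributes as defined by associated…”).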
Consider claims 14, 33, 52, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches an output of the machine learning system represents each font used to train the machine learning system (see at least ¶ [0087],  “…a deep convolutional neural network is trained to recognize and find similar fonts. …, the font attribute system 126 employs metadata 816 in order to improve accuracy and efficiency of these techniques. For instance, within a font family, different fonts 130 have different weights; fonts may have a notation of relative and italic; may come as pairs; may have classification information such as Serif, San Serif; and so forth. Accordingly, these attributes may also be leveraged to recognize a font having the described attributes as well as locate similar fonts having similar attributes as defined by associated…”).
Consider claims 15, 34, 53, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
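Wang teaches the output of the machine learning system provides a level of confidence for each font used to train the machine learning system (see at least ¶ [0087], “…a deep convolutional neural network is trained to recognize and find similar fonts. …, the font attribute system 126 employs metadata 816 in order to improve accuracy and efficiency of these techniques. For instance, within a font family, different fonts 130 have different weights; fonts may have a notation of relative and italic; may come as pairs; may have classification information such as Serif, San Serif; and so forth. Accordingly, these attributes may also be leveraged to recognize a font having the described attributes as well as locate similar fonts having similar attributes as defined by associated…”).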
Consider claims 16, 35, 54, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches a subset of the output of the machine learning system is scaled and a remainder of the output is removed (see at least ¶ [0089],  “…The two machine learning subnets 1112, 1114 of the Siamese network are identical in this example. The end of each machines learning subnet 1112, 1114 includes a weight prediction layer that is used to predict a scalar values. An additional layer illustrated as the classifier 1116 positioned "on top" of the two identical machine learning subnets 1112, 1114 takes the two scalars and forms a binary classifier…”).
Consider claims 17, 36, 55, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches some of the training images are absent identification (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).

Consider claims 18, 37, 56, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches identifying the at least one font represented in the received image using the machine learning system includes using additional images received by the machine learning system (see at least ¶ [0025], “…text localization techniques described herein train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention…” and further see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Consider claims 19, 38, 57, (depends on at least claims 1, 20, 39), Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches outputs of the machine learning system for the received image and the additional images are combined to identify the at least one font (see at least ¶ [0110], “…Each kind of image pair is uniformly sampled during training. If there are font families with only a single font, those font images may be combined to form a fifth type of image pairs with "z=4." In this way, the Siamese neural network 1110 may still be optimized with the softmax loss…”).

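Consider claims 74, 75, 76, Wang in view of Tran discloses the limitations of claims 1, 20, 39 as applied to claim rejection 1, 20, 39 above and further discloses:
Wang teaches a portion of the training images is distorted when captured (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).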

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUONG A NGO whose telephone number is (571)270-7264.  The examiner can normally be reached on Monday-Thursday from 5:30AM-4:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Anthony S Addy, can be reached at (571) 272-7795.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to 






/CHUONG A NGO/            Primary Examiner, Art Unit 2645