Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action follows the communication mailed on 05/25/2021 and is in response to applicant's amendment filed on 11/24/2021.
Claims 1-5, 7-24, 26-43, and 45-57 are pending. Claims 1, 20, and 39 have been amended, and claims 6, 25, 44, and 58-76 have been canceled.

Response to Arguments
Applicant's arguments with respect to claims 1-5, 7-24, 26-43, 45-57 filed on 11/24/2021 have been considered but are moot in view of the new grounds of rejection. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 7-24, 26-43, 45-57 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication 20170098138 (hereinafter referred to as Wang) in view of US Patent Application Publication 20170011279 (hereinafter referred to as Soldevila).
Consider claims 1 and 39. Wang discloses a computing device (see at least ¶ [0032], Fig. 1, “…a computing device 102…”) implemented method comprising:
receiving an image that includes textual content in at least one font (see at least ¶ [0036], “…The image 114 can include a variety of different objects, such as text, shapes or other visual objects, spreadsheets, as a document, a multimedia content, slide presentation, and so on…” and see at least ¶ [0037], “…edit images is illustrated as a font recognition and similarity system 120. This system 120 is representative of functionality to perform text localization, find similar fonts, and employ font attributes for font recognition and similarity, examples of which are represented by the text localization system 122, font similarity system 124, and font attribute system 126…” and further see at least ¶ [0040], “…These attributes are learnable as part of a machine learning process to improve accuracy and efficiency of font recognition and similarity determinations, further discussion of which is included in the following in relation to FIGS. 8-14.…”); and 

Wang discloses the subject matter of the claimed inventive concept as discussed above. However, Wang does not particularly disclose that a portion of the training images is distorted when captured. In an analogous field of endeavor, attention is directed to Soldevila, which teaches that a portion of the training images is distorted when captured (see Wang, at least ¶ [0054], “…background/foreground intensity, …, shading, … and noise…”, and see Soldevila, at least ¶ [0074], “…some or all the 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Wang so that a portion of the training images is distorted when captured, as taught by Soldevila, thereby providing a system and method that relate to text recognition and image retrieval based on semantic information and that find particular application in connection with assigning semantic labels to word images and with recognition of word images corresponding to semantic labels, as discussed by Soldevila (see at least ¶ [0001]).
Consider claim 20. Wang discloses a system (see at least ¶ [0032], Fig. 1, “…environment 100 includes a computing device 102, which may be configured in a variety of ways…”) comprising:
a computing device  (see at least ¶ [0033], Fig. 1, “…a computing device 102…”) comprising: a memory configured to store instructions (see at least ¶ [0034], Fig. 1, “…a computer-readable storage medium illustrated as memory 108…”); and a processor to execute the instructions to perform operations (see at least ¶ [0034], Fig. 1, “…The 
receiving an image that includes textual content in at least one font (see at least ¶ [0036], “…The image 114 can include a variety of different objects, such as text, shapes or other visual objects, spreadsheets, as a document, a multimedia content, slide presentation, and so on…” and see at least ¶ [0037], “…edit images is illustrated as a font recognition and similarity system 120. This system 120 is representative of functionality to perform text localization, find similar fonts, and employ font attributes for font recognition and similarity, examples of which are represented by the text localization system 122, font similarity system 124, and font attribute system 126…” and further see at least ¶ [0040], “…These attributes are learnable as part of a machine learning process to improve accuracy and efficiency of font recognition and similarity determinations, further discussion of which is included in the following in relation to FIGS. 8-14.…”); and 
identifying the at least one font represented in the received image using a machine learning system (see at least ¶ [0025], “…text localization techniques described herein train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention…” and further see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”), the machine learning system being trained using images representing a plurality of training fonts, wherein a portion of 
Wang discloses the subject matter of the claimed inventive concept as discussed above. However, Wang does not particularly disclose that a portion of the training images is distorted when captured. In an analogous field of endeavor, attention is directed to Soldevila, which teaches that a portion of the training images is distorted when captured (see Wang, at least ¶ [0054], “…background/foreground intensity, …, shading, … and noise…”, and see Soldevila, at least ¶ [0074], “…some or all the training word images 72 may be generated synthetically by applying realistic distortions to word images that have been automatically rendered by a computing device, using a set of different font types. …. The distortions may be used to give the appearance of a word image obtained when captured at an angle or when partially occluded.…” and see at least ¶ [0086], “…the neural network can be retrained or fine-tuned. The activations 124 of the previous-to-last layer 100 of the neural network can be used as an inductive embedding of the input word images 12. Word images containing words that have not been observed during training can still be embedded in this space and matched with known concepts…”).
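For illustration only, the kind of synthetic distortion described in the cited Soldevila passage (a rendered word image made to appear captured at an angle or partially occluded) may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from either reference: the function names, the binary-mask image representation, and the shear and occlusion parameters are all hypothetical.

```python
import random


def render_word_mask(width=12, height=5):
    """Hypothetical stand-in for a rendered word image:
    a bar of 'ink' (1) surrounded by a background border (0)."""
    return [
        [1 if 1 <= r <= height - 2 and 1 <= c <= width - 2 else 0
         for c in range(width)]
        for r in range(height)
    ]


def shear(image, max_shift):
    """Shift each row right by an amount that grows with the row index,
    a crude approximation of an image captured at an angle."""
    height, width = len(image), len(image[0])
    out = []
    for r, row in enumerate(image):
        shift = (r * max_shift) // max(height - 1, 1)
        # Pad on the left with background and drop pixels pushed off the right.
        out.append([0] * shift + row[:width - shift])
    return out


def occlude(image, rng, box_w, box_h):
    """Blank a randomly placed rectangle, a crude partial occlusion."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - box_h + 1)
    left = rng.randrange(width - box_w + 1)
    out = [row[:] for row in image]  # copy so the input is not mutated
    for r in range(top, top + box_h):
        for c in range(left, left + box_w):
            out[r][c] = 0
    return out


def distort(image, rng):
    """Apply both assumed distortions to produce one training sample."""
    sheared = shear(image, max_shift=rng.randint(1, 3))
    return occlude(sheared, rng, box_w=3, box_h=2)
```

Calling distort(render_word_mask(), random.Random(0)) returns a new mask of the same size; a seeded generator is used here only so that a generated training set could be reproduced.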

Consider claims 2, 21, and 40 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the text located in the foreground is synthetically augmented (see at least ¶ [0053], “…a training set generation module 208 is utilized to generate a training image and font collection 210 that includes training images that are to serve as a basis for training the text localization model 206. The training image and font collection 210 …” and see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).
Consider claims 3, 22, and 41 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches synthetic augmentation is provided in a two-step process (see at least ¶ [0098], “…the machine learning module 1108 employs a neural network 110 have at least two machine learning subnets 1112, 1114 configured as a Siamese network to learn the weight function and compare fonts. The two machine learning subnets 1112, 
Consider claims 4, 23, and 42 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the text is synthetically augmented based upon one or more predefined conditions (see at least ¶ [0098], “…the machine learning module 1108 employs a neural network 110 have at least two machine learning subnets 1112, 1114 configured as a Siamese network to learn the weight function and compare fonts…, …The end of each machines learning subnet 1112, 1114 includes a weight prediction layer that is used to predict a scalar values. An additional layer illustrated as the classifier 1116 positioned "on top" of the two identical machine learning subnets 1112, 1114 takes the two scalars and forms a binary classifier…”).
Consider claims 5, 24, and 43 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the text located in the foreground is undistorted (see at least ¶ [0053], “…a training set generation module 208 is utilized to generate a training image and font collection 210 that includes training images that are to serve as a basis for training the text localization model 206. The training image and font collection 210 …” and see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).

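Consider claims 7, 26, and 45 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the captured background imagery is predominately absent text (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).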
Consider claims 8, 27, and 46 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the text located in the foreground is randomly positioned in the portion of training images (see at least ¶ [0054], “…in order to make the training set more diversified and more robust to noises, random perturbations may be added by the training set generation module 208 during and/or after rendering of text…”).
Consider claims 9, 28, and 47 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches prior to the text being located in the foreground, a portion of the text is removed (see at least ¶ [0054], “…background/foreground intensity, text color flipping, shading, rotation, squeezing, cropping and noise…”).
Consider claims 10, 29, and 48 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:

Consider claims 11, 30, and 49 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches font similarity is used to identify the at least one font (see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Consider claims 12, 31, and 50 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches similarity of fonts in multiple image segments is used to identify the at least one font (see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Consider claims 13, 32, and 51 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches the machine learning system is trained by using transfer learning (see at least ¶ [0087],  “…a deep convolutional neural network is trained to recognize and 
Consider claims 14, 33, and 52 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches an output of the machine learning system represents each font used to train the machine learning system (see at least ¶ [0087],  “…a deep convolutional neural network is trained to recognize and find similar fonts. …, the font attribute system 126 employs metadata 816 in order to improve accuracy and efficiency of these techniques. For instance, within a font family, different fonts 130 have different weights; fonts may have a notation of relative and italic; may come as pairs; may have classification information such as Serif, San Serif; and so forth. Accordingly, these attributes may also be leveraged to recognize a font having the described attributes as well as locate similar fonts having similar attributes as defined by associated…”).
Consider claims 15, 34, and 53 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:

Consider claims 16, 35, and 54 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches a subset of the output of the machine learning system is scaled and a remainder of the output is removed (see at least ¶ [0089],  “…The two machine learning subnets 1112, 1114 of the Siamese network are identical in this example. The end of each machines learning subnet 1112, 1114 includes a weight prediction layer that is used to predict a scalar values. An additional layer illustrated as the classifier 1116 positioned "on top" of the two identical machine learning subnets 1112, 1114 takes the two scalars and forms a binary classifier…”).
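For illustration only, the arrangement described in the cited Wang passage (two identical subnets that each predict a scalar, with a classifier on top that takes the two scalars and forms a binary decision) may be sketched as follows. This is a minimal sketch under stated assumptions and is not Wang's implementation: the linear subnet, the similarity formula, the threshold, and all names are hypothetical.

```python
import math


def subnet(features, weights, bias):
    """One branch of the pair. Both branches call this same function with
    the same weights, which is what makes the two subnets identical."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def classify_pair(x1, x2, weights, bias, threshold=0.5):
    """Take the two scalars from the shared subnet and form a binary decision.

    The score is an assumed similarity in (0, 1]: it equals 1.0 when the two
    scalars match and decays toward 0 as they move apart.
    """
    s1 = subnet(x1, weights, bias)
    s2 = subnet(x2, weights, bias)
    e = math.exp(-abs(s1 - s2))  # in (0, 1], so no overflow is possible
    score = 2.0 * e / (1.0 + e)
    return score, score >= threshold
```

Because both inputs pass through the same weights and the score depends only on the absolute difference of the two scalars, the decision does not depend on the order in which the pair is presented.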
Consider claims 17, 36, and 55 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:

Consider claims 18, 37, and 56 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches identifying the at least one font represented in the received image using the machine learning system includes using additional images received by the machine learning system (see at least ¶ [0025], “…text localization techniques described herein train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention…” and further see at least ¶ [0026], “…font similarity may be used to determine which fonts are similar to a font used to render text in an image, which may be used to navigate through hundreds or even thousands of fonts to find a similar font of interest …”).
Consider claims 19, 38, and 57 (which depend on at least claims 1, 20, and 39). Wang in view of Soldevila discloses the limitations of claims 1, 20, and 39 as applied in the rejection of claims 1, 20, and 39 above, and further discloses:
Wang teaches outputs of the machine learning system for the received image and the additional images are combined to identify the at least one font (see at least ¶ [0110], “…Each kind of image pair is uniformly sampled during training. If there are font families with only a single font, those font images may be combined to form a fifth type…”).


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUONG A NGO whose telephone number is (571)270-7264. The examiner can normally be reached Monday-Thursday from 5:30AM-4:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHUONG A NGO/            Primary Examiner, Art Unit 2645