Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Compact Prosecution
Allowable Subject Matter
Claims 3-4, 6, 10, 13, 17 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claims 3, 4 and 6 (the following is also applicable to claims 10, 13, 17 and 18)
The claims are objected to but contain allowable subject matter because, at this time, the Examiner has not identified any reference or combination of references that teaches the limitations “a computer implemented method wherein adjusting one or more inferences of the machine learning model comprises identifying a plurality of negative words, each of the plurality of negative words having a quantitative risk ranging from no or low risk to very high risk; and monotonic mapping from a risk level of each of the plurality of negative words to an amount of suppression to apply to the machine learning model's likelihood of selecting those respective negative words and wherein the step of monotonic mapping comprises: for each time the machine learning model is to infer a next word of the text descriptive of the image, multiply post-softmax likelihoods or pre-softmax logits by a factor inversely proportional to the quantitative risk of the respective word, ranging from a multiplicative factor of 0.0 for full suppression to 1.0 for no suppression and wherein applying the gender mitigation mechanism comprises, for each of a triplet of a gender-triplet-reference collection at each point at which the machine learning model is to infer the next word: calculating a likelihood of each pair of two gendered words; calculating a percentage of the lower of the two likelihoods; subtracting the calculated percentage from both gendered words; and assigning that subtracted calculated percentage to a gender neutral version.”
Amending the independent claims 1, 8 and 15 to include the above limitations will overcome the current rejection and possibly move the application into condition for allowance.
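For illustration only, the two mechanisms recited in the limitations quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from applicant's specification: the function names are hypothetical, the risk is assumed to be normalized to the range 0.0 to 1.0, the monotonic mapping is assumed to be the linear factor 1.0 - risk, the factor is applied to post-softmax likelihoods which are then renormalized, and the amount removed from both gendered words is assumed to be added in full to the gender neutral word.

```python
from typing import Dict, Iterable, Tuple


def suppression_factor(risk: float) -> float:
    """Map a risk in [0.0, 1.0] to a multiplicative factor in [0.0, 1.0].

    Assumption: a linear, monotonically decreasing mapping, so that no risk
    gives 1.0 (no suppression) and very high risk gives 0.0 (full suppression).
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie in [0.0, 1.0]")
    return 1.0 - risk


def suppress_negative_words(
    probs: Dict[str, float], risk_by_word: Dict[str, float]
) -> Dict[str, float]:
    """Scale next-word likelihoods by the per-word suppression factor.

    Words absent from risk_by_word are treated as having no risk.
    Assumption: the scaled likelihoods are renormalized to sum to 1.0.
    """
    scaled = {
        word: p * suppression_factor(risk_by_word.get(word, 0.0))
        for word, p in probs.items()
    }
    total = sum(scaled.values())
    if total == 0.0:
        return scaled  # every candidate was fully suppressed
    return {word: p / total for word, p in scaled.items()}


def mitigate_gender(
    probs: Dict[str, float],
    triplets: Iterable[Tuple[str, str, str]],
    fraction: float = 0.5,
) -> Dict[str, float]:
    """Shift likelihood from each gendered pair to its gender neutral word.

    Each triplet is (gendered word, gendered word, gender neutral word).
    Assumption: `fraction` is the claimed percentage, expressed in [0.0, 1.0].
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0.0, 1.0]")
    out = dict(probs)
    for first, second, neutral in triplets:
        p_first = out.get(first, 0.0)
        p_second = out.get(second, 0.0)
        # A percentage of the lower of the two gendered likelihoods.
        delta = fraction * min(p_first, p_second)
        out[first] = p_first - delta
        out[second] = p_second - delta
        # Assumption: the total amount removed goes to the neutral word,
        # which keeps the overall probability mass unchanged.
        out[neutral] = out.get(neutral, 0.0) + 2.0 * delta
    return out
```

In this sketch both functions would be called once per decoding step, on the likelihoods of the candidate next words, before the next word of the caption is selected.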
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.


Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:

“an inference adjustment module for adjusting one or more inferences of the machine learning model” in claim 15.
“a post-processing module for processing the generated updated text descriptive of the image ....” in claims 15 and 16.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. 
In Sections 0054 and 0068 of applicant’s specification, it is disclosed that the inference adjustment module and the post-processing module cause the machine learning models to generate adjectives which are very frequent and may not directly describe a noun.
This therefore means that the functions of the inference adjustment module and the post-processing module are performed by the machine learning model; see lines 8-9 of claim 1 (machine learning model comprising the adjusted inferences to post-process the image …).
The machine learning model therefore corresponds to both the inference adjustment module and the post-processing module.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

	

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 5, 7-9, 12, 14-16, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Li (US20200117951) in view of Wang et al. (US20170200066).
Claim 1, Li discloses a computer-implemented method of generating text descriptive of digital images, (Section 0005 “Captioning Image”) the method comprising: 
using a machine learning model to pre-process an image to generate initial text descriptive of the image; (Section 0047, lines 6-9: using the previously generated words reads on the generated initial text descriptive of the image; see Fig. 4)
adjusting one or more inferences of the machine learning model, (Section 0056, lines 5-11: modifying the weights of the neural network) the inferences biasing the machine learning model away from associating negative words with the image; (Section 0040, lines 26-28: “removes negative weights to reduce it impact in the predicted output” means applying a bias that reduces the impact of negative weights)
using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; (Section 0056, lines 3-6: modifying a set of weights (updating) generates predicted caption labels; see Section 0057, lines 3-4, “Predicted Caption”)
(The secondary reference, Wang, discloses in Section 0063 a post-processing step of mapping the points to words and converting the word vector into an updated caption.)
 and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image. (Section 0042, lines 10-13 – “Fine-tuning using the multi-label classification objective” reads on processing the generated updated text descriptive of the image- See Fig. 5)
 

[Screenshot: media_image1.png]

Figure 1 shows how Li's image captioning system fine-tunes the updated caption of an input image.
The caption “a man riding on the back of a horse” can be fine-tuned or updated with the attention map to generate an updated caption such as “A person riding a horse in a field”. Compare this screenshot to applicant’s Figs. 4A-4D.

Li does not disclose a neural network architecture that deals with inferences. 
 Wang discloses a neural network architecture that deals with inferences. (Section 0074, lines 19-23  “leads to more robust inference of the semantic gap…” shows that the neural network deals with inferences). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the inference teaching of Wang in the neural network architecture of Li. The motivation is that using the inferences will allow the system to adapt to new, unknown inputs.




Claim 2, Li in view of Wang discloses wherein the method comprises adjusting the one or more inferences of the machine learning model (Wang: Section 0074, lines 19-23) during beam search of the machine learning model to adjust posterior probabilistic of selected words of the text; (Li: Section 0040, lines 14-16: obtaining the gradient through backpropagation reads on adjusting the posterior probabilistic of selected captions or words) and processing the generated updated text descriptive of the image using pure natural language processing or text-based rules overlaid on part-of-speech (POS) libraries. (Li: Section 0048, lines 7-10: “using Stanford CoreNLP to parse caption into tokens” reads on using natural language processing)
Claim 5, Li in view of Wang discloses wherein processing the generated updated text descriptive of the image to fine-tune the updated text descriptive of the image comprises applying one or more of a gender mitigation mechanism; (Li: based on Section 0052, lines 9-13, a generated word such as “man” is replaced with “a person” in the processed image; see the screenshot below) an offensive adjective mitigation mechanism; a low confidence adjective mitigation mechanism; a geo-location generalizing mechanism; and an image with text templatizing mechanism.


[Screenshot: media_image2.png]

Clearly, the updated text description of the input image shows “a man”, which is processed as “a person”, and a truck that is qualified by an adjective as a “black truck”. This reads on “a low confidence adjective mitigation mechanism”: the adjective “black” in the phrase “black truck” is removed when the image is processed “With Attention”.



Claim 7, Li in view of Wang discloses wherein the text descriptive of the image is a title of the image or a caption of the image. (Li: Section 0059, lines 6-7 thus training an ML model for captioning images)
Claim 8, Li discloses a computer-implemented method for generating text descriptive of digital images, (Section 0005 “Captioning Image”) the method comprising: 
generating, using a first machine learning model, initial text descriptive of an image to pre-process the image; (Section 0013, lines 5-6- thus Machine learning model for captioning images)
adjusting, using an inference adjustment module, (Section 0040, lines 12-18: gradient through backpropagation and weighted combination reads on adjusting inferences) one or more inferences of the first machine learning model, the inferences biasing the first machine learning model away from associating negative words with the image; (Section 0040, lines 26-28: “removes negative weights to reduce it impact in the predicted output”)
generating, using a second machine learning model comprising the adjusted inferences, updated text descriptive of the image to post-process the image; (Section 0056, lines 3-6: modifying a set of weights generates predicted caption labels; see Section 0057, lines 3-4, “Predicted Caption”)
(The secondary reference, Wang, discloses in Section 0063 a post-processing step of mapping the points to words and converting the word vector into an updated caption.)
and process the generated updated text descriptive of the image outputted by the second machine learning model to fine-tune the updated text descriptive of the image. (Section 0042, lines 10-13: “Fine-tuned using the multi-label classification objective”)

[Screenshot: media_image3.png]


Li does not disclose a neural network architecture that deals with inferences. 
 Wang discloses a neural network architecture that deals with inferences. (Section 0074, lines 19-23  “leads to more robust inference of the semantic gap…” shows that the neural network deals with inferences). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the inference teaching of Wang in the neural network architecture of Li. The motivation is that using the inferences will allow the system to adapt to new, unknown inputs.


Claim 9, Li in view of Wang discloses the method further comprising adjusting the one or more inferences of the first machine learning model (Wang: Section 0074, lines 19-23) during beam search of the first machine learning model to adjust posterior probabilistic of selected words of the text; (Li: Section 0040, lines 14-16: obtaining the gradient through backpropagation reads on adjusting the posterior probabilistic of selected captions or words) and process the generated updated text descriptive of the image using pure natural language processing or text-based rules overlaid on part-of-speech (POS) libraries. (Li: Section 0048, lines 7-10: “using Stanford CoreNLP to parse caption into tokens” reads on using natural language processing)

Claim 12, Li in view of Wang discloses wherein processing the generated updated text descriptive of the image to fine-tune the updated text descriptive of the image comprises applying one or more of a gender mitigation mechanism; (Li: based on Section 0052, lines 9-13, a generated word such as “man” is replaced with “a person” in the processed image; see the screenshot below)
an offensive adjective mitigation mechanism; a low confidence adjective mitigation mechanism; a geo-location generalizing mechanism; and an image with text templatizing mechanism. 


[Screenshot: media_image2.png]

Clearly, the updated text description of the input image shows “a man”, which is processed as “a person”, and a truck that is qualified by an adjective as a “black truck”. This reads on “a low confidence adjective mitigation mechanism”: the adjective “black” in the phrase “black truck” is removed when the image is processed “With Attention”.



Claim 14, Li in view of Wang discloses wherein the text descriptive of the image is a title of the image or a caption of the image. (Li: Section 0059, lines 6-7 thus training an ML model for captioning images)

Claim 15, Li discloses a system comprising a first machine learning model for pre-processing an image to generate initial text descriptive of the image; (Section 0013, lines 5-6- thus Machine learning model for captioning images)
an inference adjustment module for adjusting one or more inferences of the machine learning model, (Section 0040, lines 12-18: gradient through backpropagation and weighted combination reads on adjusting inferences)
the inferences biasing the machine learning model away from associating negative words with the image; (Section 0040, lines 26-28: “removes negative weights to reduce it impact in the predicted output”)
a second machine learning model for using the adjusted inferences to post-process the image to generate updated text descriptive of the image; (Section 0056, lines 3-6: modifying a set of weights generates predicted caption labels; see Section 0057, lines 3-4, “Predicted Caption”)
and a post-processing module for processing the generated updated text descriptive of the image outputted by the second machine learning model to fine-tune the updated text descriptive of the image. (Section 0042, lines 10-13 – “Fine-tuned using the multi-label classification objective”)

[Screenshot: media_image3.png]

Li does not disclose a neural network architecture that deals with inferences. 
 Wang discloses a neural network architecture that deals with inferences. (Section 0074, lines 19-23  “leads to more robust inference of the semantic gap…” shows that the neural network deals with inferences). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the inference teaching of Wang in the neural network architecture of Li. The motivation is that using the inferences will allow the system to adapt to new, unknown inputs.
Claim 16, Li in view of Wang discloses the inference adjustment module for adjusting the one or more inferences of the first machine learning model during beam search to adjust posterior probabilistic of selected words of the text; (Li: Section 0040, lines 14-16: obtaining the gradient through backpropagation reads on adjusting the posterior probabilistic of selected captions or words) and a third machine learning model for processing the generated updated text descriptive of the image using pure natural language processing or text-based rules overlaid on part-of-speech (POS) libraries. (Li: Section 0048, lines 7-10: “using Stanford CoreNLP to parse caption into tokens” reads on using natural language processing)

Claim 19, Li in view of Wang discloses wherein the post-processing module for processing the generated updated text descriptive of the image to fine-tune the updated text descriptive of the image is further configured to apply one or more of: a gender mitigation mechanism; (Li: based on Section 0052, lines 9-13, a generated word such as “man” is replaced with “a person” in the processed image; see the screenshot below)
 an offensive adjective mitigation mechanism; a low confidence adjective mitigation mechanism; a geo-location generalizing mechanism; and an image with text templatizing mechanism.

[Screenshot: media_image2.png]

Clearly, the updated text description of the input image shows “a man”, which is processed as “a person”, and a truck that is qualified by an adjective as a “black truck”. This reads on “a low confidence adjective mitigation mechanism”: the adjective “black” in the phrase “black truck” is removed when the image is processed “With Attention”.


Claim 20, Li in view of Wang discloses wherein the text descriptive of the image is a title of the image or a caption of the image. (Li: Section 0059, lines 6-7 thus training an ML model for captioning images)


Cited Art

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Mao (US20170098153) discloses an intelligent image captioning system that deals with the task of generating novel sentence descriptions for images and the task of image and sentence retrieval. The system also includes a language model part that learns a dense feature embedding for each word in the dictionary and stores the semantic temporal context in recurrent layers. The m-RNN model may be learned using a log-likelihood cost function (see details in Section D). In embodiments, the errors may be backpropagated to the three parts of the m-RNN model to update the model parameters simultaneously.
Yang (US20200320353) discloses a dense captioning system and method for providing an image to generate proposed bounding regions for a plurality of visual concepts within the image and generating a region feature for each proposed bounding region.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Akwasi M Sarpong whose telephone number is (571)270-3438. The examiner can normally be reached Mon-Fri. 8:00am-4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KING D POON can be reached on 571-272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AKWASI M SARPONG/ Primary Examiner, Art Unit 2675 10/17/2022