Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This communication is a non-final Office action on the merits.  Claims 1-20, as originally filed, are presently pending and have been considered below.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 7/8/2020, 12/9/2021, and 9/9/2022 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements have been considered by the examiner.

Claim Objections
Claims 5 and 7 are objected to because of the following informalities:  the acronyms RNN and CNN are first recited in claims 5 and 7, respectively, without being spelled out in full.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2018/0144248 A1, Lu et al. (hereinafter Lu) in view of “Generating Diverse and Descriptive Image Captions Using Visual Paraphrases”, Liu et al., 2019 IEEE/CVF International Conference on Computer Vision (ICCV), an IDS submission (hereinafter Liu), and further in view of US 2021/0012486 A1, Huang et al. (hereinafter Huang).


As to claim 1, Lu discloses a computer-implemented method comprising: 
encoding, by one or more computer processors, an image utilizing an image encoder (Figs 1, 7, CNN encoder for image encoding; pars 0011, 0014), wherein the image is contained within a triplet comprising the image, one or more high-resource captions, and one or more low-resource captions (pars 0008, 0014-0015, 0018, encoding the image and captions with an attention-based model; note that higher-quality captions consume more resources while lower-quality captions consume fewer resources); 
generating, by one or more computer processors, one or more high-resource captions utilizing the encoded image (pars 0014-0016, 0018, 0063, generating high-quality captions for the image) and the triplet inputted into a high-resource decoder (Figs 2A-2B, 6-7, 13; pars 0008, 0011-0012, 0014, 0054-0055, RNN-based decoder for decoding the image and captions); 
encoding, by one or more computer processors, the one or more generated high-resource captions utilizing a high-resource encoder (Figs 1, 7; pars 0011, 0054); and 
generating, by one or more computer processors, one or more low-resource captions by simultaneously inputting the encoded image, the encoded high-resource caption, and the triplet into a trained low-resource decoder (pars 0008, 0019, 0025, 0028, 0054-0056, 0063, 0069-0070, encoding/decoding model with CNN encoder for multiple image features, regions, captions).  

Lu does not expressly disclose the triplet or multimodal aspect of the image, or adding adaptive cycle consistency constraints on a set of calculated attention weights associated with the triplet. 

Liu, in the same or similar field of endeavor, further teaches the image is contained within a triplet comprising the image, one or more high-resource captions, and one or more low-resource captions (page 4240, left col., generating a preliminary caption and then more diverse and descriptive captions, e.g. multi-modality or triplet; pages 4241-4242; Fig 2) and generating, by one or more computer processors, one or more low-resource captions by simultaneously inputting the encoded image, the encoded high-resource caption, and the triplet into a trained low-resource decoder (page 4241, encoding the image with different captions).

Huang, in the same or similar field of endeavor, additionally teaches adding, by one or more computer processors, adaptive cycle consistency constraints on a set of calculated attention weights associated with the triplet (Fig 5; pars 0033, 0062, 0073-0074, 0080, 0087, performing cycle consistency constraints on attention weights associated with a multi-modality image, e.g. a triplet; the adaptive cycle consistency constraints are applied by minimizing the cycle consistency loss function).
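
For illustration only, the following minimal sketch shows one way a cycle consistency penalty over attention weights could be computed. It is not drawn from Huang, Lu, Liu, or the claims: the composition by matrix product, the squared-error penalty, the toy weights, and the names compose and cycle_consistency_loss are all assumptions of this sketch. The idea is that the attention a low-resource caption places directly on image regions should agree with the attention obtained by going through the high-resource caption.

```python
def compose(attn_ab, attn_bc):
    """Chain two attention maps (A attends to B, B attends to C) by matrix product."""
    cols = len(attn_bc[0])
    return [
        [sum(row[k] * attn_bc[k][j] for k in range(len(attn_bc))) for j in range(cols)]
        for row in attn_ab
    ]


def cycle_consistency_loss(direct, attn_ab, attn_bc):
    """Mean squared gap between the direct A->C attention and the chained A->B->C path."""
    chained = compose(attn_ab, attn_bc)
    diffs = [
        (d - c) ** 2
        for d_row, c_row in zip(direct, chained)
        for d, c in zip(d_row, c_row)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical toy weights (rows sum to 1): 2 low-resource tokens,
# 2 high-resource tokens, 3 image regions.
low_to_high = [[1.0, 0.0], [0.0, 1.0]]
high_to_image = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
low_to_image_consistent = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
low_to_image_inconsistent = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

loss_consistent = cycle_consistency_loss(low_to_image_consistent, low_to_high, high_to_image)
loss_inconsistent = cycle_consistency_loss(low_to_image_inconsistent, low_to_high, high_to_image)
```

Under these assumptions the penalty is zero when the direct and chained attention maps agree and grows as they diverge, so minimizing it during training would push the attention maps toward agreement.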

Therefore, considering the teachings of Lu, Liu, and Huang as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Liu’s teachings on generating and decoding different captions and Huang’s adaptive cycle consistency constraints in Lu’s method, in order to provide more diverse and descriptive machine-generated captions with given datasets.

As to claim 2, Lu as modified discloses the method of claim 1, wherein adding adaptive cycle consistency constraints on attentions of the triplet, further comprises: aligning, by one or more computer processors, one or more attention weights associated with one or more triplets fed into a plurality of decoders (Huang: Fig 5; pars 0030, 0033, 0042, 0052, 0061-0062, 0073-0074, 0080, 0087, modality and image domain matching using consistency constraints; Liu: page 4240, right col.; pages 4241-4243, left col., matching the attention weights).  

As to claim 3, Lu as modified discloses the method of claim 2, wherein the plurality of decoders is associated with an image to low-resource caption decoding, an image to high-resource caption decoding, or a high-resource caption to low-resource caption decoding (Liu: page 4240, left col. decoders to first decode and generate a preliminary caption then more diverse and descriptive captions; pages 4241-4242; Fig 2).  

As to claim 4, Lu as modified discloses the method of claim 1, further comprising: adding, by one or more computer processors, a sentinel weight to the set of calculated attention weights associated with one or more triplets fed into a decoder, providing a latent representation of a memory of the decoder (Lu: Figs 8-9; pars 0086-0088, 0101-0102, 0104, 0114, 0182, producing latent representation with spatial attention model and Sentinel LSTM).  
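
For illustration only, the following minimal sketch shows how a single sentinel weight could be appended to a set of attention weights so that a decoder can fall back on its own memory instead of the image regions. It is a simplified sketch in the general style of a sentinel-gated attention and is not asserted to reproduce Lu's implementation or the claimed method; the scores, feature vectors, and the names softmax and attend_with_sentinel are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_sentinel(region_scores, sentinel_score, region_feats, sentinel_vec):
    """Append one sentinel score to the region scores and renormalize.

    Returns (context, beta), where beta is the weight given to the sentinel,
    i.e. to the decoder-memory vector rather than to any image region.
    """
    weights = softmax(region_scores + [sentinel_score])
    beta = weights[-1]
    dim = len(sentinel_vec)
    # zip stops at the last region, so the sentinel weight is applied separately.
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_feats)) + beta * sentinel_vec[d]
        for d in range(dim)
    ]
    return context, beta


# Hypothetical toy inputs: two image regions, one sentinel (memory) vector.
context, beta = attend_with_sentinel(
    region_scores=[0.0, 0.0],
    sentinel_score=0.0,
    region_feats=[[1.0, 0.0], [0.0, 1.0]],
    sentinel_vec=[3.0, 3.0],
)
```

In this sketch a larger sentinel score raises beta, so the resulting context vector leans more on the decoder's memory and less on the image regions.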

As to claim 5, Lu as modified discloses the method of claim 1, wherein the high-resource encoder and the high-resource decoder are each, respectively, an RNN (Lu: Figs 9, 16, RNN encoder/decoder; pars 0017, 0059; Liu: page 4240, left col., RNN for generating more diverse and descriptive captions, e.g. high-resource decoder).  

As to claim 6, Lu as modified discloses the method of claim 1, wherein the low-resource decoder is an attention-based RNN trained with the one or more generated high-resource captions and associated low-resource translations (Lu: pars 0059, 0251, 0260, recurrent neural network (RNN) as a decoder; Liu: page 4240, left col., RNN for decoder).  

As to claim 7, Lu as modified discloses the method of claim 1, wherein the image encoder is a trained CNN (Lu: Figs 1, 7, CNN image encoder; Liu: page 4240, left col., CNN for encoder). 

As to claim 8, it recites a computer program product with computer readable storage media (non-transitory, as indicated in the specification) storing program instructions to perform the functions and features recited in claim 1. The rejection of claim 1 is therefore incorporated herein.
As to claims 9-14, they are rejected for the same reasons as set forth for claims 2-7, respectively.

As to claim 15, it is a system claim encompassing the limitations of claim 1. The rejection of claim 1 is therefore incorporated herein.
As to claims 16-20, they are rejected for the same reasons as set forth for claims 2-6, respectively.
 
Examiner’s Note
The examiner has cited particular columns, line numbers, paragraphs, and/or figures in the references as applied to the claims for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. The applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or disclosed by the examiner. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Qun Shen, whose telephone number is (571) 270-7927.  The examiner can normally be reached Monday-Friday, 9:00-5:00. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vincent Rudolph, can be reached at (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.  Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/QUN SHEN/
Primary Examiner, Art Unit 2661