DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Applicant’s Remarks/Arguments
Applicant’s Response to the Non-Final Rejection, received on 26 April 2022, (“Response”) is acknowledged. The Response appears to be a bona fide attempt at furthering prosecution and the Examiner will treat the Response as fully responsive. 
The Office respects the applicant’s arguments. However, it appears that the applicant did not apply the correct standard for claim interpretation, which is the broadest reasonable interpretation in light of the specification. All of the arguments instead interpret the claims by reading the specification into the claims or by bringing in outside knowledge. That is not how the scope or meaning of claim terms is determined. See MPEP § 2111. For example, arguing that the cited prior art and the claimed invention are from different technical fields is a very weak assertion when the claims are very broad. It does not matter whether the cited prior art is from a different technical field unless the claims recite that they are directed to a specific technical field or the claim terms have a specific meaning within the claims. Here, the original claims are so broad that each term can have several meanings across different technical fields.
	The applicant argues that the cited prior art does not teach two different vectors. However, that is not true. The cited prior art teaches a text encoder and a digital image encoder, each of which generates a vector from the text or the digital image, respectively.
	Additionally, the third argument results from the applicant’s misunderstanding of the cited prior art and of the portion cited in the previous Office action. The operation module was not what was mapped to the training module. Rather, the portion corresponding to the second limitation of claim 1 was the machine-learning training module, which uses the text embedding and the digital image embedding in a loss function that is fed into training the model.
	Because of the foregoing reasons, the rejection of the claims will be maintained.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a training module,” “a processing module,” “a discriminator module,” and “a feedback module” in claims 1-12.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3-4, and 11 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Aggarwal et al. (US 2020/0380027 A1).
a.	Regarding claim 1, Aggarwal discloses a computing device for handling anomaly detection, comprising: 
an encoder, for receiving an input image, to generate a first latent vector comprising a semantic latent vector and a visual appearance latent vector according to the input image and at least one first parameter of the encoder (Aggarwal discloses a machine-learning training module comprising two encoders “generating the text embedding 306 (e.g., a vector also having a length of 2048) based on the text using a recurrent neural network (RNN) language encoder” and “the digital image embedding 308, e.g., as vectors having a length of 2048. The digital image encoder 304, for instance, includes a convolutional neural network (CNN) image encoder 312 to generate the digital image embeddings 308 as describing content included in the digital images using vectors” after receiving a training dataset of digital image and text pairs at Figs. 3-202, 130, 302, 310, 306, 304, 312, 308 and ¶¶ 0030, 0062-0070); and 
a training module, coupled to the encoder, for receiving the input image and the first latent vector, to update the at least one first parameter according to the input image and the first latent vector and a loss function (Aggarwal discloses a model “supporting a visually guided language embedding space 122, on the other hand, may support a variety of functionality through use by an operation module 314 which represents functionality of ‘use’ of the trained model 120” based on the loss function generated by the text embedding and digital image embedding at Figs. 3-306, 308, 132, 120, and 314 and ¶¶ 0044, 0074-0077). 
b.	Regarding claim 3, Aggarwal discloses wherein the semantic latent vector comprises semantic information of the input image, and does not comprise visual appearance information of the input image (Aggarwal discloses a machine-learning training module comprising a text encoder “generating the text embedding 306 (e.g., a vector also having a length of 2048) based on the text using a recurrent neural network (RNN) language encoder” at Figs. 3-302, 310, and 306 and ¶¶0062-0070).
c.	Regarding claim 4, Aggarwal discloses wherein the visual appearance latent vector comprises visual appearance information of the input image, and does not comprise semantic information of the input image (Aggarwal discloses a machine-learning training module comprising a digital image encoder generating “the digital image embedding 308, e.g., as vectors having a length of 2048. The digital image encoder 304, for instance, includes a convolutional neural network (CNN) image encoder 312 to generate the digital image embeddings 308 as describing content included in the digital images using vectors” at Figs. 3-304, 312, and 308 and ¶¶ 0062-0070).
d.	Regarding claim 11, Aggarwal discloses wherein the loss function comprises at least one regularizer, a categorical loss function, a Kullback-Leibler (KL) divergence function and a Wasserstein Generative Adversarial Network (WGAN) loss function (Aggarwal discloses a model “supporting a visually guided language embedding space 122, on the other hand, may support a variety of functionality through use by an operation module 314 which represents functionality of ‘use’ of the trained model 120” based on the loss function generated by the text embedding and digital image embedding at Figs. 3-306, 308, 132, 120, and 314 and ¶¶ 0074-0077. Here, the loss function could be a categorical loss function).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Aggarwal et al. (US 2020/0380027 A1) in view of Lee (US 2020/0152188 A1).
a.	Regarding claim 2, Aggarwal discloses all the previous claim limitations.
However, Aggarwal does not explicitly disclose wherein the input image is determined to be an anomaly, if a maximum confidence score of the semantic latent vector is lower than a threshold value. 
Lee discloses wherein the input image is determined to be an anomaly, if a maximum confidence score of the semantic latent vector is lower than a threshold value (Lee discloses comparing the confidence score with the first threshold at Fig. 7-704 and ¶¶ 0186-0187 and 0195. When the confidence score is lower than the threshold, Lee rejects the recognition result as a failure at Fig. 7-711 and ¶ 0195).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Lee’s process of comparing the confidence score with the threshold to the machine-learning training module of Aggarwal.
The suggestion/motivation would have been to “effectively reduce the loads to be manipulated by the user” (Lee; ¶0005).
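For illustration only, the comparison recited in claim 2 may be sketched as follows. This is a minimal sketch, not code from Lee, Aggarwal, or the application: the function name is_anomaly, the treatment of the semantic latent vector as a list of per-class confidence scores, and the example values are all assumptions.

```python
from typing import Sequence


def is_anomaly(semantic_scores: Sequence[float], threshold: float) -> bool:
    """Flag the input as an anomaly when its highest confidence score
    is lower than the threshold (hypothetical helper, for illustration)."""
    if not semantic_scores:
        raise ValueError("semantic_scores must not be empty")
    # Only the maximum confidence score is compared with the threshold.
    return max(semantic_scores) < threshold


# Assumed example values: no score reaches 0.5, so the input is flagged.
low_confidence = is_anomaly([0.1, 0.2, 0.3], threshold=0.5)
# One score exceeds 0.5, so the input is not flagged.
high_confidence = is_anomaly([0.1, 0.9], threshold=0.5)
```

Under this reading, only the single highest score matters, so an input is flagged when every score is below the threshold.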

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Aggarwal et al. (US 2020/0380027 A1) in view of Turner et al. (US 10,558,913 B1).
a.	Regarding claim 12, Aggarwal discloses all the previous claim limitations.
However, Aggarwal does not explicitly disclose wherein the at least one regularizer is a L1-norm function for calculating differences between the input image and the at least one reconstructed image, if the at least one vector is the first latent vector. 
Turner discloses wherein the at least one regularizer is a L1-norm function for calculating differences between the input image and the at least one reconstructed image, if the at least one vector is the first latent vector (Turner discloses “L−1 norm of the weight vector w into the modified loss function” at col. 13, line 53 – col. 14, line 31).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the L1-norm function of Turner to the loss function of Aggarwal.
The suggestion/motivation would have been to provide “the structure of the neural network … [being] simplified by using fewer connections in the neural network. As a result, the training of the neural network becomes faster, requires the consumption of fewer resources” (Turner; col. 14, lines 20-25).
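For illustration only, the two uses of an L1 norm discussed above may be sketched as follows. This is a minimal sketch under stated assumptions, not code from Turner, Aggarwal, or the application: images and weights are represented as flat lists of numbers, and the names l1_norm, l1_reconstruction_penalty, regularized_loss, and the coefficient lam are hypothetical.

```python
from typing import Iterable, Sequence


def l1_norm(values: Iterable[float]) -> float:
    """Sum of absolute values (the L1 norm)."""
    return sum(abs(v) for v in values)


def l1_reconstruction_penalty(input_image: Sequence[float],
                              reconstructed_image: Sequence[float]) -> float:
    """L1 norm of the element-wise difference between an input image and a
    reconstructed image, in the manner recited in claim 12 (assumed layout:
    both images flattened to equal-length sequences)."""
    if len(input_image) != len(reconstructed_image):
        raise ValueError("images must have the same number of elements")
    return l1_norm(a - b for a, b in zip(input_image, reconstructed_image))


def regularized_loss(base_loss: float, weights: Sequence[float],
                     lam: float) -> float:
    """Base loss plus an L1 penalty on a weight vector, in the manner of the
    modified loss function quoted from Turner (lam is an assumed coefficient)."""
    return base_loss + lam * l1_norm(weights)


penalty = l1_reconstruction_penalty([1.0, 2.0, 3.0], [1.5, 2.0, 1.0])  # 2.5
loss = regularized_loss(1.0, [0.5, -0.5], lam=0.5)  # 1.5
```

The first helper measures a difference between two images, while the second penalizes the magnitude of a weight vector; both rely on the same L1-norm computation.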

Allowable Subject Matter
Claims 5-10 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Following is a list of references related to the claimed invention:
Nikola et al. (US 2021/0166340 A1): A method comprises obtaining a real-time input from a sensor mounted on a vehicle, that captures a front view of a road ahead of the vehicle and processing thereof by a neural network to generate a functional map of the road ahead of the vehicle. Each pixel in the functional map is associated with a predetermined relative position to the vehicle. A content of each pixel is assigned a set of values, each of which represents a functional feature relating to a location at a corresponding predetermined relative position to the pixel. The processing is performed without relying on a pre-determined precise mapping. The method further comprises providing the functional map to an autonomous navigation system of the vehicle, to autonomously drive the vehicle in accordance with functional features represented by the functional map.
Wu et al. (US 2020/0383623 A1): A method and apparatus for providing emotional care in a session between a user and an electronic conversational agent. A first group of images may be received in the session, the first group of images comprising one or more images associated with the user. A user profile of the user may be obtained. A first group of textual descriptions may be generated from the first group of images based at least on emotion information in the user profile. A first memory record may be created based at least on the first group of images and the first group of textual descriptions.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN W LEE whose telephone number is (571)272-9554. The examiner can normally be reached Mon-Fri 8:00AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, NAY MAUNG can be reached on 571-272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JOHN W LEE/Primary Examiner, Art Unit 2664