DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant Response to Official Action
The response filed on 7/26/2022 has been entered and made of record.
Acknowledgment 
Claims 2, 7, and 13, canceled on 7/6/2022, are acknowledged by the examiner. 
Claims 1, 3-6, 8-12, 14-15, and 17-20, amended on 7/6/2022, are acknowledged by the examiner.  
Claims 16 and 21-23, added on 7/6/2022, are acknowledged by the examiner. 

Response to Arguments
Applicant’s arguments with respect to claims 1, 6, 12, and 18-20 and their dependent claims have been considered, but they are moot in view of the new grounds of rejection necessitated by Applicant’s amendments. The examiner addresses Applicant’s main arguments below.
Regarding the drawing objection, the amendment filed on 7/6/2022 addresses the issue.  As a result, the drawing objection is withdrawn.
Regarding the 35 U.S.C. 112(a) rejection, the amendment filed on 7/6/2022 addresses the issue.  As a result, the 35 U.S.C. 112(a) rejection is withdrawn.
Regarding the 35 U.S.C. 112(b) rejection, the amendment filed on 7/21/2020 addresses several issues.  As a result, the 35 U.S.C. 112(b) rejections for several issues are withdrawn.

Objections 
Claim 16 is objected to. Claim 16 did not exist in the application as originally filed on 3/13/2018. As a result, claim 16 should be labeled as (New) in the amendment filed on 7/6/2022. Appropriate correction is required. Please see 37 CFR 1.121.

Claim Rejection – 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. 112(a): 
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention. 
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112: 
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode contemplated by the inventor of carrying out his invention.

The following is a quotation of 35 U.S.C. 112(b): 
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of pre-AIA  35 U.S.C. 112, second paragraph: 
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 3-6, 8-12, and 14-17 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, as containing new matter. The amended claims include the following claim limitation: “when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, performing transfer learning on the one or more specified pretrained deep learning models, wherein performing the transfer learning comprises training a final layer of each of the one or more specified pretrained deep learning models, and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero”. It is noted that the specification discloses several times that “when the target domain is not completely covered by the source domain associated with a pretrained deep learning model of the one or more pretrained deep learning models, perform transfer learning on the pretrained deep learning model” [para. 0005-0007, 0100]. However, nowhere does the specification teach that “when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, performing transfer learning on the one or more specified pretrained deep learning models”. In fact, the specification uses the word “specified” only once, and not in the relevant context.
As a result, the claim limitation “when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, performing transfer learning on the one or more specified pretrained deep learning models, wherein performing the transfer learning comprises training a final layer of each of the one or more specified pretrained deep learning models, and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero” is new matter, which is not described in the application as originally filed. The new matter is required to be canceled from the claims. Please see MPEP 608.04. Since the term “specified pretrained deep learning models” is not supported by the specification, it is given no patentable weight in this Office action.
Claims 6, 8-11, 19, and 22 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, as containing new matter. Amended claims 6 and 19 include “receiving, at a processing device, one or more pretrained deep learning models”. It is noted that the specification mentions “processing device” only once, in paragraph [0032]. However, the specification does not state that the processing device receives one or more pretrained deep learning models. Hence, the claim limitation “receiving, at a processing device, one or more pretrained deep learning models” is new matter, which is not described in the application as originally filed. As a result, claims 6 and 19 and their dependent claims are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, as containing new matter. The new matter is required to be canceled from the claims (please see MPEP 608.04).
Claims 6, 8-11, 19, and 22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. It is not clear from the claim language whether a person, a device, or a combination of them performs all of the operations in claims 6 and 19. As a result, claims 6 and 19 and their dependent claims are indefinite and are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA  35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 103(c) and potential pre-AIA 35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA 35 U.S.C. 103(a).
Claims 1, 6, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Baradel (US Patent 10,289,909 B2), (“Baradel”), in view of Malpani et al. (US Patent 9,753,949 B1), (“Malpani”), and further in view of Chatwin et al. (US Patent Application Publication 2018/0211303 A1), (“Chatwin”).
Regarding claim 1, Baradel meets the claim limitations as follows.
An apparatus (i.e. an apparatus) [Baradel: col. 2, line 15-16] for labeling of image data (i.e. generate one or more labels for each unlabeled target image of the one or more images associated with the target domain and classify each one of the one or more images in the target domain using the one or more labels) [Baradel: col. 2, line 10-14], the apparatus (i.e. an apparatus) [Baradel: col. 2, line 15-16] comprising: a processor (i.e. an apparatus comprising a processor) [Baradel: col. 2, line 15-16]; and a memory containing instructions that when executed by the processor, cause the apparatus to (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19]: receive one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), each pretrained deep learning model ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]) associated with a source domain (i.e. trains a conditional maximum mean discrepancy (CMMD) engine based on a difference between the one or more source domain features and the one or more target domain features) [Baradel: col. 1, line 52-55], receive image data (i.e. receive one or more images) [Baradel: col. 2, line 19] to be labeled (i.e. to generate one or more labels for each unlabeled target image) [Baradel: col. 2, line 10-11], input the received image data to each of the one or more pretrained deep learning models ((i.e. when executed by the processor, cause the processor to perform operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20]; (i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), perform a model adaptation on at least one of the one or more pretrained deep learning models ((i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]; (i.e. the adaptation pipeline) [Baradel: col. 7, line 35-36; Figs. 2-3]), provide (i.e. generate) [Baradel: col. 9, line 10], from each of the one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]), one or more outputs in a target domain (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11], provide (i.e. generate) [Baradel: col. 9, line 10] an ensemble output (i.e. to generate an output of one or more target labels 216₁ to 216ₙ (hereinafter referred to individually as a target label 216 or collectively as target labels 216)) [Baradel: col. 5, line 52-57; Fig. 2], the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11], and when the target domain is not completely covered ((i.e. In domain adaptation, the most challenging is the unsupervised case when labeled instances are not available in the target domain) [Baradel: col. 1, line 23-25]; (i.e. when target labels are unavailable) [Baradel: col. 1, line 32]) by the source domain (i.e. the processor to perform operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20] associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models ((i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]), perform transfer learning on the one or more specified pretrained deep learning models (i.e. Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16],
wherein the instructions that when executed cause the apparatus to perform (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] the transfer learning  (i.e. Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16] comprise instructions that when executed cause the apparatus to (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] train a final layer of each of the one or more specified pretrained deep learning models, and 
wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.   
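For illustration only, the “ensemble output” limitation recited above can be pictured as a combination of the labels produced by each model. The Python sketch below assumes a simple majority vote; neither the claim language quoted in this Office action nor the cited passages of Baradel specify this combination rule, and the function name ensemble_labels is hypothetical.

```python
from collections import Counter


def ensemble_labels(per_model_outputs):
    """Combine per-model label predictions by majority vote (an assumed rule).

    per_model_outputs[m][i] is the label that model m assigns to image i.
    When two labels tie, the one seen first among the models is returned.
    """
    if not per_model_outputs:
        raise ValueError("at least one model output is required")
    n_images = len(per_model_outputs[0])
    if any(len(out) != n_images for out in per_model_outputs):
        raise ValueError("every model must label the same number of images")

    labels = []
    for i in range(n_images):
        # Count the votes that the models cast for image i.
        votes = Counter(out[i] for out in per_model_outputs)
        labels.append(votes.most_common(1)[0][0])
    return labels


# Three hypothetical models labeling two images.
outputs = [["cat", "dog"], ["cat", "cat"], ["dog", "dog"]]
combined = ensemble_labels(outputs)
```

Other combination rules, such as averaging per-class scores, would fit the same claim wording equally well; majority voting is chosen here only because it is the shortest to state.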
Baradel does not explicitly disclose the following claim limitations (Emphasis added).
An apparatus for labeling of image data, the apparatus comprising: a processor; and a memory containing instructions that when executed by the processor, cause the apparatus to: receive one or more pretrained deep learning models, each pretrained deep learning model associated with a source domain, receive image data to be labeled, input the received image data to each of the one or more pretrained deep learning models, perform a model adaptation on at least one of the one or more pretrained deep learning models, provide, from each of the one or more pretrained deep learning models, one or more outputs in a target domain, provide an ensemble output, the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models, and when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, perform transfer learning on the one or more specified pretrained deep learning models, 
wherein the instructions that when executed cause the apparatus to perform the transfer learning comprise instructions that when executed cause the apparatus to train a final layer of each of the one or more specified pretrained deep learning models, and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.  
However, in the same field of endeavor, Malpani discloses the deficient claim limitations, as follows:
to train a final layer of each of the one or more specified pretrained deep learning models ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained), and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
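For illustration only, the claim limitation “a learning rate for one or more convolutional layers of the model is set to zero” can be pictured as a per-layer gradient-descent update in which the convolutional weights are left unchanged while the final layer is trained. The Python sketch below is an assumption-laden toy: the layer names, weights, gradients, and the helper sgd_step are hypothetical and are not taken from the application or from Malpani.

```python
def sgd_step(params, grads, learning_rates):
    """Apply one gradient-descent update with a separate learning rate per layer."""
    return {
        name: [w - learning_rates[name] * g for w, g in zip(weights, grads[name])]
        for name, weights in params.items()
    }


# Hypothetical two-layer model; all numbers are illustrative only.
params = {"conv1": [0.5, -0.2], "final": [0.1, 0.3]}
grads = {"conv1": [1.0, 1.0], "final": [1.0, -1.0]}

# A learning rate of zero for the convolutional layer leaves its weights
# unchanged; only the final layer is trained.
learning_rates = {"conv1": 0.0, "final": 0.1}

updated = sgd_step(params, grads, learning_rates)
```

After the step, the values stored under "conv1" are identical to the starting values, while the values stored under "final" have moved against their gradients.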
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel with those of Malpani to program the system to implement Malpani’s method.
Therefore, the combination of Baradel with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16]. 
In addition, in the same field of endeavor Chatwin further discloses the transfer learning operation as follows:
when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models ((i.e. In a typical transfer learning setting, the source domain is assumed to have a large amount of labeled data and is well understood. The target domain can comprise a small budget of labeled data and is not well understood. As such, it is necessary for the predictor to adapt to solve the problem of classifying the target domain) [Chatwin:  para 0047]; (i.e. In a target domain, very little labeled data is available, and labels can only be attained at a cost. Therefore, leveraging the labeled source data can be used to acquire as few labels as possible from the target data in order to adapt a statistical model to perform well on the target domain. This process can be referred to as transfer learning or supervised domain adaptation) [Chatwin: para 0052]), perform transfer learning on the one or more specified pretrained deep learning model (i.e. Various active learning techniques can be used to actively sample data to perform transfer learning between domains.) [Chatwin:  para 0049].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel and Malpani with those of Chatwin to program the system to implement the transfer learning scheme.
Therefore, the combination of Baradel and Malpani with Chatwin will enable the system to adapt to solve the problem of classifying the target domain [Chatwin: para 0047].

Regarding claim 6, Baradel meets the claim limitations as follows.
A method (i.e. a method) [Baradel: col. 1, line 44] for labeling of image data (i.e. generate one or more labels for each unlabeled target image of the one or more images associated with the target domain and classify each one of the one or more images in the target domain using the one or more labels) [Baradel: col. 2, line 10-14], the method (i.e. a method) [Baradel: col. 1, line 44] comprising: receiving (i.e. receive) [Baradel: col. 2, line 19], at a processing device (i.e. Another disclosed feature of the embodiments is an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that receive) [Baradel: col. 2, line 15-19], one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), each pretrained deep learning model ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]) associated with a source domain (i.e. trains a conditional maximum mean discrepancy (CMMD) engine based on a difference between the one or more source domain features and the one or more target domain features) [Baradel: col. 1, line 52-55], receiving image data (i.e. receive one or more images) [Baradel: col. 2, line 19] to be labeled (i.e. to generate one or more labels for each unlabeled target image) [Baradel: col. 2, line 10-11], inputting the received image data to each of the one or more pretrained deep learning models ((i.e. when executed by the processor, cause the processor to perform operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20]; (i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), performing a model adaptation on at least one or more pretrained deep learning models ((i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]; (i.e. the adaptation pipeline) [Baradel: col. 7, line 35-36; Figs. 2-3]), providing (i.e. generate) [Baradel: col. 9, line 10], from each of the one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]), one or more outputs in a target domain (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11]; providing (i.e. generate) [Baradel: col. 9, line 10] an ensemble output (i.e. to generate an output of one or more target labels 216₁ to 216ₙ (hereinafter referred to individually as a target label 216 or collectively as target labels 216)) [Baradel: col. 5, line 52-57; Fig. 2], the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11], and when the target domain is not completely covered ((i.e. In domain adaptation, the most challenging is the unsupervised case when labeled instances are not available in the target domain) [Baradel: col. 1, line 23-25]; (i.e. when target labels are unavailable) [Baradel: col. 1, line 32]) by the source domain (i.e. the processor to perform operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20] associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models ((i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]), performing transfer learning on the one or more specified pretrained deep learning models (i.e. Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16];
wherein performing (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] the transfer learning (i.e. Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16] comprises training a final layer of each of the one or more specified pretrained deep learning models, and
wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.   
Baradel does not explicitly disclose the following claim limitations (Emphasis added).
A method for labeling of image data, the method comprising: receiving, at a processing device, one or more pretrained deep learning models, each pretrained deep learning model associated with a source domain; receiving image data to be labeled; inputting the received image data to each of the one or more pretrained deep learning models; performing an adaptation on one or more of the pretrained deep learning models; providing, from each of the one or more pretrained deep learning models, one or more outputs in a target domain; providing an ensemble output, the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models; and when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, performing transfer learning on the one or more specified pretrained deep learning models, wherein performing the transfer learning comprises training a final layer of each of the one or more specified pretrained deep learning models, and
wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.   
However, in the same field of endeavor, Malpani discloses the deficient claim limitations, as follows:
training a final layer of each of the one or more specified pretrained deep learning models ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained), and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel with Malpani to program the system to implement Malpani’s method.  
Therefore, the combination of Baradel with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16].
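For orientation only, the limitation "a learning rate for one or more convolutional layers of the model is set to zero" during training of the final layer can be pictured with the minimal sketch below. It assumes plain stochastic gradient descent with one learning rate per layer; the function name `sgd_step`, the layer names, and all numeric values are hypothetical and are not taken from the claims or from any cited reference.

```python
def sgd_step(params, grads, learning_rates):
    """Apply one SGD update using a separate learning rate for each layer."""
    return {
        name: params[name] - learning_rates[name] * grads[name]
        for name in params
    }

# Hypothetical one-weight-per-layer model (illustrative values only).
params = {"conv": 0.5, "final": -0.2}
grads = {"conv": 1.0, "final": 2.0}

# Assumption: the convolutional layer's learning rate is zero, so only the
# final layer's weight can change during this update.
learning_rates = {"conv": 0.0, "final": 0.1}

updated = sgd_step(params, grads, learning_rates)
```

Under these assumptions the convolutional weight is left unchanged by the update while the final-layer weight moves against its gradient.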
In addition, in the same field of endeavor Chatwin further discloses the transfer learning operation as follows:
when the target domain is not completely covered by the source domain associated with a pretrained deep learning model of the one or more pretrained deep learning models ((i.e. In a typical transfer learning setting, the source domain is assumed to have a large amount of labeled data and is well understood. The target domain can comprise a small budget of labeled data and is not well understood. As such, it is necessary for the predictor to adapt to solve the problem of classifying the target domain) [Chatwin: para 0047]; (i.e. In a target domain, very little labeled data is available, and labels can only be attained at a cost. Therefore, leveraging the labeled source data can be used to acquire as few labels as possible from the target data in order to adapt a statistical model to perform well on the target domain. This process can be referred to as transfer learning or supervised domain adaptation) [Chatwin:  para 0052]), perform transfer learning on the pretrained deep learning model (i.e. Various active learning techniques can be used to actively sample data to perform transfer learning between domains.) [Chatwin:  para 0049].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel and Malpani with Chatwin to program the system to implement the transfer learning scheme.  
Therefore, the combination of Baradel and Malpani with Chatwin will enable the system to adapt to solve the problem of classifying the target domain [Chatwin: para 0047]. 
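As a purely illustrative reading of the condition "when the target domain is not completely covered by the source domain", the sketch below treats each domain as a set of class labels and triggers transfer learning when some target label is absent from the source. This set-inclusion interpretation, the function name `needs_transfer_learning`, and the example labels are assumptions made for illustration; neither the claims nor the cited references define coverage this way.

```python
def needs_transfer_learning(source_labels, target_labels):
    """Return True when at least one target label is missing from the source."""
    return not set(target_labels) <= set(source_labels)

# Hypothetical label sets for a pretrained model and two target domains.
source = {"cat", "dog"}
partly_covered_target = {"cat", "bird"}   # "bird" is not in the source domain
fully_covered_target = {"dog"}

trigger = needs_transfer_learning(source, partly_covered_target)
no_trigger = needs_transfer_learning(source, fully_covered_target)
```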

Regarding claim 12, Baradel meets the claim limitations as follows.
A non-transitory computer-readable medium including program code that, when executed by a processor, causes an apparatus to (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19]:receive one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), each pretrained deep learning model ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]) associated with a source domain (i.e. trains a conditional maximum mean discrepancy (CMMD) engine based on a difference between the one or more source domain features and the one or more target domain features) [Baradel: col. 1, line 52-55], receive image data (i.e. receive one or more images) [Baradel: col. 2, line 19] to be labeled (i.e. to generate one or more labels for each unlabeled target image) [Baradel: col. 2, line 10-11],
input the received image data to each of the one or more pretrained deep learning models ((i.e. when executed by the processor, cause the processor to perform
operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20]; (i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5];  (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), perform a model adaptation on at least one of the one or more pretrained deep learning models ((i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]; (i.e. the adaptation pipeline) [Baradel: col. 7, line 35-36; Figs. 2-3]), provide (i.e. generate) [Baradel: col. 9, line 10], from each of the one or more pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. trains a conditional maximum mean discrepancy (CMMD) engine) [Baradel: col. 1, line 52-53]), one or more outputs in a target domain (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11], provide (i.e. generate) [Baradel: col. 9, line 10] an ensemble output (i.e. 
to generate an output of one or more target labels 216₁ to 216ₙ (hereinafter referred to individually as a target label 216 or collectively as target labels 216)) [Baradel: col. 5, line 52-57; Fig. 2], the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models (i.e. The CMMD engine 212 may then apply the shift or the differences to generate an output of one or more source labels 214 and one or more target labels 216) [Baradel: col. 9, line 9-11], and when the target domain is not completely covered ((i.e. In domain adaptation, the most challenging is the unsupervised case when labeled instances are not available in the target domain) [Baradel: col. 1, line 23-25]; (i.e. when target labels are unavailable) [Baradel: col. 1, line 32]) by the source domain (i.e. the processor to perform operations that receive one or more images associated with a source domain) [Baradel: col. 2, line 17-20] associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models ((i.e. In one embodiment, the CAN 300 may receive one or more images 114 from the source domain 110) [Baradel: col. 8, line 19-20]; (i.e. the CAN 200 may use any number of deep neural networks 206 depending on the amount of training or learning that is desired at a cost of efficiency) [Baradel: col. 5, line 2-5];  (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source) [Baradel: col. 3, line 23-29; Figs. 2-3]), perform transfer learning on the one or more specified pretrained deep learning models (i.e. 
Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16], 
wherein the program code that when executed causes the apparatus to perform (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] the transfer learning (i.e. Multiple shallow transfer learning methods bridge the source and target domains by learning invariant feature representations or estimating the instance importance without using target labels) [Baradel: col. 1, line 13-16] comprises program code that when executed causes the apparatus (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] to train a final layer of each of the one or more specified pretrained deep learning models, and 
wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.   
Baradel does not explicitly disclose the following claim limitations (Emphasis added).
A non-transitory computer-readable medium including program code that, when executed by a processor, causes an apparatus to: receive one or more pretrained deep learning models, each pretrained deep learning model associated with a source domain, receive image data to be labeled, input the received image data to each of the one or more pretrained deep learning models, perform a model adaptation on at least one of the one or more pretrained deep learning models, provide, from each of the one or more pretrained deep learning models, one or more outputs in a target domain, provide an ensemble output, the ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models, and when the target domain is not completely covered by the source domain associated with one or more specified pretrained deep learning models of the one or more pretrained deep learning models, perform transfer learning on the one or more specified pretrained deep learning models, wherein the program code that when executed causes the apparatus to perform the transfer learning comprises program code that when executed causes the apparatus to train a final layer of each of the one or more specified pretrained deep learning models, and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero.  
However, in the same field of endeavor Malpani further discloses the claim limitations and the deficient claim limitations, as follows:
to train a final layer of each of the one or more specified pretrained deep learning models ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained), and wherein, during the training of the final layer of each of the one or more specified pretrained deep learning models, a learning rate for one or more convolutional layers of the model is set to zero (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel with Malpani to program the system to implement Malpani’s method.  
Therefore, the combination of Baradel with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16].
In addition, in the same field of endeavor Chatwin further discloses the transfer learning operation as follows:
when the target domain is not completely covered by the source domain associated with a pretrained deep learning model of the one or more pretrained deep learning models ((i.e. In a typical transfer learning setting, the source domain is assumed to have a large amount of labeled data and is well understood. The target domain can comprise a small budget of labeled data and is not well understood. As such, it is necessary for the predictor to adapt to solve the problem of classifying the target domain) [Chatwin: para 0047]; (i.e. In a target domain, very little labeled data is available, and labels can only be attained at a cost. Therefore, leveraging the labeled source data can be used to acquire as few labels as possible from the target data in order to adapt a statistical model to perform well on the target domain. This process can be referred to as transfer learning or supervised domain adaptation) [Chatwin:  para 0052]), perform transfer learning on the pretrained deep learning model (i.e. Various active learning techniques can be used to actively sample data to perform transfer learning between domains.) [Chatwin:  para 0049].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel and Malpani with Chatwin to program the system to implement the transfer learning scheme.  
Therefore, the combination of Baradel and Malpani with Chatwin will enable the system to adapt to solve the problem of classifying the target domain [Chatwin: para 0047]. 
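For illustration, the recited "ensemble output comprising labels for the image data determined based on the one or more outputs from each of the one or more pretrained deep learning models" can be pictured as a vote over per-model labels. Majority voting is only one possible combination rule and is assumed here; the claims and the cited references do not specify it, and the function name `ensemble_label` and the example labels are hypothetical.

```python
from collections import Counter

def ensemble_label(model_outputs):
    """Return the label predicted by the most models (simple majority vote)."""
    counts = Counter(model_outputs)
    return counts.most_common(1)[0][0]

# Hypothetical labels produced by three pretrained models for one image.
outputs = ["kitchen", "office", "kitchen"]
label = ensemble_label(outputs)
```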

Claims 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Baradel (US Patent 10,289,909 B2), (“Baradel”), in view of Malpani et al. (US Patent 9,753,949 B1), (“Malpani”), in view of Chatwin et al. (US Patent Application Publication 2018/0211303 A1), (“Chatwin”), in view of Banadaki et al. (US Patent 10,152,557 B2), (“Banadaki”).
Regarding claim 11, Baradel, Malpani, and Chatwin meet the claim limitations as set forth in claim 6. Baradel further meets the claim limitations as follows.
The method of claim 6 (i.e. a method) [Baradel: col. 1, line 44], further comprising: updating a bipartite graph of elements of the target domain at one or more crowd-worker devices (i.e. one or more images in the target domain using the one or more labels) [Baradel: col. 2, line 32-33].   
Baradel, Malpani, and Chatwin do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 6, further comprising: updating a bipartite graph of elements of the target domain at one or more crowd-worker devices.    
However, in the same field of endeavor Banadaki further discloses the claim limitations and the deficient claim limitations, as follows:
updating a bipartite graph (i.e. Graph system 100 may be in communication with client(s) 180 over network 160. Clients 180 may allow a user to submit queries to the similarity ranking engine 110, to maintain bipartite graph 140, to schedule creation of weighted category subgraphs 150, etc. Network 160 may be, for example, the Internet, or the network 160 can be a wired or wireless local area network (LAN), wide area network (WAN), etc., implemented using, for example, gateway devices, bridges, switches, and/or so forth. Via the network 160, the graph system 100 may communicate with and transmit data to/from clients 180. In some implementations, graph system 100 may be in communication with or include other computing devices that provide updates to the bipartite graph) [Banadaki: col. 6, line 6-20].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel, Malpani, and Chatwin with Banadaki to connect the apparatus with the graph system.  
Therefore, the combination of Baradel, Malpani, and Chatwin with Banadaki will enable the graph system to provide updates to the bipartite graph [Banadaki: col. 6, line 6-20]. 
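By way of a hedged illustration, "updating a bipartite graph of elements of the target domain" can be pictured as adding edges between two disjoint sets of nodes. The sketch below assumes the two sets are images and labels; that choice, the class name `BipartiteGraph`, and the node names are hypothetical and do not reflect the data structure of Banadaki or of the application.

```python
from collections import defaultdict

class BipartiteGraph:
    """Edges only connect a node on the left side to a node on the right side."""

    def __init__(self):
        self.left_to_right = defaultdict(set)
        self.right_to_left = defaultdict(set)

    def add_edge(self, left, right):
        """Update the graph with one left-right association."""
        self.left_to_right[left].add(right)
        self.right_to_left[right].add(left)

# Hypothetical updates, e.g. as might be submitted from client devices.
graph = BipartiteGraph()
graph.add_edge("image_1", "kitchen")
graph.add_edge("image_2", "kitchen")
graph.add_edge("image_1", "indoor")
```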

Regarding claim 17, Baradel, Malpani, and Chatwin meet the claim limitations as set forth in claim 12. Baradel further meets the claim limitations as follows.
The non-transitory computer-readable medium of claim 12 (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19], further including program code that, when executed by the processor, causes the apparatus to (i.e. an apparatus comprising a processor and a computer-readable medium storing a plurality of instructions which, when executed by the processor, cause the processor to perform operations that) [Baradel: col. 2, line 15-19] update a bipartite graph of elements of the target domain at one or more crowd-worker devices (i.e. one or more images in the target domain using the one or more labels) [Baradel: col. 2, line 32-33].   
Baradel, Malpani, and Chatwin do not explicitly disclose the following claim limitations (Emphasis added).
The non-transitory computer-readable medium of claim 12, further including program code that, when executed by the processor, causes the apparatus to update a bipartite graph of elements of the target domain at one or more crowd-worker devices.    
However, in the same field of endeavor Banadaki further discloses the claim limitations and the deficient claim limitations, as follows:
update a bipartite graph (i.e. Graph system 100 may be in communication with client(s) 180 over network 160. Clients 180 may allow a user to submit queries to the similarity ranking engine 110, to maintain bipartite graph 140, to schedule creation of weighted category subgraphs 150, etc. Network 160 may be, for example, the Internet, or the network 160 can be a wired or wireless local area network (LAN), wide area network (WAN), etc., implemented using, for example, gateway devices, bridges, switches, and/or so forth. Via the network 160, the graph system 100 may communicate with and transmit data to/from clients 180. In some implementations, graph system 100 may be in communication with or include other computing devices that provide updates to the bipartite graph) [Banadaki: col. 6, line 6-20].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Baradel, Malpani, and Chatwin with Banadaki to connect the apparatus with the graph system.  
Therefore, the combination of Baradel, Malpani, and Chatwin with Banadaki will enable the graph system to provide updates to the bipartite graph [Banadaki: col. 6, line 6-20]. 

Claims 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Morris et al. (US Patent 10,223,067 B2), (“Morris”), in view of Malpani et al. (US Patent 9,753,949 B1), (“Malpani”), in view of Hsiao (US Patent 10,540,378 B1), (“Hsiao”).

Regarding claim 18, Morris meets the claim limitations as follows.
An apparatus (i.e. system) [Morris: col. 7, line 4] comprising: a processor (i.e. one or more processors) [Morris: col. 21, line 4]; and a memory containing instructions that, when executed by the processor, cause the apparatus to (i.e. computer-readable media having stored thereon computer-executable instructions that, when executed, configure the processors to perform operations) [Morris: col. 21, line 5-7]: receive (i.e.  receiving an input indicative of a selection from the user) [Morris: col. 21, line 43], from a user terminal ((i.e. terminals) [Morris: col. 8, line 25]; (i.e.  human-machine interface) [Morris: col. 21, line 46]; (i.e. a camera, a microphone, a network interface, one or more sensors configured to monitor movements of the user) [Morris: col. 21, line 19-21]), a control input associated with an intent ((i.e.  receiving an input indicative of a selection from the user) [Morris: col. 21, line 43]; (i.e.  an activity identifier inferred from motions of the user received from the one or more sensors) [Morris: col. 21, line 24-25]), L:\SAMS12\00124- 45 -DOCKET NO. MPS17-BX05 (SAMS12-00124)PATENT obtain (i.e.  obtain a portion of the contextual data) [Morris: col. 13, line 12-13] location data associated with a location of the user terminal ((i.e.  operation system information or information provided by the operation system (e.g., a time of day, device and/or software information); a location of the device 210 (e.g., a geo-location corresponding to an IP address of the device 210, a GPS location of the device 210)) [Morris: col. 13, line 29-35]; (i.e.  the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 
12, line 61-67]), determine (i.e.  the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col. 13, line 15-20-22] a scored set of execution options associated with the control input ((i.e.  In some examples, the area 404 of the GUI 400 can include an N number of spaces for representing suggestions. In some examples, the techniques discussed herein can leverage contextual data to weight, sort, rank, and/or filter the suggestions to reduce a number of the suggestions greater than N to N suggestions. In some examples, the techniques herein can include providing an option to the user to view more suggestions) [Morris: col. 15, line 42-49]; (i.e.  the techniques discussed herein can use weighting, sorting, ranking, and/or filtering to determine a location in a display to display more weighted suggestions since different portions of a display are more prime because users are more likely to look at them and/or to determine a subset of suggestions to represent when not all of the suggestions can be represented at once due to display space,  understandability, settings based on a user's cognitive abilities, a duration of time)) [Morris: col. 6, line 22-31]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]), obtain a contextual label (i.e.  obtain a portion of the contextual data) [Morris: col. 13, line 12-13] associated with the location data (i.e.  the environment detection service can include machine learning (e.g., a classifier, deep learning model, convolutional deep neural network) that takes contextual data as an input) [Morris: col. 11, line 29-32], the contextual label (i.e. the techniques discussed herein can generate words and/or phrases, based at least in part on the contextual data, to provide to the user for selection by the user and/or to be output on behalf of the use) [Morris: col. 
3, line 18-21]) determined based on an application of one or more adapted pretrained deep learning models ((i.e.  the suggestion generator 122 can start with a set of heuristic phrases which the suggestion generator 122 can augment using machine learning (e.g., by a deep learning network, Naive Bayes classifier, directed graph) using data regarding the suggestion selection activity and/or utterance patterns of one or more users (e.g., by accessing past utterances of a user, by accessing past utterances and/or selections of a multiplicity of users stored on a cloud service). In some examples, the suggestion generator 122 can include human services (e.g., providing via network interface(s) 130 contextual data and/or generated words to a human for suggestion generation)) [Morris: col. 21, line 4]; (i.e.  outputs an environment identifier based at least in part on co-occurrence(s) of one or more discrete elements of contextual data. For example, if the contextual data includes the salient object labels) [Morris: col. 11, line 32-35]) to the location data (i.e.  the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 12, line 61-67], the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options,rescore (i.e.  the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col. 13, line 15-20-22] the scored set of execution options associated with the control input (i.e. 
In some examples, the techniques herein can include providing an option to the user to view more suggestions. In that example, the GUI 400 can refresh the suggestions 406 with another N suggestions according to the weighting, sorting, ranking, and/or filtering. In some examples, as FIG. 4 illustrates, the contextual data can include a selection 412 (i.e., selection of a representation of the letter "D") of a portion of the keyboard 402. The example suggestions depicted in FIG. 4 include suggestions that appear as a result of using this selection 412 (which can be part of context data obtained by the techniques described herein) to weight, sort, rank, and/or filter generated suggestions. For example, suggestions such as suggestion 406(2) and 406(4) can be sentence completions that start a same letter (or letter combination in some examples). In some examples, the keyboard selection 412 can otherwise be used to weight, sort, rank, and/or filter generated suggestions) [Morris: col. 15, line 47-63] based on the contextual label (i.e.  In some examples the ECTF service(s) 126 (and/or the suggestion generator 124) can additionally or alternatively sort, rank, and/or filter the generated words and/or phrases based at least in part on contextual data) [Morris: col. 12, line 4], and provide (i.e. provide to the user for selection by the user) [Morris: col. 1, line 43-44] a highest-scored execution option ((i.e. greatest relevance) [Morris: col. 5, line 46] ; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]) to a processor of the user terminal ((i.e.   the ECTF 202 can be communicatively coupled to a human-machine interface such that the ECTF 202 can provide suggestions to a user for selection) [Morris: col. 12, line 51-53; Fig. 11]; (i.e.  The GUI 300 depicted in FIG. 
3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]). 
Morris does not explicitly disclose the following claim limitations (Emphasis added).
An apparatus comprising: a processor; and a memory containing instructions that, when executed by the processor, cause the apparatus to: receive, from a user terminal, a control input associated with an intent, obtain location data associated with a location of the user terminal, determine a scored set of execution options associated with the control input, obtain a contextual label associated with the location data, the contextual label determined based on application of one or more adapted pretrained deep learning models to the location data, the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options, rescore the scored set of execution options associated with the control input based on the contextual label, and provide a highest-scored execution option to a processor of the user terminal. 
However, in the same field of endeavor Malpani further discloses the claim limitations and the deficient claim limitations, as follows:
the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained) trained on a data set containing classifiers of a target domain of execution options (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris with Malpani to program the system to implement Malpani’s method.  
Therefore, the combination of Morris with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16]. 
Morris and Malpani do not explicitly disclose the following claim limitations (Emphasis added).
provide a highest-scored execution option to a processor of the user terminal.
However, in the same field of endeavor Hsiao further discloses the claim limitations and the deficient claim limitations, as follows:
provide the highest scored execution option (i.e. The search suggestion with the highest relevancy score) [Hsiao: col. 9, line 10-11].
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris and Malpani with Hsiao to program the system to implement the visual search using the relevancy function of Hsiao.  
Therefore, the combination of Morris and Malpani with Hsiao will enable the system to provide highest relevancy suggestions to users [Hsiao: col. 9, line 10-11]. 
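For illustration only, the recited sequence of rescoring a scored set of execution options based on a contextual label and then providing the highest-scored option can be sketched as below. The additive boost, the tag sets, the option names, and the function name `rescore` are all assumptions introduced for this sketch; they are not drawn from the claims, Morris, Malpani, or Hsiao.

```python
def rescore(scored_options, contextual_label, boost=0.5):
    """Add a fixed (hypothetical) boost to options tagged with the label."""
    return {
        option: score + (boost if contextual_label in tags else 0.0)
        for option, (score, tags) in scored_options.items()
    }

# Hypothetical scored set: option -> (initial score, associated context tags).
options = {
    "order_groceries": (0.4, {"kitchen", "store"}),
    "play_music": (0.6, {"living_room"}),
}

rescored = rescore(options, "kitchen")
# The highest-scored option after rescoring is the one provided.
best = max(rescored, key=rescored.get)
```

In this sketch the contextual label changes the ranking: the option that initially scored lower becomes the highest-scored option once the label is applied.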

Regarding claim 19, Morris meets the claim limitations as follows.
A method (i.e. a method) [Morris: col. 25, line 3] comprising: receiving (i.e. receiving an input indicative of a selection from the user) [Morris: col. 21, line 43], at a processing device (i.e. computing device such as one or more separate processor device(s), such as CPU-type processors (e.g., micro-processors)) [Baradel: col. 8, line 30-32], from a user terminal ((i.e. terminals) [Morris: col. 8, line 25]; (i.e. human-machine interface) [Morris: col. 21, line 46]; (i.e. a camera, a microphone, a network interface, one or more sensors configured to monitor movements of the user) [Morris: col. 21, line 19-21]), a control input associated with an intent ((i.e. receiving an input indicative of a selection from the user) [Morris: col. 21, line 43]; (i.e. an activity identifier inferred from motions of the user received from the one or more sensors) [Morris: col. 21, line 24-25]), obtaining (i.e. obtain a portion of the contextual data) [Morris: col. 13, line 12-13] location data associated with a location of the user terminal ((i.e. operation system information or information provided by the operation system (e.g., a time of day, device and/or software information); a location of the device 210 (e.g., a geo-location corresponding to an IP address of the device 210, a GPS location of the device 210)) [Morris: col. 13, line 29-35]; (i.e. the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 12, line 61-67]), determining (i.e. the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col.
13, line 15-20-22] a scored set of execution options associated with the control input ((i.e.  In some examples, the area 404 of the GUI 400 can include an N number of spaces for representing suggestions. In some examples, the techniques discussed herein can leverage contextual data to weight, sort, rank, and/or filter the suggestions to reduce a number of the suggestions greater than N to N suggestions. In some examples, the techniques herein can include providing an option to the user to view more suggestions) [Morris: col. 15, line 42-49]; (i.e.  the techniques discussed herein can use weighting, sorting, ranking, and/or filtering to determine a location in a display to display more weighted suggestions since different portions of a display are more prime because users are more likely to look at them and/or to determine a subset of suggestions to represent when not all of the suggestions can be represented at once due to display space,  understandability, settings based on a user's cognitive abilities, a duration of time)) [Morris: col. 6, line 22-31]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]), obtaining a contextual label (i.e.  obtain a portion of the contextual data) [Morris: col. 13, line 12-13] associated with the location data (i.e.  the environment detection service can include machine learning (e.g., a classifier, deep learning model, convolutional deep neural network) that takes contextual data as an input) [Morris: col. 11, line 29-32], the contextual label (i.e. the techniques discussed herein can generate words and/or phrases, based at least in part on the contextual data, to provide to the user for selection by the user and/or to be output on behalf of the use) [Morris: col. 3, line 18-21]) determined based on an application of one or more adapted pretrained deep learning models ((i.e.  
the suggestion generator 122 can start with a set of heuristic phrases which the suggestion generator 122 can augment using machine learning (e.g., by a deep learning network, Naive Bayes classifier, directed graph) using data regarding the suggestion selection activity and/or utterance patterns of one or more users (e.g., by accessing past utterances of a user, by accessing past utterances and/or selections of a multiplicity of users stored on a cloud service). In some examples, the suggestion generator 122 can include human services (e.g., providing via network interface(s) 130 contextual data and/or generated words to a human for suggestion generation)) [Morris: col. 21, line 4]; (i.e.  outputs an environment identifier based at least in part on co-occurrence(s) of one or more discrete elements of contextual data. For example, if the contextual data includes the salient object labels) [Morris: col. 11, line 32-35]) to the location data (i.e.  the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 12, line 61-67], the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options,rescoring (i.e.  the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col. 13, line 15-20-22] the scored set of execution options associated with the control input (i.e. In some examples, the techniques herein can include providing an option to the user to view more suggestions. 
In that example, the GUI 400 can refresh the suggestions 406 with another N suggestions according to the weighting, sorting, ranking, and/or filtering. In some examples, as FIG. 4 illustrates, the contextual data can include a selection 412 (i.e., selection of a representation of the letter "D") of a portion of the keyboard 402. The example suggestions depicted in FIG. 4 include suggestions that appear as a result of using this selection 412 (which can be part of context data obtained by the techniques described herein) to weight, sort, rank, and/or filter generated suggestions. For example, suggestions such as suggestion 406(2) and 406(4) can be sentence completions that start a same letter (or letter combination in some examples). In some examples, the keyboard selection 412 can otherwise be used to weight, sort, rank, and/or filter generated suggestions) [Morris: col. 15, line 47-63] based on the contextual label (i.e.  In some examples the ECTF service(s) 126 (and/or the suggestion generator 124) can additionally or alternatively sort, rank, and/or filter the generated words and/or phrases based at least in part on contextual data) [Morris: col. 12, line 4], and providing (i.e. provide to the user for selection by the user) [Morris: col. 1, line 43-44] a highest-scored execution option ((i.e. greatest relevance) [Morris: col. 5, line 46]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]) to a processor of the user terminal ((i.e.   the ECTF 202 can be communicatively coupled to a human-machine interface such that the ECTF 202 can provide suggestions to a user for selection) [Morris: col. 12, line 51-53; Fig. 11]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]). 
Morris does not explicitly disclose the following claim limitations (Emphasis added).
A method comprising: receiving, at a processing device, from a user terminal, a control input associated with an intent, obtaining location data associated with a location of the user terminal, determining a scored set of execution options associated with the control input, obtaining a contextual label associated with the location data, the contextual label determined based on an application of one or more adapted pretrained deep learning models to the location data, the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options, rescoring the scored set of execution options associated with the control input based on the contextual label, and providing a highest-scored execution option to a processor of the user terminal. 
However, in the same field of endeavor Malpani further discloses the claim limitations and the deficient claim limitations, as follows:
the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained) trained on a data set containing classifiers of a target domain of execution options (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris with Malpani to program the system to implement the method of Malpani.
Therefore, the combination of Morris with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16]. 
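For illustration only, the limitation "one or more unchanged convolutional layers and a final fully-connected layer trained on a data set" can be pictured with the minimal Python sketch below. It is not taken from Morris or Malpani. The fixed function `frozen_features` merely stands in for pretrained layers whose weights are left unchanged, and only the final layer (`w`, `b`) is trained; all function names, the toy data, and the logistic-regression training loop are assumptions made for this sketch.

```python
import math

def frozen_features(x):
    # Hypothetical stand-in for pretrained layers that stay unchanged:
    # a fixed ReLU-like transform with no trainable parameters.
    return [max(0.0, x[0] - x[1]), max(0.0, x[1] - x[0])]

def train_final_layer(samples, labels, epochs=200, lr=0.1):
    # Only the final layer (weights w and bias b) is updated.
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid output
            g = p - y                        # gradient of the log loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    f = frozen_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy target-domain data set (assumed): label 1 when the first value is larger.
samples = [(1, 0), (0, 1), (2, 1), (1, 2)]
labels = [1, 0, 1, 0]
w, b = train_final_layer(samples, labels)
predictions = [predict(x, w, b) for x in samples]
```

In this sketch the feature transform is never modified during training, which is the sense in which the earlier layers remain "unchanged" while the final layer is adapted to the target data.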
Morris and Malpani do not explicitly disclose the following claim limitations (Emphasis added).
providing a highest-scored execution option to a processor of the user terminal.
However, in the same field of endeavor Hsiao further discloses the claim limitations and the deficient claim limitations, as follows:
providing a highest-scored execution option (i.e. The search suggestion with the highest relevancy score) [Hsiao: col. 9, line 10-11] 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris and Malpani with Hsiao to program the system to implement the visual search using the relevancy function of Hsiao.
Therefore, the combination of Morris and Malpani with Hsiao will enable the system to provide highest relevancy suggestions to users [Hsiao: col. 9, line 10-11]. 
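For illustration only, the rescoring and providing steps recited in claim 19 can be sketched in Python as follows. The sketch is not drawn from Morris, Malpani, or Hsiao. The option names, the contextual labels, and the multiplicative boost table `label_boosts` are hypothetical assumptions chosen to show one possible way a contextual label could change which execution option scores highest.

```python
def rescore(scored_options, contextual_label, label_boosts):
    # Multiply each option's score by a label-specific factor (assumed scheme);
    # options without an entry for this label keep their original score.
    boosts = label_boosts.get(contextual_label, {})
    return {
        option: score * boosts.get(option, 1.0)
        for option, score in scored_options.items()
    }

def highest_scored(scored_options):
    # Return the option whose score is largest.
    return max(scored_options, key=scored_options.get)

# Hypothetical scored set of execution options for one control input.
options = {"play_music": 0.6, "navigate_home": 0.5}
# Hypothetical boosts keyed by contextual label.
label_boosts = {"car": {"navigate_home": 1.5}}

rescored = rescore(options, "car", label_boosts)
best_in_car = highest_scored(rescored)
best_without_context = highest_scored(rescore(options, "unknown", label_boosts))
```

With the label "car" the boosted option overtakes the originally highest-scored option, while an unrecognized label leaves the original ranking in place.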

Regarding claim 20, Morris meets the claim limitations as follows.
A non-transitory computer-readable medium including program code that, when executed by a processor, causes an apparatus to (i.e. computer-readable media having stored thereon computer-executable instructions that, when executed, configure the processors to perform operations) [Morris: col. 21, line 5-7]: receive (i.e. receiving an input indicative of a selection from the user) [Morris: col. 21, line 43], from a user terminal ((i.e. terminals) [Morris: col. 8, line 25]; (i.e. human-machine interface) [Morris: col. 21, line 46]; (i.e. a camera, a microphone, a network interface, one or more sensors configured to monitor movements of the user) [Morris: col. 21, line 19-21]), a control input associated with an intent ((i.e. receiving an input indicative of a selection from the user) [Morris: col. 21, line 43]; (i.e. an activity identifier inferred from motions of the user received from the one or more sensors) [Morris: col. 21, line 24-25]), obtain (i.e. obtain a portion of the contextual data) [Morris: col. 13, line 12-13] location data associated with a location of the user terminal ((i.e. operation system information or information provided by the operation system (e.g., a time of day, device and/or software information); a location of the device 210 (e.g., a geo-location corresponding to an IP address of the device 210, a GPS location of the device 210)) [Morris: col. 13, line 29-35]; (i.e. the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 12, line 61-67]), determine (i.e.
the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col. 13, line 15-20-22] a scored set of execution options associated with the control input ((i.e.  In some examples, the area 404 of the GUI 400 can include an N number of spaces for representing suggestions. In some examples, the techniques discussed herein can leverage contextual data to weight, sort, rank, and/or filter the suggestions to reduce a number of the suggestions greater than N to N suggestions. In some examples, the techniques herein can include providing an option to the user to view more suggestions) [Morris: col. 15, line 42-49]; (i.e.  the techniques discussed herein can use weighting, sorting, ranking, and/or filtering to determine a location in a display to display more weighted suggestions since different portions of a display are more prime because users are more likely to look at them and/or to determine a subset of suggestions to represent when not all of the suggestions can be represented at once due to display space,  understandability, settings based on a user's cognitive abilities, a duration of time)) [Morris: col. 6, line 22-31]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]), obtain a contextual label (i.e.  obtain a portion of the contextual data) [Morris: col. 13, line 12-13] associated with the location data (i.e.  the environment detection service can include machine learning (e.g., a classifier, deep learning model, convolutional deep neural network) that takes contextual data as an input) [Morris: col. 11, line 29-32], the contextual label determined based on an application of one or more adapted pretrained deep learning models ((i.e.  
the suggestion generator 122 can start with a set of heuristic phrases which the suggestion generator 122 can augment using machine learning (e.g., by a deep learning network, Naive Bayes classifier, directed graph) using data regarding the suggestion selection activity and/or utterance patterns of one or more users (e.g., by accessing past utterances of a user, by accessing past utterances and/or selections of a multiplicity of users stored on a cloud service). In some examples, the suggestion generator 122 can include human services (e.g., providing via network interface(s) 130 contextual data and/or generated words to a human for suggestion generation)) [Morris: col. 21, line 4]; (i.e.  outputs an environment identifier based at least in part on co-occurrence(s) of one or more discrete elements of contextual data. For example, if the contextual data includes the salient object labels) [Morris: col. 11, line 32-35]) to the location data (i.e.  the contextual data 204(1) can include contextual data regarding the physical surroundings of a user such as, for example, one or more of: an image of the surroundings 208 of the user 206; salient object identifiers of salient objects 220(1)-(3) (e.g., an onion 220(1), a banana 220(2), a sale sign 220(3)); a location of the surroundings 208 (e.g., coordinates)) [Morris: col. 12, line 61-67], the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options,rescore (i.e.  the techniques herein can use the weighting, sorting, ranking, and/or filtering of contextual data to determine) [Morris: col. 13, line 15-20-22] the scored set of execution options associated with the control input (i.e. In some examples, the techniques herein can include providing an option to the user to view more suggestions. 
In that example, the GUI 400 can refresh the suggestions 406 with another N suggestions according to the weighting, sorting, ranking, and/or filtering. In some examples, as FIG. 4 illustrates, the contextual data can include a selection 412 (i.e., selection of a representation of the letter "D") of a portion of the keyboard 402. The example suggestions depicted in FIG. 4 include suggestions that appear as a result of using this selection 412 (which can be part of context data obtained by the techniques described herein) to weight, sort, rank, and/or filter generated suggestions. For example, suggestions such as suggestion 406(2) and 406(4) can be sentence completions that start a same letter (or letter combination in some examples). In some examples, the keyboard selection 412 can otherwise be used to weight, sort, rank, and/or filter generated suggestions) [Morris: col. 15, line 47-63] based on the contextual label (i.e.  In some examples the ECTF service(s) 126 (and/or the suggestion generator 124) can additionally or alternatively sort, rank, and/or filter the generated words and/or phrases based at least in part on contextual data) [Morris: col. 12, line 4], and provide (i.e. provide to the user for selection by the user) [Morris: col. 1, line 43-44] a highest-scored execution option ((i.e. greatest relevance) [Morris: col. 5, line 46] ; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]) to a processor of the user terminal ((i.e.   the ECTF 202 can be communicatively coupled to a human-machine interface such that the ECTF 202 can provide suggestions to a user for selection) [Morris: col. 12, line 51-53; Fig. 11]; (i.e.  The GUI 300 depicted in FIG. 3 also demonstrates an instance of weighting, sorting, ranking, and/or filtering that results) [Morris: col. 14, line 21-23]). 
Morris does not explicitly disclose the following claim limitations (Emphasis added).
A non-transitory computer-readable medium including program code that, when executed by a processor, causes an apparatus to: receive, from a user terminal, a control input associated with an intent, obtain location data associated with a location of the user terminal, determine a scored set of execution options associated with the control input, obtain a contextual label associated with the location data, the contextual label determined based on an application of one or more adapted pretrained deep learning models to the location data, the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer trained on a data set containing classifiers of a target domain of execution options, rescore the scored set of execution options associated with the control input based on the contextual label, and provide a highest-scored execution option to a processor of the user terminal.
However, in the same field of endeavor Malpani further discloses the claim limitations and the deficient claim limitations, as follows:
the one or more adapted pretrained deep learning models comprising one or more unchanged convolutional layers and a final fully-connected layer ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. for each image-geographic region pair, the features extracted using the model generated by the convolutional neural network 234 as trained in step 303 is implemented with three fully connected layers of the convolutional neural network 234) [Malpani: col. 12, line 63-67] – Note: Malpani discloses that there are three fully connected layers at the bottom of the stack of eight layers. One of them should be a final layer, and it is trained) trained on a data set containing classifiers of a target domain of execution options (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris with Malpani to program the system to implement the method of Malpani.
Therefore, the combination of Morris with Malpani will enable the system to provide a trained model specialized to understand and identify features in images most important to visual tastes of a geographic region [Malpani: col. 13, line 10-16]. 
Morris and Malpani do not explicitly disclose the following claim limitations (Emphasis added).
provide a highest-scored execution option to a processor of the user terminal.
However, in the same field of endeavor Hsiao further discloses the claim limitations and the deficient claim limitations, as follows:
provide a highest-scored execution option (i.e. The search suggestion with the highest relevancy score) [Hsiao: col. 9, line 10-11] 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris and Malpani with Hsiao to program the system to implement the visual search using the relevancy function of Hsiao.
Therefore, the combination of Morris and Malpani with Hsiao will enable the system to provide highest relevancy suggestions to users [Hsiao: col. 9, line 10-11].

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Morris et al. (US Patent 10,223,067 B2), (“Morris”), in view of Malpani et al. (US Patent 9,753,949 B1), (“Malpani”), in view of Hsiao (US Patent 10,540,378 B1), (“Hsiao”), and further in view of Willson et al. (US Patent 11,205,110 B1), (“Willson”).

Regarding claim 21, Morris, Malpani, and Hsiao meet the claim limitations as set forth in claim 18. Morris and Malpani further meet the claim limitations as follows.
The apparatus of Claim 18 (i.e. system) [Morris: col. 7, line 4], wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), values of one or more weight matrices ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13]) are quantized as integer values.
Morris, Malpani, and Hsiao do not explicitly disclose the following claim limitations (Emphasis added).
The apparatus of Claim 18, wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models, values of one or more weight matrices are quantized as integer values.
However, in the same field of endeavor Willson further discloses the deficient claim limitations, as follows:
values of one or more weight matrices are quantized as integer values (i.e. In some examples, the server quantizes the embeddings and/or neural network weights using a lossy compression scheme before sending those to the client. For instance, rather than sending a 4 byte floating point value per embedding or network weight, the server quantizes to a 1 byte integer value per embedding or weight) [Willson: col. 10, line 22-28].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris, Malpani, and Hsiao with Willson to program the system to implement the lossy compression scheme of Willson.
Therefore, the combination of Morris, Malpani, and Hsiao with Willson will enable the system to reduce the transmission rate on the network [Willson: col. 10, line 22-28]. 
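For illustration only, the quantization described in the cited Willson passage (a 4-byte floating point value replaced by a 1-byte integer value per weight) can be sketched in Python as follows. The sketch is not Willson's implementation. The symmetric scaling by the largest absolute weight, the function names, and the sample weights are assumptions chosen to make the size reduction and the lossy nature of the scheme concrete.

```python
import struct

def quantize_weights(weights):
    # Map each float weight to a signed 8-bit integer in [-127, 127]
    # using one shared scale factor (an assumed, simple scheme).
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_weights(quantized, scale):
    # Approximate reconstruction; rounding error is not recovered (lossy).
    return [q * scale for q in quantized]

weights = [0.8, -1.0, 0.1, 0.0]                 # hypothetical weight values
quantized, scale = quantize_weights(weights)

float_bytes = struct.pack("<4f", *weights)      # 4 bytes per weight
int_bytes = struct.pack("<4b", *quantized)      # 1 byte per weight
restored = dequantize_weights(quantized, scale)
```

Here the packed integer form occupies one quarter of the bytes of the 32-bit floating point form, and each restored weight differs from the original by at most half of the scale factor.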

Regarding claim 22, Morris, Malpani, and Hsiao meet the claim limitations as set forth in claim 19. Morris and Malpani further meet the claim limitations as follows.
The method of Claim 19 (i.e. a method) [Morris: col. 25, line 3], wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), values of one or more weight matrices ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13]) are quantized as integer values.
Morris, Malpani, and Hsiao do not explicitly disclose the following claim limitations (Emphasis added).
The method of Claim 19, wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models, values of one or more weight matrices are quantized as integer values.
However, in the same field of endeavor Willson further discloses the deficient claim limitations, as follows:
values of one or more weight matrices are quantized as integer values (i.e. In some examples, the server quantizes the embeddings and/or neural network weights using a lossy compression scheme before sending those to the client. For instance, rather than sending a 4 byte floating point value per embedding or network weight, the server quantizes to a 1 byte integer value per embedding or weight) [Willson: col. 10, line 22-28].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris, Malpani, and Hsiao with Willson to program the system to implement the lossy compression scheme of Willson.
Therefore, the combination of Morris, Malpani, and Hsiao with Willson will enable the system to reduce the transmission rate on the network [Willson: col. 10, line 22-28]. 

Regarding claim 23, Morris, Malpani, and Hsiao meet the claim limitations as set forth in claim 20. Morris and Malpani further meet the claim limitations as follows.
The non-transitory computer-readable medium of Claim 20 (i.e. computer-readable media having stored thereon computer-executable instructions that, when executed, configure the processors to perform operations) [Morris: col. 21, line 5-7], wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models ((i.e. using a pretrained deep learning model) [Baradel: col. 4, line 58-59; Figs. 2-3]; (i.e. The conditional adaptation network (CAN) of the present disclosure can address both the coupled and decoupled architecture for reducing a shift across source and target distributions. The CAN of the present disclosure can be embedded into deep learning models or complete the adaptation task from available source and target image features only) [Baradel: col. 3, line 23-29; Figs. 2-3]), values of one or more weight matrices ((i.e. the convolutional neural network 234 can consist of a stack of eight layers with weights, the first five layers being convolutional layers and the remaining three layers being fully-connected layers) [Malpani: col. 11, line 47-50]; (i.e. Training with the first set of training images 240 may be regularized by weight decay (e.g., reducing the size of all weight to prevent the convolutional neural network 234 from focusing too much on any single feature of an image) and dropout regularization (e.g., randomly zeroing features so that the convolutional neural network 234 does not rely too much on combinations of features that are a coincidence) for the first two fully-connected layers with a dropout ratio (e.g., a proportion of values set to zero in each training step)) [Malpani: col. 12, line 4-13]) are quantized as integer values.
Morris, Malpani, and Hsiao do not explicitly disclose the following claim limitations (Emphasis added).
The non-transitory computer-readable medium of Claim 20, wherein, in at least one adapted pretrained deep learning model of the one or more adapted pretrained deep learning models, values of one or more weight matrices are quantized as integer values.
However, in the same field of endeavor Willson further discloses the deficient claim limitations, as follows:
values of one or more weight matrices are quantized as integer values (i.e. In some examples, the server quantizes the embeddings and/or neural network weights using a lossy compression scheme before sending those to the client. For instance, rather than sending a 4 byte floating point value per embedding or network weight, the server quantizes to a 1 byte integer value per embedding or weight) [Willson: col. 10, line 22-28].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Morris, Malpani, and Hsiao with Willson to program the system to implement the lossy compression scheme.
Therefore, the combination of Morris, Malpani, and Hsiao with Willson will enable the system to reduce the transmission rate on the network [Willson: col. 10, line 22-28]. 
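For illustration only, the quantization described in the cited Willson passage (a 4-byte floating point value per weight reduced to a 1-byte integer value) can be sketched as follows. This is a minimal sketch assuming a symmetric, single-scale mapping; the function names quantize_weights and dequantize_weights, the example matrix, and the choice of scale are hypothetical and are not taken from Willson or from the application.

```python
def quantize_weights(weights, num_bits=8):
    """Map a float weight matrix to signed integers with one shared scale.

    Assumption: symmetric quantization, so values land in [-qmax, qmax].
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 when one byte is used per weight
    max_abs = max((abs(w) for row in weights for w in row), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [
        [max(-qmax, min(qmax, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


def dequantize_weights(quantized, scale):
    """Recover approximate float weights; the rounding error is not undone."""
    return [[q * scale for q in row] for row in quantized]


# Hypothetical 2x2 weight matrix used only to exercise the sketch.
weights = [[1.27, -0.64], [0.10, 0.0]]
quantized, scale = quantize_weights(weights)
restored = dequantize_weights(quantized, scale)
```

Each restored value differs from the original by at most half of the scale, which is why such a scheme is lossy: the integer representation is smaller to transmit, but the exact floating point weights cannot be recovered.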

Allowable Subject Matter
32. Claims 3-5, 8-10, and 13-16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all limitations of the base claim and any intervening claims, with proper definitions of all parameters in each mathematical expression of each function in these claims.  This objection is given with a condition that all other objections and rejections of related claims are addressed.
33. The above identified claims recite specific equations with unique parameters required for calculating functions in this invention. The prior art fails to teach or render obvious these mathematical expressions and features.

Reference Notice 
Additional prior art, included in the Notice of References Cited, made of record and not relied upon, is considered pertinent to applicant's disclosure.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Philip Dang whose telephone number is (408) 918-7529.  The examiner can normally be reached on Monday-Thursday between 8:30 am - 5:00 pm (PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath Perungavoor, can be reached at 571-272-7455.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Philip P. Dang/
Primary Examiner, Art Unit 2488