DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 1 is objected to because of the following informalities: In line 7, add an apostrophe so that “ANNs performance” reads “ANNs’ performance.” Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:

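(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 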
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
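For illustration only, the presumptions and the three-prong test described above can be sketched as simple decision logic. The class name, field names, and function names below are hypothetical; each boolean stands in for a determination made from the claim language and the specification, and the sketch assumes those determinations have already been made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    """Hypothetical summary of one claim limitation (all fields are assumptions)."""
    uses_means_or_step: bool        # recites the word "means" or "step"
    uses_generic_placeholder: bool  # recites a generic placeholder instead of "means"
    has_functional_language: bool   # e.g., linked by "for" or "configured to"
    has_sufficient_structure: bool  # structure, material, or acts for the function


def presumed_112f(lim: Limitation) -> bool:
    """Initial rebuttable presumption: arises from "means"/"step" with functional language."""
    return lim.uses_means_or_step and lim.has_functional_language


def interpreted_under_112f(lim: Limitation) -> bool:
    """Outcome once the presumption is tested against the three prongs."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.has_sufficient_structure
    return prong_a and prong_b and prong_c
```

Under this sketch, a limitation that omits the word “means” but couples a generic placeholder with functional language and no sufficient structure starts with a presumption against 35 U.S.C. 112(f) treatment, and that presumption is rebutted.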
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “profiling circuitry” in claim 25 and its dependent claims 26-28; and “querying circuitry” and “deployment circuitry” in claim 29 and its dependent claims 30-37.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 16 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 16 recites the limitation "the memory". There is insufficient antecedent basis for this limitation in the claim. Claim 16 depends only on claim 10, which does not recite any memory. For examination purposes, claim 16 is interpreted as depending on the method of claim 15.

Claims 25-37 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim limitations “profiling circuitry” in claim 25 and its dependent claims 26-28, and “querying circuitry” and “deployment circuitry” in claim 29 and its dependent claims 30-37, invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. Examiner interprets “profiling circuitry”, “querying circuitry”, and “deployment circuitry” as any software or hardware component that can perform profiling, querying, and deployment, respectively. Therefore, claims 25-37 are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claims 26-28 are rejected for failing to cure the deficiencies of claim 25. 
Claims 30-37 are rejected for failing to cure the deficiencies of claim 29.	


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-7, 10-14, 17-22, and 25-35 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (“NetAdapt”) in view of Redkar (US 2018/0276553). Fig. 2 from Yang is shown below.

[Image: Yang Fig. 2]
Regarding claim 1, Yang teaches: A method for deploying an artificial neural network (ANN), the method comprising: 
generating candidate ANNs for performing an inference task (On p. 5 § 3.2, Yang teaches: “The NetAdapt algorithm is detailed in pseudo code in Algorithm 1 (p. 5) and in Fig. 2 (p. 6). Each iteration solves Eq. 2 by reducing the number of filters in a single CONV or FC layer (the Choose # of Filters and Choose Which Filters blocks in Fig. 2). The number of filters to remove from a layer is guided by empirical measurements… The simplified network is then fine-tuned for a short length of time in order to restore some accuracy (the ShortTerm Fine-Tune block). 
“In each iteration, the previous three steps (highlighted in bold) are applied on each of the CONV or FC layers individually. As a result, NetAdapt generates K (i.e., the number of CONV and FC layers) network candidates in one iteration”.
Under the broadest reasonable interpretation, candidate ANNs are generated after the step of choosing the filter and before the step of short-term fine-tuning because the fine-tuning merely restores some accuracy.)
… based on specifications of a target inference device; (Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device)
generating trained ANNs by training the candidate ANNs to perform the inference task on an inference device conforming to the specifications; (The long-term fine-tune step indicated in Fig. 2 and described on p. 7 and in Algorithm 1 line 11 is interpreted as training. Yang teaches that NetAdapt can generate a family of simplified networks with different tradeoffs: 
Page 2: “NetAdapt can generate not only a network that meets the budget, but also a family of simplified networks with different tradeoffs, which allows dynamic network selection and further study.”
Page 5: “It [NetAdapt] outputs the final adapted network and can also generate a sequence of simplified networks (i.e., the highest accuracy network from each iteration Net1, ..., Neti) to provide the efficient frontier of accuracy and resource consumption tradeoffs.” 
Algorithm 1 steps 9 (pick highest accuracy) and 12 (return $\widehat{Net}$) are interpreted as generating a plurality of ANNs. Since NetAdapt is intended for mobile platforms, the trained ANNs conform to the specifications.)
determining characteristics describing the trained ANNs performance of the inference task on a device conforming to the specifications; (Yang p. 4 describes characteristics of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics. Since NetAdapt is intended for mobile platforms, the inference device will conform to the specifications.)
deploying the… ANN on an inference device conforming to the target inference device specifications. (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
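For illustration only, the iteration quoted from Yang § 3.2 above (one candidate per layer, then selection of the highest accuracy candidate until the budget is met) can be sketched as follows. This is a simplified stand-in and not Yang's implementation: a network is modeled as a list of per-layer filter counts, the `resource` and `accuracy` callables are hypothetical, a fixed `step` replaces the empirically guided choice of how many filters to remove, and the short-term and long-term fine-tuning steps are omitted.

```python
from typing import Callable, List, Tuple

Network = List[int]  # hypothetical stand-in: number of filters in each CONV/FC layer


def generate_candidates(net: Network, step: int) -> List[Network]:
    """One iteration: remove `step` filters from one layer at a time (one candidate per layer)."""
    candidates = []
    for k, filters in enumerate(net):
        if filters > step:  # keep at least one filter in every layer
            candidate = list(net)
            candidate[k] = filters - step
            candidates.append(candidate)
    return candidates


def adapt(
    net: Network,
    budget: float,
    resource: Callable[[Network], float],
    accuracy: Callable[[Network], float],
    step: int = 1,
) -> Tuple[Network, List[Network]]:
    """Repeat candidate generation, keeping the most accurate candidate, until the budget is met."""
    sequence: List[Network] = []  # highest accuracy network from each iteration
    while resource(net) > budget:
        candidates = generate_candidates(net, step)
        if not candidates:
            break  # nothing left to simplify
        net = max(candidates, key=accuracy)
        sequence.append(net)
    return net, sequence


# Usage with made-up proxies: resource is the total filter count, and the
# accuracy proxy weights the first layer more heavily.
final, seq = adapt([4, 3], budget=5, resource=sum, accuracy=lambda n: 2 * n[0] + n[1])
```

The returned `sequence` corresponds to the sequence of simplified networks (Net1, ..., Neti) described in the Page 5 quotation.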

While Yang teaches generating trained ANNs in Algorithm 1 line 12, Yang does not explicitly teach: storing profiles of the trained ANNs, the profiles reflecting the characteristics of each trained ANN; querying the stored profiles based on requirements of an application to select an ANN from among the trained ANNs; 
Redkar discloses a system that automatically determines a machine learning or statistical model to use based on a natural language query from a user (¶ 16). Redkar teaches: storing profiles of the trained ANNs, the profiles reflecting the characteristics of each trained ANN; (Redkar teaches model registry 320 in ¶ 34: “The model registry 320 may be configured to register machine learning and other statistical models and store data associated with the models.”
Regarding the claim limitation “the profiles reflecting the characteristics of each trained ANN”, ¶¶ 40-41 teach that the model selection module selects a model from the model registry to meet constraints and computing resources.)
querying the stored profiles based on requirements of an application to select an ANN from among the trained ANNs; (Querying the stored profiles to select an ANN, as claimed, is interpreted as scoring the stored models and selecting a stored model based on the score. Redkar ¶ 39 teaches: “The model selection module 315 is configured to calculate a score for each compatible model so that a model may be selected. The score for a model may be calculated based on any of the data above as well as performance characteristics, scalability of the model, response time, model accuracy, or cost.” Further, Redkar ¶ 52 teaches: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”
Redkar teaches the “based on requirements of an application” limitation in ¶ 38: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41.
An application is interpreted as the chat application 200 in Fig. 2.
A model may be a deep learning neural network (an ANN) in ¶ 35.)
	Redkar is in the same field of endeavor as the claimed invention, namely, selecting optimal models from a registry. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Redkar’s model selection module and model registry into Yang’s system, with a motivation to select the optimal model based on computing constraints (Redkar ¶ 41: “According to some embodiments, the optimization problem may be formulated as a mixed linear/integer programming problem where the computing resources and selection of the optimum model for a user statement can be expressed as linear constraints.”)
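For illustration only, the combination described above (stored profiles that are filtered by constraints and then scored to select a model) can be sketched as follows. The class names, field names, numeric values, and the use of accuracy alone as the score are all hypothetical; the sketch is not Redkar's model registry 320 or model selection module 315.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical stored profile of one trained ANN."""
    name: str
    latency_ms: float
    memory_mb: float
    accuracy: float


class ProfileRegistry:
    """Minimal in-memory registry: store profiles, then query them by requirements."""

    def __init__(self) -> None:
        self._profiles: Dict[str, Profile] = {}

    def store(self, profile: Profile) -> None:
        self._profiles[profile.name] = profile

    def query(self, max_latency_ms: float, max_memory_mb: float) -> Optional[Profile]:
        # Eliminate profiles that violate the application's constraints.
        compatible = [
            p for p in self._profiles.values()
            if p.latency_ms <= max_latency_ms and p.memory_mb <= max_memory_mb
        ]
        # Score the remaining profiles (assumed score: accuracy) and select the best.
        return max(compatible, key=lambda p: p.accuracy, default=None)


registry = ProfileRegistry()
registry.store(Profile("net_a", latency_ms=30.0, memory_mb=12.0, accuracy=0.71))
registry.store(Profile("net_b", latency_ms=18.0, memory_mb=8.0, accuracy=0.66))
registry.store(Profile("net_c", latency_ms=9.0, memory_mb=4.0, accuracy=0.58))
selected = registry.query(max_latency_ms=20.0, max_memory_mb=10.0)
```

With these made-up values, `net_a` is eliminated by the constraints and `net_b` is selected as the most accurate remaining profile; a query that no profile satisfies returns `None`.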

Regarding claim 2, Yang/Redkar from claim 1 teaches: The method of claim 1,
Further, Yang teaches: wherein the specifications of the target inference device comprise the architecture, components, bit width, device types, memory structures, memory types, or memory capacity of the target inference device. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed)

	Regarding claim 3, Yang/Redkar from claim 1 teaches: The method of claim 1, 	
Yang discloses in the Abstract and Experiments sections classifying images with a neural network. The remaining limitations of claim 3 are implied in Yang. However, the combination of Yang/Redkar from claim 1 does not explicitly teach: further comprising: communicating input data to the deployed ANN from the application; generating an output using the deployed ANN; and communicating the output data to the application.
But Redkar teaches: further comprising: communicating input data to the deployed ANN from the application; ([0029] The chat interface 200 provides a way for the user to input a user statement 205 that can be transmitted to the model management system.)
generating an output using the deployed ANN; and ([0030] Based on the extracted information, the model management system selects an appropriate model, invokes the selected model to obtain a result, … )
communicating the output data to the application. ([0030] (continued) …and provides the result to the user. The model management system may format the result such that it may be delivered to the user via the same communication channel that the user statement was received. For example, the result may be provided in a response 210 in the chat interface 200.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Redkar’s system into the combination of Yang/Redkar’s system from claim 1 by performing inference using the deployed model, with a motivation to return a result based on input data. ([0030] Based on the extracted information, the model management system selects an appropriate model, invokes the selected model to obtain a result, and provides the result to the user.)

	Regarding claim 4, Yang/Redkar from claim 1 teaches: The method of claim 1, 
Further, Yang teaches: wherein the characteristics of the trained ANNs comprise a memory capacity requirement, inference time, latency, accuracy, number of layers, type of layer, type of activation function, or ANN topology. (Yang p. 4 describes characteristics of the trained ANNs as resource constraints such as latency, memory footprint, or a combination of these metrics. The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed.)

Regarding claim 5, Yang/Redkar from claim 1 teaches: The method of claim 1, 
Further, Redkar teaches: further comprising receiving, in response to querying the stored profiles, an indication of one or more ANNs having a stored profile that satisfies the requirement. (Redkar in ¶ 52 teaches a score for a model being an indication, as claimed: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”.)

Regarding claim 6, Yang/Redkar from claim 1 teaches: The method of claim 1, 
Further, Redkar teaches: wherein the application (Redkar teaches chat interface 200 in ¶ 29.)
Yang teaches: provides data to the deployed ANN and receives an inference from the deployed ANN based on the data. (Yang’s Abstract states that images are provided to NetAdapt for classification. The experiments in § 4 are evidence of providing data (images) to the deployed ANN (NetAdapt neural network) and receiving an inference (classifications) from the deployed ANN based on the data.)

Regarding claim 7, Yang/Redkar from claim 1 teaches: The method of claim 1, 
Further, Redkar teaches: wherein the requirements of the application comprise a maximum latency, a maximum time to inference, a maximum memory capacity, a maximum power, or a device constraint. (A device constraint is interpreted as available computing resources as taught by Redkar in ¶ 41. The broadest reasonable interpretation of this claim requires only one of the alternatives listed.)

Regarding claim 10, Yang teaches: A method for generating an artificial neural network (ANN), the method comprising: 
inputting specifications of a target inference device to an ANN generation device; (On p. 5, Algorithm 1 is interpreted as an ANN generation device. Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device)
generating candidate ANNs, by the ANN generation device, (On p. 5 § 3.2, Yang teaches: “The NetAdapt algorithm is detailed in pseudo code in Algorithm 1 and in Fig. 2. Each iteration solves Eq. 2 by reducing the number of filters in a single CONV or FC layer (the Choose # of Filters and Choose Which Filters blocks in Fig. 2). The number of filters to remove from a layer is guided by empirical measurements… The simplified network is then fine-tuned for a short length of time in order to restore some accuracy (the ShortTerm Fine-Tune block). 
“In each iteration, the previous three steps (highlighted in bold) are applied on each of the CONV or FC layers individually. As a result, NetAdapt generates K (i.e., the number of CONV and FC layers) network candidates in one iteration”.
Under the broadest reasonable interpretation, candidate ANNs are generated after the step of choosing the filter and before the step of short-term fine-tuning because the fine-tuning merely restores some accuracy.)
… based on the specifications; (Yang teaches that a specification of the target inference device is a mobile platform device type. Yang’s Abstract states, “This work proposes an automated algorithm, called NetAdapt, that adapts a pre-trained deep neural network to a mobile platform given a resource budget.” In the experimental section, the end of p. 9 states that a Google Pixel 1 phone and a Samsung Galaxy S8 were used.)
generating trained ANNs, by the ANN generation device, by training the candidate ANNs to perform an inference task; (The long-term fine-tune step indicated in Fig. 2 and described on p. 7 and in Algorithm 1 line 11 is interpreted as training. Yang teaches that NetAdapt can generate a family of simplified networks with different tradeoffs: 
Page 2: “NetAdapt can generate not only a network that meets the budget, but also a family of simplified networks with different tradeoffs, which allows dynamic network selection and further study.”
Page 5: “It [NetAdapt] outputs the final adapted network and can also generate a sequence of simplified networks (i.e., the highest accuracy network from each iteration Net1, ..., Neti) to provide the efficient frontier of accuracy and resource consumption tradeoffs.” 
Algorithm 1 steps 9 (pick highest accuracy) and 12 (return $\widehat{Net}$) are interpreted as generating a plurality of ANNs.)
generating profiles of the trained ANNs, wherein the profiles indicate characteristics of the trained ANNs; (From the previous citation, each network in the sequence of networks is along an efficient frontier. Each network’s accuracy and resource consumption tradeoff constitutes a profile. Yang p. 4 describes characteristics of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics.)
for deployment on a target inference device. (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
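For illustration only, the interpretation above that each network's accuracy and resource consumption tradeoff constitutes a profile along an efficient frontier can be sketched as follows. The tuple layout, network names, and numeric values are hypothetical and are not taken from Yang; latency is used as the single resource metric for simplicity.

```python
from typing import List, Tuple

Profile = Tuple[str, float, float]  # hypothetical layout: (name, latency_ms, accuracy)


def efficient_frontier(profiles: List[Profile]) -> List[Profile]:
    """Keep only profiles that no other profile beats on both latency and accuracy."""
    frontier: List[Profile] = []
    best_accuracy = float("-inf")
    # Visit profiles from fastest to slowest; on a latency tie, the more accurate first.
    for profile in sorted(profiles, key=lambda p: (p[1], -p[2])):
        if profile[2] > best_accuracy:  # a slower network must be more accurate to stay
            frontier.append(profile)
            best_accuracy = profile[2]
    return frontier


profiles = [
    ("net1", 30.0, 0.71),
    ("net2", 18.0, 0.66),
    ("net3", 20.0, 0.60),
    ("net4", 9.0, 0.58),
]
frontier = efficient_frontier(profiles)
```

With these made-up values, `net3` is dropped because `net2` is both faster and more accurate, and the remaining three profiles form the frontier.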

While Yang teaches generating trained ANNs in Algorithm 1 line 12, Yang does not explicitly teach: storing the profiles to be queried based on the requirements and to return a profile of one of the trained ANNs having characteristics satisfying requirements of an application, 
Redkar discloses a system that automatically determines a machine learning or statistical model to use based on a natural language query from a user (¶ 16). 
Redkar teaches: storing the profiles (Redkar teaches model registry 320 that stores models in ¶ 34: “The model registry 320 may be configured to register machine learning and other statistical models and store data associated with the models.”)
to be queried based on the requirements and to return a profile of one of the trained ANNs having characteristics satisfying requirements of an application, (to be queried/return a profile, as claimed, is interpreted as scoring the stored models and selecting a stored model based on the score. Redkar ¶ 39 teaches: “The model selection module 315 is configured to calculate a score for each compatible model so that a model may be selected. The score for a model may be calculated based on any of the data above as well as performance characteristics, scalability of the model, response time, model accuracy, or cost.” Further, Redkar ¶ 52 teaches: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”
Redkar teaches the “based on the requirements” limitation in ¶ 38: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41.
An application is interpreted as the chat application 200 in Fig. 2.
A model may be a deep learning neural network (an ANN) in ¶ 35.)
	Redkar is in the same field of endeavor as the claimed invention, namely, selecting optimal models from a registry. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Redkar’s model selection module and model registry into Yang’s system, with a motivation to select the optimal model based on computing constraints (Redkar ¶ 41: “According to some embodiments, the optimization problem may be formulated as a mixed linear/integer programming problem where the computing resources and selection of the optimum model for a user statement can be expressed as linear constraints.”)

	Regarding claim 11, Yang/Redkar from claim 10 teaches: The method of claim 10, 
Further, Yang teaches: wherein the specifications of the target inference device comprise the architecture, components, bit width, device types, memory structures, memory types, or memory capacity of the target inference device. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed)

	Regarding claim 12, Yang/Redkar from claim 10 teaches: The method of claim 10, 
Further, Yang teaches: wherein the inference task comprises image recognition. (Yang’s abstract teaches, “For image classification on the ImageNet dataset, NetAdapt achieves up to a 1.66× speedup in measured inference latency with higher accuracy.”)

	Regarding claim 13, Yang/Redkar from claim 10 teaches: The method of claim 10, 
Further, Yang teaches: wherein the characteristics of the trained ANNs comprise a memory capacity requirement, inference time, latency, accuracy, number of layers, type of layer, type of activation function, or ANN topology. (Yang p. 4 describes characteristics of the trained ANNs as resource constraints such as latency, memory footprint, or a combination of these metrics. The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed.)

Regarding claim 14, Yang/Redkar from claim 10 teaches: The method of claim 10,
Further, Redkar teaches: wherein the requirements of the application comprise a maximum latency, a maximum time to inference, a maximum memory capacity, a maximum power, or a device constraint. (A device constraint is interpreted as available computing resources as taught by Redkar in ¶ 41. The broadest reasonable interpretation of this claim requires only one of the alternatives listed.)


Regarding claim 17, Yang teaches: A method for deploying an artificial neural network (ANN), the method comprising: 
… the profiles reflecting characteristics of a plurality of ANNs trained to perform an inference task on an inference device (Yang p. 4 describes characteristics of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics.)
conforming to specifications of a target inference device; and (Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device)
deploying the… ANN on an inference device conforming to the target inference device specifications. (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)

However, Yang does not explicitly teach: querying stored profiles, based on requirements of an application, to select an ANN, 
Redkar discloses a system that automatically determines a machine learning or statistical model to use based on a natural language query from a user (¶ 16). Redkar teaches: querying stored profiles, based on requirements of an application, to select an ANN, (Querying stored profiles to select an ANN, as claimed, is interpreted as the model selection module scoring models stored in the model registry and selecting a stored model based on the score.
Redkar ¶ 39 teaches: “The model selection module 315 is configured to calculate a score for each compatible model so that a model may be selected. The score for a model may be calculated based on any of the data above as well as performance characteristics, scalability of the model, response time, model accuracy, or cost.” Further, Redkar ¶ 52 teaches: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”
Redkar teaches the based on requirements of an application limitation in ¶ 38: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41.
 An application is interpreted as the chat application 200 in Fig. 2.
A model may be a deep learning neural network (an ANN) in ¶ 35.)
Redkar is in the same field of endeavor as the claimed invention, namely, selecting optimal models from a registry. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Redkar’s model selection module and model registry into Yang’s system, with a motivation to select the optimal model based on computing constraints (Redkar ¶ 41: “According to some embodiments, the optimization problem may be formulated as a mixed linear/integer programming problem where the computing resources and selection of the optimum model for a user statement can be expressed as linear constraints.”)
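
For illustration only, the selection flow relied on in the mapping above (eliminate models that violate a constraint, score each remaining compatible model, select a model based on the score) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Redkar's implementation: the `Profile` fields, the weighting inside `score`, and the example registry entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical stored profile of one trained model (fields are assumptions)."""
    name: str
    latency_ms: float
    memory_mb: float
    accuracy: float


def score(p: Profile) -> float:
    # Assumed scoring rule: reward accuracy, penalize response time.
    return p.accuracy - 0.001 * p.latency_ms


def select_model(registry: list[Profile], max_latency_ms: float,
                 max_memory_mb: float) -> Optional[Profile]:
    # Step 1: eliminate models that violate the application's constraints.
    compatible = [p for p in registry
                  if p.latency_ms <= max_latency_ms and p.memory_mb <= max_memory_mb]
    # Step 2: score each compatible model and select the highest-scoring one.
    return max(compatible, key=score, default=None)


# Hypothetical registry contents, for demonstration only.
registry = [
    Profile("large", latency_ms=120.0, memory_mb=400.0, accuracy=0.92),
    Profile("medium", latency_ms=60.0, memory_mb=150.0, accuracy=0.88),
    Profile("small", latency_ms=20.0, memory_mb=40.0, accuracy=0.80),
]
chosen = select_model(registry, max_latency_ms=80.0, max_memory_mb=200.0)
print(chosen.name if chosen else "no compatible model")
```

In this sketch the constraint filter runs before scoring, so a model that exceeds a stated limit is never selected regardless of its score.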


Regarding claim 18, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Further, Yang teaches: wherein the specifications of the target inference device comprise the architecture, components, bit width, device types, memory structures, memory types, or memory capacity of the target inference device. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed)

Regarding claim 19, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Yang discloses, in the Abstract and Experiments sections, classifying images with a neural network. The remaining limitations of claim 19 are implied in Yang. However, the combination of Yang/Redkar from claim 17 does not explicitly teach: further comprising: communicating input data to the deployed ANN from the application; generating an output using the deployed ANN; and communicating the output data to the application.
But Redkar teaches: further comprising: communicating input data to the deployed ANN from the application; ([0029] The chat interface 200 provides a way for the user to input a user statement 205 that can be transmitted to the model management system.)
generating an output using the deployed ANN; and ([0030] Based on the extracted information, the model management system selects an appropriate model, invokes the selected model to obtain a result, … )
communicating the output data to the application. ([0030] (continued) …and provides the result to the user. The model management system may format the result such that it may be delivered to the user via the same communication channel that the user statement was received. For example, the result may be provided in a response 210 in the chat interface 200.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Redkar’s system into the combination of Yang/Redkar’s system from claim 17 by performing inference using the deployed model, with a motivation to return a result based on input data. ([0030] Based on the extracted information, the model management system selects an appropriate model, invokes the selected model to obtain a result, and provides the result to the user.)

Regarding claim 20, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Further, Redkar teaches: further comprising receiving, in response to querying the stored profiles, an indication of one or more ANNs having a stored profile that satisfies the requirement. (Redkar in ¶ 52 teaches a score for a model being an indication, as claimed: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”.)

Regarding claim 21, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Further, Redkar teaches: wherein the application (Redkar teaches a chat interface.)
Yang teaches: provides data to the deployed ANN and receives an inference from the deployed ANN based on the data. (Yang’s Abstract states that images are provided to NetAdapt for classification. The experiments in § 4 are evidence of providing data (images) to the deployed ANN (NetAdapt neural network) and receiving an inference (classifications) from the deployed ANN based on the data)

Regarding claim 22, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Further, Redkar teaches: wherein the requirements of the application comprise a maximum latency, a maximum time to inference, a maximum memory capacity, a maximum power, or a device constraint. (A device constraint is interpreted as available computing resources as taught by Redkar in ¶ 41. The broadest reasonable interpretation of this claim requires only one of the alternatives listed.)

Regarding claim 25, Yang teaches: A device for generating an artificial neural network (ANN), the device comprising: 
… specifications of a target inference device; (Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device.)
processing circuitry configured to generate candidate ANNs (On p. 5 § 3.2, Yang teaches: “The NetAdapt algorithm is detailed in pseudo code in Algorithm 1 (p. 5) and in Fig. 2 (p. 6). Each iteration solves Eq. 2 by reducing the number of filters in a single CONV or FC layer (the Choose # of Filters and Choose Which Filters blocks in Fig. 2). The number of filters to remove from a layer is guided by empirical measurements… The simplified network is then fine-tuned for a short length of time in order to restore some accuracy (the ShortTerm Fine-Tune block). 
“In each iteration, the previous three steps (highlighted in bold) are applied on each of the CONV or FC layers individually. As a result, NetAdapt generates K (i.e., the number of CONV and FC layers) network candidates in one iteration”. Candidate ANNs are generated after the step of choosing the filter and before the step of short-term fine-tuning because the fine-tuning merely restores some accuracy.)
…based on the specifications; (Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device.)
processing circuitry configured to generate trained ANNs by training the candidate ANNs to perform an inference task; (The long-term fine-tune step indicated in Fig. 2 and described on p. 7 and in Algorithm 1 line 11 is interpreted as training. Yang teaches that NetAdapt can generate a family of simplified networks with different tradeoffs: 
Page 2: “NetAdapt can generate not only a network that meets the budget, but also a family of simplified networks with different tradeoffs, which allows dynamic network selection and further study.”
Page 5: “It [NetAdapt] outputs the final adapted network and can also generate a sequence of simplified networks (i.e., the highest accuracy network from each iteration Net1, ..., Neti) to provide the efficient frontier of accuracy and resource consumption tradeoffs.” 
Algorithm 1 steps 9 (pick highest accuracy) and 12 (return $\widehat{Net}$) are interpreted as the generating a plurality of ANNs.)
profiling circuitry configured to generate profiles of the trained ANNs which reflect the characteristics of each trained ANN, and… (The broadest reasonable interpretation of profiling circuitry in light of the specification is any software or hardware component that can perform profiling. The previous citation notes that NetAdapt profiles networks in terms of accuracy and resource consumption by providing an efficient frontier of these networks. Yang p. 4 describes characteristics of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics.)
for deployment on a target inference device. (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
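
For illustration only, the iteration described in the passages of Yang quoted above (one candidate per layer, each produced by removing filters from that single layer, followed by picking the highest-accuracy candidate) can be sketched as follows. This is a toy sketch under stated assumptions, not Yang's Algorithm 1: the tuple encoding of a network, the `resource` and `accuracy` stand-in functions, and the fixed `step` are hypothetical, and both fine-tuning stages are only marked by comments.

```python
from typing import Callable

Network = tuple[int, ...]  # hypothetical encoding: number of filters per layer


def adapt(net: Network, budget: float, step: float,
          resource: Callable[[Network], float],
          accuracy: Callable[[Network], float]) -> tuple[Network, list[Network]]:
    """Shrink `net` until resource(net) <= budget; also return each iteration's winner."""
    if step <= 0:
        raise ValueError("step must be positive")
    frontier: list[Network] = []
    while resource(net) > budget:
        target = resource(net) - step      # tightened constraint for this iteration
        candidates: list[Network] = []
        for k in range(len(net)):          # one candidate per layer
            cand = list(net)
            # Remove filters from layer k only, until the target is met.
            while cand[k] > 1 and resource(tuple(cand)) > target:
                cand[k] -= 1
            if resource(tuple(cand)) <= target:
                candidates.append(tuple(cand))  # short-term fine-tuning would go here
        if not candidates:
            break                          # no single-layer change reaches the target
        net = max(candidates, key=accuracy)  # keep the most accurate candidate
        frontier.append(net)
    return net, frontier                   # long-term fine-tuning would go here


def toy_resource(n: Network) -> float:
    return float(sum(n))                   # assumed cost: total filter count


def toy_accuracy(n: Network) -> float:
    return float(min(n))                   # assumed proxy: the narrowest layer limits accuracy


final, frontier = adapt((8, 4), budget=8, step=2,
                        resource=toy_resource, accuracy=toy_accuracy)
print(final, frontier)
```

The returned `frontier` list corresponds, in this toy setting, to the sequence of per-iteration networks that the quoted passage describes as providing accuracy and resource tradeoffs.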

While Yang teaches generating trained ANNs in Algorithm 1 line 12, Yang does not explicitly teach: an input interface configured to input… ; [profiling circuitry] to store the profiles to be queried based on the requirements and to return a profile of one of the trained ANNs having characteristics satisfying requirements of an application, 
But Redkar teaches: an input interface configured to input… ; [profiling circuitry] to store the profiles to be queried based on the requirements and to return a profile of one of the trained ANNs having characteristics satisfying requirements of an application,
Redkar discloses a system that automatically determines a machine learning or statistical model to use based on a natural language query from a user (¶ 16). 
Redkar teaches: an input interface configured to input (User interface components are shown in Fig. 6, 685. ¶ 59 states they can include a keyboard, etc.)
[profiling circuitry] to store the profiles (Redkar teaches a model registry 320 that stores models in ¶ 34: “The model registry 320 may be configured to register machine learning and other statistical models and store data associated with the models.”)
profiles to be queried based on the requirements and to return a profile of one of the trained ANNs having characteristics satisfying requirements of an application, (to be queried/return a profile, as claimed, is interpreted as scoring the stored models and selecting a stored model based on the score. Redkar ¶ 39 teaches: “The model selection module 315 is configured to calculate a score for each compatible model so that a model may be selected. The score for a model may be calculated based on any of the data above as well as performance characteristics, scalability of the model, response time, model accuracy, or cost.” Further, Redkar ¶ 52 teaches: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”
Redkar teaches the based on the requirements limitation in ¶ 38: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41.
An application is interpreted as the chat application 200 in Fig. 2.
A model may be a deep learning neural network (an ANN) in ¶ 35.)
	Redkar is in the same field of endeavor as the claimed invention, namely, selecting optimal models from a registry. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Redkar’s model selection module and model registry into Yang’s system, and to have input Yang’s device specifications using Redkar’s interface components, with a motivation to select the optimal model based on computing constraints (Redkar ¶ 41: “According to some embodiments, the optimization problem may be formulated as a mixed linear/integer programming problem where the computing resources and selection of the optimum model for a user statement can be expressed as linear constraints.”)


Regarding claim 26, Yang/Redkar from claim 25 teaches: The device of claim 25, 
Further, Yang teaches: wherein the specifications of the target inference device comprise the architecture, components, bit width, device types, memory structures, memory types, or memory capacity of the target inference device. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed.)


Regarding claim 27, Yang/Redkar from claim 25 teaches: The device of claim 25, 
Further, Yang teaches: wherein the characteristics of the trained ANNs comprise a memory capacity requirement, inference time, latency, accuracy, number of layers, type of layer, type of activation function, or ANN topology. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed.)

Regarding claim 28, Yang/Redkar from claim 25 teaches: The device of claim 25, 
Further, Redkar teaches: wherein the requirements of the application comprise a maximum latency, a maximum time to inference, a maximum memory capacity, a maximum power, or a device constraint. (A device constraint is interpreted as available computing resources as taught by Redkar in ¶ 41. The broadest reasonable interpretation of this claim requires only one of the alternatives listed.)

Regarding claim 29, Yang teaches: A device for deploying an artificial neural network (ANN) to perform an inference task, the device comprising: 
… inference task requirements… ; (Yang p. 4 describes inference task requirements of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics.)
… the profiles reflecting characteristics of ANNs trained to perform an inference task on a target inference device; (Yang p. 4 describes characteristics of the trained ANNs’ performance as resource constraints such as latency, memory footprint, or a combination of these metrics.)
deployment circuitry configured to deploy the… ANN on an inference device (The broadest reasonable interpretation of deployment circuitry in light of the specification is any software or hardware component that can perform deployment. Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
conforming to specifications of the target inference device. (Inputs to Algorithm 1 include resource budgets (specifications) of a target inference device)

However, Yang does not explicitly teach: an input interface configured to input [requirements] of an application; querying circuitry configured to query stored profiles based on the requirements, the querying circuitry further configured to select an ANN based on the query; and 
But Redkar teaches: an input interface configured to input [requirements] of an application (User interface components are shown in Fig. 6, 685. ¶ 59 states they can include a keyboard, etc. An application is chat application in Fig. 2, 200.)
querying circuitry configured to query stored profiles based on the requirements, the querying circuitry further configured to select an ANN based on the query; and (The broadest reasonable interpretation of querying circuitry in light of the specification is any software or hardware component that can perform querying, which consists of scoring and selecting based on the score. 
Redkar ¶ 39 teaches: “The model selection module 315 is configured to calculate a score for each compatible model so that a model may be selected. The score for a model may be calculated based on any of the data above as well as performance characteristics, scalability of the model, response time, model accuracy, or cost.” Further, Redkar ¶ 52 teaches: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”
Redkar teaches the based on requirements of an application limitation in ¶ 38: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41.
 An application is interpreted as the chat application 200 in Fig. 2.
A model may be a deep learning neural network (an ANN) in ¶ 35.)
Redkar is in the same field of endeavor as the claimed invention, namely, selecting optimal models from a registry. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Redkar’s model selection module and model registry into Yang’s system, with a motivation to select the optimal model based on computing constraints (Redkar ¶ 41: “According to some embodiments, the optimization problem may be formulated as a mixed linear/integer programming problem where the computing resources and selection of the optimum model for a user statement can be expressed as linear constraints.”)

Regarding claim 30, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Yang teaches: wherein the specifications of the target inference device comprise the architecture, components, bit width, device types, memory structures, memory types, or memory capacity of the target inference device. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed)

	Regarding claim 31, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Redkar teaches: wherein querying the stored profiles comprises matching the requirements to characteristics of the trained ANNs (Redkar ¶ 38 teaches: “Additional models may further be eliminated from contention based on constraints or user preferences.” Computing resources are listed as a possible constraint in ¶ 41, and eliminating models from contention based on computing resources is matching the requirements to characteristics of the trained ANNs.)

	
	Regarding claim 32, Yang/Redkar from claim 31 teaches: The device of claim 31, 
Further, Yang teaches: wherein the characteristics comprise a memory capacity requirement, inference time, latency, accuracy, number of layers, type of layer, type of activation function, or ANN topology. (Yang p. 5 § 3.1 teaches one resource budget of a platform is a memory footprint (memory capacity). The broadest reasonable interpretation requires the prior art to teach only one of the alternatives listed.)

	Regarding claim 33, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Redkar teaches: wherein the querying circuitry is configured to receive, in response to the query, an indication of one or more ANNs having a stored profile that satisfies the requirements. (Redkar in ¶ 52 teaches a score for a model being an indication, as claimed: “The model management system may calculate a score for each compatible model… Based on the scores, the model management system may select a model for invocation”.)
	
	Regarding claim 34, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Redkar teaches: wherein the application (Redkar teaches a chat interface.)
Yang teaches: provides data to the deployed ANN and receives an inference from the deployed ANN based on the data. (Yang’s Abstract states that images are provided to NetAdapt for classification. The experiments in § 4 are evidence of providing data (images) to the deployed ANN (NetAdapt neural network) and receiving an inference (classifications) from the deployed ANN based on the data)
	
	Regarding claim 35, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Redkar teaches: wherein the requirements of the application comprise a maximum latency, a maximum time to inference, a maximum memory capacity, a maximum power, or a device constraint. (A device constraint is interpreted as available computing resources as taught by Redkar in ¶ 41. The broadest reasonable interpretation of this claim requires only one of the alternatives listed.)
	
Claims 8-9, 15-16, 23-24, and 36-37 are rejected over the combination of Yang/Redkar in view of Weston et al. (U.S. 20170103324). 

	Regarding claim 8, Yang/Redkar from claim 1 teaches: The method of claim 1, 
Further, Yang teaches: wherein deploying the selected ANN on the inference device / of the inference device (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
	 However, the combination of Yang/Redkar from claim 1 does not explicitly teach: comprises loading… into at least one memory of the inference device.
	But Weston teaches: comprises loading… into at least one memory. (Weston [0078]: “At block 930, the memory network updates the memory data structure by incorporating the input feature vector into the memory data structure… the memory network can simply store the input feature vector in the next empty memory slot of the memory data structure, without modifying memory slots that store existing memory information. Alternatively, a more sophisticated model can be used to modify the existing memory information in the memory slots based on the input feature vector.”)
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Weston’s system into Yang/Redkar’s system by loading ANN data into a memory slot, with a motivation to store the data for later retrieval by a computer executable program ([0099]: “The memory 1204 can comprise storage locations that are addressable by the processor(s) … for storing processor executable code and data structures. The processor… may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures.”)
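
For illustration only, the storage behavior quoted above from Weston [0078] (writing into the next empty memory slot without modifying slots that already hold information) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Weston's implementation: the `SlotMemory` class, its method names, and the byte-string payloads are hypothetical.

```python
from typing import Optional


class SlotMemory:
    """Hypothetical fixed-size slot memory; names are illustrative only."""

    def __init__(self, num_slots: int) -> None:
        self.slots: list[Optional[bytes]] = [None] * num_slots

    def store(self, item: bytes) -> int:
        # Write into the next empty slot; occupied slots are left untouched.
        for index, current in enumerate(self.slots):
            if current is None:
                self.slots[index] = item
                return index
        raise MemoryError("no empty slot available")


memory = SlotMemory(num_slots=3)
first = memory.store(b"layer-0 weights")   # placeholder payloads, not real model data
second = memory.store(b"layer-1 weights")
print(first, second)
```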
	
Regarding claim 9, the combination of Yang, Redkar, and Weston from claim 8 teaches: The method of claim 8, 
Further, Redkar teaches: selected ANN (Redkar ¶ 39)
However, the combination of Yang, Redkar, and Weston from claim 8 does not explicitly teach: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] 
But Weston teaches: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] (Weston [0029]: “the slot-choosing function H can be trained to store memories by entity or topic.” The memory contents, i.e. input feature vector, are profiled by entity or topic.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have stored ANNs in the memory slots from the combination of Yang, Redkar, and Weston’s system using Weston’s slot-choosing function with a motivation to organize contents of the memory. ([0029]: The slot-choosing function H can further organize the memory slots of the memory component.)
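
For illustration only, choosing a storage location from a property of the item being stored, as in the proposed combination above, can be sketched as follows. This is a minimal sketch under stated assumptions: Weston describes a slot-choosing function H that can be trained, whereas `choose_slot_group` below is a fixed, hand-written rule, and the profile field `memory_mb`, the 64 MB threshold, and the group names are hypothetical.

```python
from collections import defaultdict


def choose_slot_group(profile: dict) -> str:
    # Stand-in for a slot-choosing function: a fixed rule keyed on one
    # hypothetical profile field, rather than a trained function.
    return "fast" if profile["memory_mb"] <= 64 else "bulk"


# Group hypothetical model profiles by the memory group the rule assigns.
memory_groups: dict[str, list[str]] = defaultdict(list)
for profile in ({"name": "small", "memory_mb": 40},
                {"name": "large", "memory_mb": 400}):
    memory_groups[choose_slot_group(profile)].append(profile["name"])

print(dict(memory_groups))
```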

	Regarding claim 15, Yang/Redkar from claim 10 teaches: The method of claim 10, 
Further, Yang teaches: wherein deploying the selected ANN on the inference device / of the inference device (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
	 However, the combination of Yang/Redkar from claim 10 does not explicitly teach: comprises loading… into at least one memory of the inference device.
	But Weston teaches: comprises loading… into at least one memory. (Weston [0078]: “At block 930, the memory network updates the memory data structure by incorporating the input feature vector into the memory data structure… the memory network can simply store the input feature vector in the next empty memory slot of the memory data structure, without modifying memory slots that store existing memory information. Alternatively, a more sophisticated model can be used to modify the existing memory information in the memory slots based on the input feature vector.”)
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Weston’s system into Yang/Redkar’s system by loading ANN data into a memory slot, with a motivation to store the data for later retrieval by a computer executable program ([0099]: “The memory 1204 can comprise storage locations that are addressable by the processor(s) … for storing processor executable code and data structures. The processor… may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures.”)
	
Regarding claim 16, the combination of Yang, Redkar, and Weston from claim 15* teaches: The method of claim 15, (*Examiner is interpreting claim 16 as depending on the method of claim 15.)
Further, Redkar teaches: selected ANN (Redkar ¶ 39)
However, the combination of Yang, Redkar, and Weston from claim 15 does not explicitly teach: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] 
But Weston teaches: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] (Weston [0029]: “the slot-choosing function H can be trained to store memories by entity or topic.” The memory contents, i.e. input feature vector, are profiled by entity or topic.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have stored ANNs in the memory slots from the combination of Yang, Redkar, and Weston’s system using Weston’s slot-choosing function with a motivation to organize contents of the memory. ([0029]: The slot-choosing function H can further organize the memory slots of the memory component.)

Regarding claim 23, Yang/Redkar from claim 17 teaches: The method of claim 17, 
Further, Yang teaches: wherein deploying the selected ANN on the inference device / of the inference device (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
	 However, the combination of Yang/Redkar from claim 17 does not explicitly teach: comprises loading… into at least one memory of the inference device.
	But Weston teaches: comprises loading… into at least one memory. (Weston [0078]: “At block 930, the memory network updates the memory data structure by incorporating the input feature vector into the memory data structure… the memory network can simply store the input feature vector in the next empty memory slot of the memory data structure, without modifying memory slots that store existing memory information. Alternatively, a more sophisticated model can be used to modify the existing memory information in the memory slots based on the input feature vector.”)
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Weston’s system into Yang/Redkar’s system by loading ANN data into a memory slot, with a motivation to store the data for later retrieval by a computer executable program ([0099]: “The memory 1204 can comprise storage locations that are addressable by the processor(s) … for storing processor executable code and data structures. The processor… may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures.”)
	
Regarding claim 24, the combination of Yang, Redkar, and Weston from claim 23 teaches: The method of claim 23, 
Further, Redkar teaches: selected ANN (Redkar ¶ 39)
However, the combination of Yang, Redkar, and Weston from claim 23 does not explicitly teach: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] 
But Weston teaches: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] (Weston [0029]: “the slot-choosing function H can be trained to store memories by entity or topic.” The memory contents, i.e. input feature vector, are profiled by entity or topic.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have stored ANNs in the memory slots from the combination of Yang, Redkar, and Weston’s system using Weston’s slot-choosing function with a motivation to organize contents of the memory. ([0029]: The slot-choosing function H can further organize the memory slots of the memory component.)

Regarding claim 36, Yang/Redkar from claim 29 teaches: The device of claim 29, 
Further, Yang teaches: wherein deploying the selected ANN on the inference device / of the inference device (Yang teaches that the ANN is deployed on the mobile platforms in the “Mobile Inference and Latency Measurements” section of p. 9.)
	 However, the combination of Yang/Redkar from claim 29 does not explicitly teach: comprises loading… into at least one memory of the inference device.
	But Weston teaches: comprises loading… into at least one memory. (Weston [0078]: “At block 930, the memory network updates the memory data structure by incorporating the input feature vector into the memory data structure… the memory network can simply store the input feature vector in the next empty memory slot of the memory data structure, without modifying memory slots that store existing memory information. Alternatively, a more sophisticated model can be used to modify the existing memory information in the memory slots based on the input feature vector.”)
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Weston’s system into Yang/Redkar’s system by loading ANN data into a memory slot, with a motivation to store the data for later retrieval by a computer executable program ([0099]: “The memory 1204 can comprise storage locations that are addressable by the processor(s) … for storing processor executable code and data structures. The processor… may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures.”)
	
Regarding claim 37, the combination of Yang, Redkar, and Weston from claim 36 teaches: The device of claim 36, 
Further, Redkar teaches: selected ANN (Redkar ¶ 39)
However, the combination of Yang, Redkar, and Weston from claim 36 does not explicitly teach: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] 
But Weston teaches: wherein the memory into which the [selected ANN] is loaded is determined based on the profile [of the selected ANN] (Weston [0029]: “the slot-choosing function H can be trained to store memories by entity or topic.” The memory contents, i.e. input feature vector, are profiled by entity or topic.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have stored ANNs in the memory slots from the combination of Yang, Redkar, and Weston’s system using Weston’s slot-choosing function with a motivation to organize contents of the memory. ([0029]: The slot-choosing function H can further organize the memory slots of the memory component.)

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Dong et al. (“DPP-Net”) teaches Device-aware Progressive Search for Pareto-optimal Neural Architectures, optimizing for both device-related (e.g., inference time and memory usage) and device-agnostic (e.g., accuracy and model size) objectives.
Lin et al. (US 20180232639 A1) teaches neural network applications in resource constrained environments.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher Jablon whose telephone number is (571)270-7648.  The examiner can normally be reached on Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ASHER JABLON/Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122