Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/25/2021 has been entered.

Status of Claims
This action is in reply to the amendments and remarks filed on 10/25/2021.
Claims 1-18 are pending.
Claims 1, 4, 10, and 14 have been amended.  

Response to Arguments
Applicant’s arguments, with respect to the rejection(s) of claim(s) 2, 10, and 18 under 35 U.S.C. 112(b), have been fully considered and are persuasive. Therefore, the rejections set forth in the previous office action have been withdrawn.

Applicant’s arguments, with respect to the rejection(s) of claim(s) 1, 4, 10, and 14 under 35 U.S.C. 103, have been considered but they are not persuasive. More specifically, applicant argues that none of the cited prior art teaches a number of the amended claim limitations of claims 1, 4, 10, and 14. The examiner respectfully disagrees. Upon review, the previous combination of references, Shaked, Wu, and Yu, has been found to teach the amended limitations. For the full mapping of the claim limitations necessitated by applicant’s amendments, see the 35 U.S.C. 103 section below.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 4, 10, and 14 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention. Applicant has amended the claims to state “wherein each embedding model is a copy of a particular embedding model”. However, the specification does not explicitly support the claimed “wherein each embedding model is a copy of a particular embedding model”, seeing as paragraph 0041 and Fig. 3 teach “Multiple copies of the same embedding model can be deployed in multiple containers, to scalably support a large amount of concurrent query access. For example, an embedding model deployed in embedding container 32 can be a copy of an embedding model deployed in embedding container 31.”
In the above citation, the specification illustrates a broader teaching than that which is claimed, since the “Multiple copies” language does not explicitly include all/each of the embedding models used. As further illustration of the specification’s breadth, Fig. 3 also includes an “Embedding container” connected to “Deep network container” 35, which the “example” given in paragraph 0041 does not describe as being “a copy of an embedding model deployed in embedding container 31.” Therefore, the example given does not teach that all/each of the embedding environments/models are copies.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Shaked et al (US Pub 20170300814), hereinafter Shaked, in view of Yu et al (US Patent 10395118), hereinafter Yu, in view of Wu et al (US Pub 20190073590), hereinafter Wu.
Regarding claim 1, Shaked teaches a model-based prediction method performed by a machine learning system, the system including a plurality of machine learning models and a plurality of embedding models for converting an input of a machine learning model (paragraphs 0005, 0008, 0034-0037, and Fig. 1 teach “[t]his specification describes systems and methods for implementing a wide and deep machine learning model (model-based prediction method performed by a machine learning system), i.e., a combined machine learning model that includes both a wide machine learning model and a deep machine learning model (executing on a plurality of running environments including a model running environment)”; wherein the deep machine learning model includes a “deep neural network” and the wide machine learning model includes a “generalized linear model” (system including a plurality of machine learning models). Further, “[t]he deep model may include an embedding layer” with embedding functions (plurality of embedding models) for transforming input features (converting an input) for the “deep neural network 130” (of the machine learning model), and the DNN (machine learning model) producing an “intermediate predicted output” (model-based prediction method)), the embedding models being deployed separately from the machine learning models (paragraphs 0034-0037, and Fig. 1 teach embedding functions (models) in an (being deployed in) “embedding layer 150” (an embedding running environment) and a separate (deployed separately) “deep neural network 130” (from the machine learning models) and “generalized linear model” (machine learning models)), and each machine learning model being deployed in a respective model running environment of a plurality of model running environments, wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0034-0038, and Fig. 1 teach a “deep neural network 130” (machine learning model) executed in a “deep learning model 104” (deployed in a respective model running environment) and a “generalized linear model” (machine learning model) executed in a “wide learning model” (deployed in a respective model running environment), and further the DNN and GLM correspond to an “embedding layer” or “cross-product feature transformation” (each model running environment is configured to interact with an embedding running environment) and the model results are combined (alternatively, each model running environment is configured to interact with an embedding running environment)), the method comprising: 
receiving, by a model running environment of the plurality of model running environments, an input of the machine learning model (paragraphs 0019, 0034-0036, and Fig. 1 teach “[t]he wide and deep machine learning model 102 receives a model input including multiple features” that are passed to (receiving) “the deep machine learning model 104” (by a model running environment of the plurality of model running environments) that are to be input to an included “deep neural network 130” (of the machine learning model)); 
sending, by the model running environment, a table query request to an embedding running environment of the plurality of embedding running environments, the table query request including the input , to request low-dimensional conversion of the input  (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “deep learning model 104” (from the model running environment) passes (sending) the input features (table query request) to an “embedding layer 150” (to an embedding running environment) of the embedding/transformation functions (of ; 
obtaining, by the embedding running environment, a plurality of vectors based on the table query request and the input  (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer 150” (by the embedding running environment) receiving input features (based on the table query request and the input) from the “deep learning model 104”, in order to process the “sparse, categorical features” as inputs (and the input) by using one or more embedding “lookup tables” (based on the table query request) for mapping “each of the features 110-114 to a respective numeric embedding, e.g., a floating point vector presentation of the feature (obtaining, by the embedding running environment, a plurality of vectors). The numeric embedding can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values”); 
generating a table query result based on the obtained plurality of vectors (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” outputting “numeric embeddings” (generating a table query result) of the input features, from using one or more embedding “lookup tables” for mapping the input features. Further, the “embedding layer” is taught to process the “sparse, categorical features” as inputs by , wherein the table query result is generated, at least in part, by combining the obtained plurality of vectors into a single vector (paragraphs 0045-0046 teach an embedding layer mapping inputs to vectors using “a single lookup table or multiple different look up tables” as an embedding function, and further “the embedding function may be a combining embedding function.  A combining embedding function maps each token in the list to a respective floating point vector (wherein the table query result is generated) and then merges the respective floating point vectors into a single merged vector (at least in part, by combining the obtained plurality of vectors into a single vector)”);
receiving, by the model running environment, the table query result returned by the embedding running environment, the table query result being obtained by the embedding running environment by querying an embedding table for the low-dimensional conversion that is associated with the machine learning model based on the input  (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” (by the embedding running environment) outputting “numeric embeddings” of the input features, from using one or more embedding “lookup tables” for mapping the input features (returned table query result being obtained by the ; and 
inputting, by the model running environment, the table query result into the machine learning model, and executing the machine learning model to complete model-based prediction (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” outputting the embedding “lookup table” mapped “numeric embeddings” (the table query result), that “can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values”, and the “deep learning model 104” passes the values (inputting, by the model running environment, the table query result) to the “deep neural network 130” (into the machine learning model) for generating “an alternative representation of the input, i.e., the deep model intermediate predicted output” (executing the machine learning model to complete model-based prediction)).
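
For illustration only, the “combining embedding function” quoted from Shaked above (each token mapped to a floating point vector via a lookup table, then merged into a single vector) might be sketched as follows. The table contents, the name combining_embedding, and the use of averaging as the merge rule are assumptions made for this sketch; the quoted passages do not specify them.

```python
from typing import Dict, List

# Hypothetical lookup table: token -> embedding vector (values are made up).
LOOKUP_TABLE: Dict[str, List[float]] = {
    "red": [1.0, 0.0],
    "green": [0.0, 1.0],
    "blue": [1.0, 1.0],
}


def combining_embedding(tokens: List[str]) -> List[float]:
    """Map each token to its vector, then merge the vectors into one.

    Averaging is an assumed merge rule, chosen only for illustration.
    """
    vectors = [LOOKUP_TABLE[token] for token in tokens]
    dim = len(vectors[0])
    # Component-wise mean over all looked-up vectors.
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Two vectors are obtained from the table and combined into a single vector.
merged = combining_embedding(["red", "blue"])
```

Under these assumptions, the single merged vector plays the role of the table query result that is passed on to the model.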

However, Shaked does not explicitly teach an input tensor and each embedding model being deployed in a respective embedding running environment of a plurality of embedding running environments, wherein each embedding model is a copy of a particular embedding model.
Yu teaches an input tensor and the claim’s tensor operations (Col. 6, line 37-Col. 7, line 12, and Fig. 2 teach a “one-hot vector” input (input tensor) to “the embedding layer 220” (embedding running environment), where “[t]he embedding layer 220 converts 305 the one-hot vector to a dense representation 221 in a lower dimensional space by multiplying it (sent table query request including the input tensor) with an embedding table (512×N) (table query result being obtained by the embedding running environment by querying an embedding table for the low-dimensional conversion), of which each row is a word embedding to be learned (that is associated with the machine learning model based on the input tensor). In embodiments, the resulting word embedding 221 is 310 then input to a first RNN 222” (inputting the table query result into the machine learning model), and the RNN outputs “223” an encoded word result (executing the machine learning model to complete model-based prediction)).
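
As a minimal sketch of the operation quoted from Yu above, multiplying a one-hot vector by an embedding table selects a single row of the table, which serves as the dense, lower-dimensional representation. The table values, its small size, its orientation (one row per vocabulary entry), and the helper names one_hot and vec_matmul are assumptions for illustration and are not taken from the reference.

```python
from typing import List


def one_hot(index: int, size: int) -> List[float]:
    """Build a one-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_matmul(vec: List[float], table: List[List[float]]) -> List[float]:
    """Multiply a 1 x N vector by an N x D table, giving a 1 x D vector."""
    dim = len(table[0])
    return [
        sum(vec[row] * table[row][col] for row in range(len(table)))
        for col in range(dim)
    ]


# Hypothetical embedding table: 3 vocabulary entries, 2 dimensions each.
EMBEDDING_TABLE = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

# The product of the one-hot vector and the table is the selected row.
dense = vec_matmul(one_hot(1, 3), EMBEDDING_TABLE)
```

In this sketch the multiplication is equivalent to a table lookup by index, which is why the dense result can be read as a query result against the embedding table.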
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Yu’s teachings of utilizing an “embedding table” in an “embedding layer” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” into Shaked’s teaching of an embedding layer transforming input features for further processing by a DNN, in order to format the input to the RNN so that smaller formatted data is processed faster and to help “regularize the word embedding table” (Yu, Col. 6, line 37-Col. 7, line 12 and Col. 8, lines 4-31).
Shaked as modified by Yu likewise does not explicitly teach each embedding model being deployed in a respective embedding running environment of a plurality of embedding running environments, wherein each embedding model is a copy of a particular embedding model.
Wu teaches each embedding model being deployed in a respective embedding running environment of a plurality of embedding running environments, wherein each embedding model is a copy of a particular embedding model (paragraphs 0005, 0061, 0079, 0101, 0142-0145, 0148, 0153, and Figs. 14-15 teach “nodes associated with embedding tables (each embedding model being) may be designated for evaluation on the remote machine (deployed in a respective embedding running environment of a plurality of embedding running environments).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (deployed in a respective embedding running environment of a plurality of embedding running environments), and nodes whose operations require access to these embedding tables may be designated for remote execution”, and further the embedding functions are all “initialized with weights” and specifically trained on a separate machine before being called for embedding operations (wherein each embedding model is a copy of a particular embedding model)).
Further, Shaked at least implies wherein each model running environment is configured to interact with an embedding running environment, however Wu teaches wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach the “trained (NN) ML model” executed on a “local machine” (wherein each model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (is configured to interact with).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (embedding running environment), and nodes whose operations require access to these embedding tables may be designated for remote execution (wherein each model running environment is configured to interact with an embedding running environment)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include machine learning operations executed on different hardware machines as taught by Wu in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).
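
The arrangement discussed above, in which a model running environment sends a query to one of several separately deployed embedding running environments that each hold a copy of the same embedding table, can be sketched in simplified form. The class names, the round-robin selection, the toy model, and the in-process method call standing in for a remote request are all assumptions of this sketch and are not drawn from Shaked, Yu, Wu, or the specification.

```python
from typing import Dict, List


class EmbeddingEnvironment:
    """Hypothetical stand-in for a separately deployed embedding environment."""

    def __init__(self, table: Dict[str, List[float]]) -> None:
        # Each environment holds its own copy of the same embedding table.
        self._table = {key: list(vec) for key, vec in table.items()}

    def query(self, features: List[str]) -> List[List[float]]:
        """Answer a table query request with one vector per input feature."""
        return [self._table[feature] for feature in features]


class ModelEnvironment:
    """Hypothetical model environment that queries embedding replicas."""

    def __init__(self, replicas: List[EmbeddingEnvironment]) -> None:
        self._replicas = replicas
        self.requests_sent = 0

    def predict(self, features: List[str]) -> float:
        # Assumed round-robin choice of replica; a real system might use RPC.
        replica = self._replicas[self.requests_sent % len(self._replicas)]
        self.requests_sent += 1
        vectors = replica.query(features)
        # Toy "model": sum every component of the returned vectors.
        return sum(sum(vec) for vec in vectors)


table = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
env = ModelEnvironment([EmbeddingEnvironment(table), EmbeddingEnvironment(table)])
first = env.predict(["a", "b"])
second = env.predict(["a", "b"])
```

Because both replicas are copies of the same table in this sketch, the two predictions agree regardless of which replica served the request.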

Regarding claim 2, the combination of Shaked, Yu, and Wu teach all the claim limitations of claim 1 above; and further teach wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (Shaked, paragraph 0072 teaches “[a] computer program (which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment”, where paragraph 0061 teaches the components are “internal components, e.g., the deep neural network and the set of embedding functions” (model running environment is…a virtual execution unit, and the embedding running environment is…a virtual execution unit)).
The combination at least implies wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (see mapping above), however Wu teaches wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (model running environment is…a virtual execution unit) and “nodes associated with embedding” (embedding running environment is…a virtual execution unit) that are executed on different machines including different server devices or “GPUs or CPUs” (model running environment is a physical execution unit/embedding running environment is a physical execution unit) based on certain “computational power” or “memory requirements” of the node).


Regarding claim 3, the combination of Shaked, Yu, and Wu teach all the claim limitations of claim 1 above; and further teach wherein the machine learning system includes at least one embedding running environment and at least one model running environment (Shaked, paragraphs 0005, 0008, and Fig. 1 teach “[t]his specification describes systems and methods for implementing a wide and deep machine learning model (machine learning system), i.e., a combined machine learning model that includes both a wide machine learning model and a deep machine learning model (includes…at least one model running environment)”, where “[t]he deep model may include an embedding layer” (includes at least one embedding running environment)), each embedding running environment implementing a single embedding model (Yu, Col. 6, line 37-Col. 7, line 12, and Fig. 2 teach an “embedding layer” (each embedding running environment) multiplying the input vector with an included “embedding table” (implementing a single embedding model)), and each model running environment implementing a single machine learning model (Shaked, paragraphs 0034-0036, and Fig. 1 teach a “deep neural network 130” (implementing a single machine learning model) executed in a “deep learning model 104” (each model running environment)).
Shaked and Yu are combinable for the same rationale as set forth above with respect to claim 1.

Regarding claim 4, Shaked teaches a machine learning method performed by a machine learning system executing on a plurality of running environments including a plurality of model running environments and a plurality of embedding running environments, wherein each model running environment is configured to interact with an embedding running environment  (paragraphs 0005, 0008, and Fig. 1 teach “[t]his specification describes systems and methods for implementing a wide and deep machine learning model (machine learning method performed by a machine learning system), i.e., a combined machine learning model that includes both a wide machine learning model and a deep machine learning model (executing on a plurality of running environments including a model running environment)”, where “[t]he deep model may include an embedding layer” (and an embedding running environment). Further, paragraphs 0034-0038, and Fig. 1 teach a , the method comprising: 
receiving, by a model running environment of the plurality of model running environments, an input to a machine learning model implemented on the model running environment (paragraphs 0019, 0034-0036, and Fig. 1 teach “[t]he wide and deep machine learning model 102 receives a model input including multiple features” that are passed to (receiving) “the deep machine learning model 104” (by a model running environment of the plurality of model running environments) that are to be input to an included (implemented on the model running environment) “deep neural network 130” (to a machine learning model)); 
sending, from the model running environment to an embedding running environment of the plurality of embedding environments, a request for converting the input (paragraphs 0034-0036 and Fig. 1 teach the “deep learning model 104” (from the model running environment) passes (sending) the input features to an “embedding layer 150” (to the embedding running environment) of the embedding/transformation functions (of the plurality of embedding running environments) for processing (a request for converting the input), including “embedding functions 124-128 appl[ying] a ; 
receiving, by the model running environment, a result returned from the embedding running environment, the result including a low-dimensional representation of the input (paragraphs 0034-0036 and Fig. 1 teach the “embedding layer” outputting “numeric embeddings” (a result) of the input features to the “deep neural network 130” in the “deep learning model 104” (receiving, by the model running environment); where the “embedding layer” processes “sparse, categorical features” as inputs and mapping “each of the features 110-114 to a respective numeric embedding, e.g., a floating point vector presentation of the feature. The numeric embedding can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (result including a low-dimensional representation of the input)); and 
feeding, by the model running environment, the low-dimensional representation into the machine learning model to perform model-based prediction (paragraphs 0034-0036 and Fig. 1 teach the “embedding layer” outputting “numeric embeddings”, that “can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (low-dimensional representation), and the “deep learning model 104” passes the values (feeding, by the model running environment) to the “deep neural network 130” (into the machine learning model) for generating “an alternative representation of the input, i.e., the deep model intermediate predicted output” (to perform model-based prediction)).

However, Shaked does not explicitly teach wherein each embedding running environment includes a copy of a particular embedding model.
Wu teaches wherein each embedding running environment includes a copy of a particular embedding model (paragraphs 0005, 0061, 0079, 0101, 0142-0145, 0148, 0153, and Figs. 14-15 teach “nodes associated with embedding tables (each embedding model being) may be designated for evaluation on the remote machine (deployed in a respective embedding running environment of a plurality of embedding running environments).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (deployed in a respective embedding running environment of a plurality of embedding running environments), and nodes whose operations require access to these embedding tables may be designated for remote execution”, and further the embedding functions are all “initialized with weights” and specifically trained on a separate machine before being called for embedding operations (wherein each embedding model is a copy of a particular embedding model)).
Further, Shaked at least implies wherein each model running environment is configured to interact with an embedding running environment and the result including a low-dimensional representation of the input, however Wu teaches wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach the “trained (NN) ML model” executed on a “local machine” (wherein each model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (is configured to interact with).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (embedding running environment), and nodes whose operations require access to these embedding tables may be designated for remote execution (wherein each model running environment is configured to interact with an embedding running environment)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Wu’s teachings of machine learning operations executed on different hardware machines into Shaked’s teaching of an embedding layer transforming input features for further processing by a DNN in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).
Additionally, Yu teaches the result including a low-dimensional representation of the input (Col. 6, line 37-Col. 7, line 12, Col. 9, line 32-Col. 10, line 4, and Figs. 2 and 5 teach a “one-hot vector” input to “the embedding layer 220”, where “[t]he embedding layer 220 converts 305 the one-hot vector to a dense representation 221 in a lower dimensional space (result including a low-dimensional representation of the input) by multiplying it with an embedding table (512×N), of which each row is a word embedding to be learned”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by machine learning operations executed on different hardware machines as taught by Wu, to include utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, in order to format the input to the RNN so that smaller formatted data is processed faster and to help “regularize the word embedding table” (Yu, Col. 6, line 37-Col. 7, line 12 and Col. 8, lines 4-31).

Regarding claim 5, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 4 above, and further teach wherein the sending, from the model running environment to the embedding running environment, a request for converting the input includes: 
sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node (Wu, paragraphs 0143-0145 and Figs. 14-15 teach “a trained (NN) ML model” executed on a “local machine” (model running environment); and “since dense feature inputs may be large, the transfer of the large dense feature input from the local machine to the remote machine may slow down the computer network.  In this case, the node that applies embedding (embedding running environment) to the dense input may sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node), and the local machine may be configured to hold trained matrices needed for generating low-dimensional representations of dense features, such as by embedding”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.); or 
sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes (Wu, paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach “a trained (NN) ML model” executed on a “local machine” (model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (different physical nodes).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (different physical nodes), and nodes whose operations require access to these embedding tables may be designated for remote execution (sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).


Regarding claim 6, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 4 above, and further teach wherein the plurality of running environments includes at least two model running environments and different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (plurality of running environments includes at least two model running environments) that are executed on different machines including different server devices (hardware resources) or “GPUs or CPUs” within the same device (hardware resources) based on certain “computational power” or “memory requirements” of the node (different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include machine learning operations executed on different hardware machines as taught by Wu in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).

Regarding claim 7, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 6 above and further teach wherein the hardware resource includes at least one of a central processing unit or a hardware accelerator (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes that are executed on different machines including different server devices or “GPUs (hardware resource includes at least one of a…hardware accelerator) or CPUs (hardware resource includes at least one of a central processing unit)” based on certain “computational power” or “memory requirements” of the node).


Regarding claim 8, the combination of Shaked, Yu, and Wu teach all the claim limitations of claim 7 above and further teach wherein the hardware accelerator includes at least one of a graphics processing unit, a field-programmable gate array, or an application-specific integrated circuit chip designed for a specific purpose (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes that are executed on different machines including different server devices or “GPUs (hardware accelerator includes at least one of a graphics processing unit) or CPUs” based on certain “computational power” or “memory requirements” of the node).
Shaked, Yu, and Wu are combinable for the same rationale as set forth above with respect to claim 7.

Regarding claim 9, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 4 above; and further teach wherein the machine learning model includes at least one of a deep neural network model, a Wide & Deep model, or a Deep Factorization Machine model (Shaked, paragraphs 0034-0036, and Fig. 1 teach a “deep neural network 130” (the machine learning model includes at least one of a deep neural network model)).

Regarding claim 10, Shaked teaches a machine learning system comprising a plurality of embedding running environments and a plurality of model running environments, and wherein each model running environment of the plurality of model running environments is configured to interact with an embedding running environment, and wherein a machine learning model is deployed in each model running environment of the plurality of model running environments (paragraphs 0005, 0008, 0034-0036, and Fig. 1 teach “[t]his specification describes systems and methods for implementing a wide and deep machine learning model (machine learning system), i.e., a combined machine learning model that includes both a wide machine learning model and a deep machine learning model (comprising…a model running environment)”; where “[t]he deep model may include an embedding layer” (comprising an embedding running environment) with embedding functions (an embedding model), and a “deep neural network 130” (machine learning model) executed in a “deep learning model 104” (deployed in a model running environment). Further, paragraphs 0034-0038, and Fig. 1 teach a “deep neural network 130” (machine learning model) executed in a “deep learning model 104” (model running environment) and a “generalized linear model” (machine learning model) executed in a “wide learning model” (model running environment), and further the DNN and GLM correspond to an “embedding layer” or “cross-product feature transformation” (each model running environment is configured to interact with an embedding running environment) and the model results are combined); 
wherein a model running environment of the plurality of model running environments is configured to receive an input for the machine learning model (paragraphs 0019, 0034-0036, and Fig. 1 teach “[t]he wide and deep machine learning model 102 receives a model input including multiple features” that are passed to (receiving) “the deep machine learning model 104” (by the model running environment of the plurality of model running environments) that are to be input to an included “deep neural network 130” (of the machine learning model)), send a table query request including the input to an embedding running environment of the plurality of embedding environments (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “deep learning model 104” (from the model running environment) passes (sending) the input features (table query request) to an “embedding layer 150” (to the embedding running environment) of the embedding/transformation functions (of the plurality of embedding running environments). The “embedding layer” processes the “sparse, categorical features” as inputs by using one or more embedding “lookup tables” (table query request including the input) for mapping “each of the features 110-114 to a respective numeric embedding, e.g., a floating point vector presentation of the feature. The numeric embedding can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (request low-dimensional conversion of the input)), receive a response including a low-dimensional converted value of the input from the embedding running environment (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding , and feed the low-dimensional converted value into the machine learning model to execute model-based prediction (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” outputting the embedding “lookup table” mapped “numeric embeddings”, that “can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (low-dimensional converted value), and the “deep learning model 104” passes the values (model running environment feeds the low-dimensional converted value) to the “deep neural network 130” (into the machine learning model) for generating “an alternative representation of the input, i.e., the deep model intermediate predicted output” (execute model-based prediction)); and 
wherein the embedding running environment is configured to perform embedding query based on the input to obtain the low-dimensional converted value by obtaining a plurality of vectors based on the input and generating the response based, at least in part, on combining the plurality of vectors into a single vector (paragraphs 0034-0036, 0045-0046, and Fig. 1 teach an embedding layer mapping inputs, received from the “deep learning model 104” (based on the input), to vectors (obtaining a plurality of vectors) using “a single lookup table or multiple different look up tables” as an embedding function, and further “the embedding function may be a combining embedding function.  A combining embedding function maps each token in the list to a respective floating point vector (obtaining a plurality of vectors based on the input) and then merges the respective floating point vectors into a single merged vector (and generating the response based, at least in part, on combining the plurality of vectors into a single vector)”), and send a response including the low-dimensional converted value back to the model running environment (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” (by the embedding running environment) outputting “numeric embeddings” of the input features, from using one or more embedding “lookup tables” for mapping the input features, to the “deep neural network 130” in the “deep learning model 104” (send a response including the…converted value back to the model running environment). Further, the “embedding layer” (embedding running environment) is taught to process the “sparse, categorical features” as inputs by using one or more embedding “lookup tables” for mapping (configured to perform embedding query based on the input) “each of the features 110-114 to a respective numeric embedding, e.g., a floating point vector presentation of the feature. The numeric embedding can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (response including the [obtained] low-dimensional converted value)).
However, Shaked does not explicitly teach wherein each embedding running environment of the plurality of embedding running environments includes a copy of a particular embedding model.
Wu teaches wherein each embedding running environment includes a copy of a particular embedding model (paragraphs 0005, 0061, 0079, 0101, 0142-0145, 0148, 0153, and Figs. 14-15 teach “nodes associated with embedding tables (each embedding model being) may be designated for evaluation on the remote machine (deployed in a respective embedding running environment of a plurality of embedding running environments).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (deployed in a respective embedding running environment of a plurality of embedding running environments), and nodes whose operations require access to these embedding tables may be designated for remote execution”, and further the embedding functions are all “initialized with weights” and specifically trained on a separate machine before being called for embedding operations (wherein each embedding model is a copy of a particular embedding model)).
Further, Shaked at least implies wherein each model running environment of the plurality of model running environments is configured to interact with an embedding running environment and a response including a low-dimensional converted value of the input; however, Wu teaches wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach the “trained (NN) ML model” executed on a “local machine” (wherein each model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (is configured to interact with).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (embedding running environment), and nodes whose operations require access to these embedding tables may be designated for remote execution (wherein each model running environment is configured to interact with an embedding running environment)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Wu’s teachings of machine learning operations executed on different hardware machines into Shaked’s teaching of an embedding layer transforming input features for further processing by a DNN in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).
Additionally, Yu teaches a response including a low-dimensional converted value of the input (Col. 6, line 37-Col. 7, line 12, Col. 9, line 32-Col. 10, line 4, and Figs. 2 and 5 teach a “one-hot vector” input to “the embedding layer 220”, where “[t]he embedding layer 220 converts 305 the one-hot vector to a dense representation 221 in a lower dimensional space (response including a low-dimensional converted value of the input) by multiplying it with an embedding table (512.times.N), of which each row is a word embedding to be learned. In embodiments, the resulting word embedding 221 is 310 then input to a first RNN 222”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by machine learning operations executed on different hardware machines as taught by Wu, to include utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu in order to format the input to the RNN so that smaller formatted data is processed faster and to help “regularize the word embedding table” (Yu, Col. 6, line 37-Col. 7, line 12 and Col. 8, lines 4-31).
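
For illustration only, the two embedding operations relied upon above can be sketched as follows: multiplying a one-hot vector by an embedding table (as quoted from Yu), and merging several looked-up vectors into a single vector (as quoted from Shaked). This is a minimal sketch under stated assumptions, not code from either reference: the function names and the toy 3-row table are hypothetical, and the element-wise mean is only one assumed way of merging, since the quoted passage does not specify the merge operation.

```python
def one_hot(index, size):
    """Build a one-hot vector of the given size with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def times_table(vec, table):
    """Multiply a 1 x V row vector by a V x N embedding table."""
    n_cols = len(table[0])
    return [
        sum(vec[r] * table[r][c] for r in range(len(table)))
        for c in range(n_cols)
    ]


def combine(vectors):
    """Merge several vectors into one; the mean is an assumed merge rule."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# Toy table: each row is the low-dimensional embedding of one category.
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# One-hot times table selects row 1 of the table.
dense = times_table(one_hot(1, 3), table)

# Combining the embeddings of categories 0 and 2 yields a single vector.
merged = combine([table[0], table[2]])
```

The sketch shows that the one-hot multiplication is equivalent to selecting a row of the table, which is why it can also be described as a table lookup.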

Regarding claim 11, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 10 above; and further teach wherein the embedding running environment and the model running environment each are a physical execution unit or a virtual execution unit (Shaked, paragraph 0072 teaches “[a] computer program (which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment”, where paragraph 0061 teaches the components are “internal components, e.g., the deep neural network and the set of embedding functions” (model running environment is…a virtual execution unit, and the embedding running environment is…a virtual execution unit)).
The combination at least implies wherein the embedding running environment and the model running environment each are a physical execution unit or a virtual execution unit (see mapping above), however Wu teaches wherein the embedding running environment and the model running environment each are a physical execution unit or a virtual execution unit (paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (model running environment is…a virtual execution unit) and “nodes associated with embedding” (embedding running environment is…a virtual execution unit) that are executed on different machines including different server devices or “GPUs or CPUs” (model running environment is a physical execution unit/embedding running environment is a physical execution unit) based on certain “computational power” or “memory requirements” of the node).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include embedding operations of inputs either on the local machine or a remote machine as taught by Wu in order to reduce calculation time since a “transfer of the large dense feature input from the local machine to the remote machine may slow down the computer network”, or more efficiently manage calculations due to memory constraints since “[u]se of embedding tables may require higher amounts of memory. Therefore, optionally, nodes associated with embedding tables may be designated for evaluation on the remote machine” (Wu, paragraphs 0145 and 0148).

Regarding claim 12, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 10 above; and further teach comprising at least one embedding running environment and at least one model running environment (Shaked, paragraphs 0005, 0008, and Fig. 1 teach “[t]his specification describes systems and methods for implementing a wide and deep machine learning model (machine learning system), i.e., a combined machine learning model that includes both a wide machine learning model and a deep machine learning model (comprising…at least one model running environment)”, where “[t]he deep model may include an embedding layer” (comprising at least one embedding running environment)), each embedding running environment implementing at least one embedding model (Shaked, paragraphs 0034-0036, and Fig. 1 teach embedding functions (implementing at least one embedding model) in an “embedding layer 150” (each embedding running environment)), and each model running environment implementing at least one machine learning model (Shaked, paragraphs 0034-0036, and Fig. 1 teach a “deep neural network 130” (implementing at least one machine learning model) executed in a “deep learning model 104” (each model running environment)).

Regarding claim 13, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 12 above, and further teach wherein different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (model running environments) that are executed on different machines including different server devices (hardware resources) or “GPUs or CPUs” within the same device (hardware resources) based on certain “computational power” or “memory requirements” of the node (different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include machine learning operations executed on different hardware machines as taught by Wu in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).
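
For illustration only, adapting hardware resources to the running requirements of each operation, as mapped to Wu above, can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Wu: the function `pick_device`, the device names, and the memory figures are hypothetical, and choosing the smallest device that fits is an assumed placement rule.

```python
def pick_device(required_memory_gb, devices):
    """Pick the smallest device whose memory meets the requirement.

    `devices` maps a device name to its available memory in GB.
    """
    candidates = [
        (memory, name)
        for name, memory in devices.items()
        if memory >= required_memory_gb
    ]
    if not candidates:
        raise ValueError("no device satisfies the requirement")
    # min() compares memory first, so the tightest fit is chosen.
    return min(candidates)[1]


# Hypothetical hardware resources and per-operation memory requirements.
devices = {"cpu-server": 64, "gpu-0": 16}
placement = {
    "embedding_lookup": pick_device(40, devices),  # large embedding tables
    "dense_layers": pick_device(8, devices),       # smaller dense model
}
```

Here the memory-heavy embedding lookup is placed on the larger server while the smaller dense layers fit on the accelerator.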

Regarding claim 14, Shaked teaches a non-transitory storage medium storing contents that, when executed by one or more processors, cause the one or more processors to perform actions comprising (paragraphs 0008 and 0070-:
receiving, by a model running environment of a plurality of model running environments included in a machine learning system, an input to a machine learning model implemented on the model running environment, wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0019, 0034-0036, and Fig. 1 teach “[t]he wide and deep machine learning model 102 (machine learning system) receives a model input including multiple features” that are passed to (receiving) “the deep machine learning model 104” (by the model running environment) and a “wide learning model” (of a plurality of model running environments) that are to be input to an included (implemented on the model running environment) “deep neural network 130” (to a machine learning model); and further the DNN and GLM correspond to an “embedding layer” or “cross-product feature transformation” (each model running environment is configured to interact with an embedding running environment) and the model results are combined (alternatively, each model running environment is configured to interact with an embedding running environment)); 
sending, from the model running environment to an embedding running environment of a plurality of embedding environments included in the machine learning system, a request for converting the input; 
obtaining, by the embedding running environment, a plurality of vectors based on the table query request and the input (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer 150” (by the embedding running environment) receiving input features (based on the table query request and the input) from the “deep learning model 104”, in order to process the “sparse, categorical features” as inputs (and the input) by using one or more embedding “lookup tables” (based on the table query request) for mapping “each of the features 110-114 to a respective numeric embedding, e.g., a floating point vector presentation of the feature (obtaining, by the embedding running environment, a plurality of vectors). The numeric embedding can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values”); 
generating a table query result based on the obtained plurality of vectors (paragraphs 0034-0036, 0044-0046, and Fig. 1 teach the “embedding layer” outputting “numeric embeddings” (generating a table query result) of the input features, from using one or more embedding “lookup tables” for mapping the input features), wherein the table query result is generated, at least in part, by combining the obtained plurality of vectors into a single vector (paragraphs 0045-0046 teach an embedding layer mapping inputs to vectors using “a single lookup table or multiple different look up tables” as an embedding function, and further “the embedding function may be a combining embedding function.  A combining embedding function maps each token in the list to a respective floating point vector (wherein the table query result is generated) and then merges the respective floating point vectors into a single merged vector (at least in part, by combining the obtained plurality of vectors into a single vector)”);
receiving, by the model running environment, a result returned from the embedding running environment, the result including a low-dimensional representation of the input (paragraphs 0034-0036 and Fig. 1 teach the “embedding layer” outputting “numeric embeddings” (a result) of the input features to the “deep neural network 130” in the “deep learning model 104” (receiving, by the model running environment); where the “embedding layer” processes “sparse, categorical features” as ; and 
feeding, by the model running environment, the low-dimensional representation into the machine learning model to perform model-based prediction (paragraphs 0034-0036 and Fig. 1 teach the “embedding layer” outputting “numeric embeddings”, that “can include one or more floating point values or one or more quantized integer values whose encoding represents floating point values” (low-dimensional representation), and the “deep learning model 104” passes the values (feeding, by the model running environment) to the “deep neural network 130” (into the machine learning model) for generating “an alternative representation of the input, i.e., the deep model intermediate predicted output” (to perform model-based prediction)).
However, Shaked does not explicitly teach wherein each embedding running environment of the plurality of embedding running environments includes a copy of a particular embedding model.
Wu teaches wherein each embedding running environment of the plurality of embedding running environments includes a copy of a particular embedding model (paragraphs 0005, 0061, 0079, 0101, 0142-0145, 0148, 0153, and Figs. 14-15 teach “nodes associated with embedding tables (each embedding model being) may be designated for evaluation on the remote machine (deployed in a respective embedding running environment of a plurality of embedding running environments).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (deployed in a respective embedding running environment of a plurality of embedding running environments), and nodes whose operations require access to these embedding tables may be designated for remote execution”, and further the embedding functions are all “initialized with weights” and specifically trained on a separate machine before being called for embedding operations (wherein each embedding model is a copy of a particular embedding model)).
Further, Shaked at least implies wherein each model running environment is configured to interact with an embedding running environment and the result including a low-dimensional representation of the input, however Wu teaches wherein each model running environment is configured to interact with an embedding running environment (paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach the “trained (NN) ML model” executed on a “local machine” (wherein each model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (is configured to interact with).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (embedding running environment), and nodes whose operations require access to these embedding tables may be designated for remote execution (wherein each model running environment is configured to interact with an embedding running environment)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).

	Additionally, Yu teaches the result including a low-dimensional representation of the input (Col. 6, line 37-Col. 7, line 12, Col. 9, line 32-Col. 10, line 4, and Figs. 2 and 5 teach a “one-hot vector” input to “the embedding layer 220”, where “[t]he embedding layer 220 converts 305 the one-hot vector to a dense representation 221 in a lower dimensional space (result including a low-dimensional representation of the input) by multiplying it with an embedding table (512.times.N), of which each row is a word embedding to be learned. In embodiments, the resulting word embedding 221 is 310 then input to a first RNN 222”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by machine learning operations executed on different hardware machines as taught by Wu, to include utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu in order to format the input to the RNN so that smaller formatted data is processed faster and to help “regularize the word embedding table” (Yu, Col. 6, line 37-Col. 7, line 12 and Col. 8, lines 4-31).

Regarding claim 15, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 14 above; and further teach wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (Shaked, paragraph 0072 teaches “[a] computer program (which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment”, where paragraph 0061 teaches the components are “internal components, e.g., the deep neural network and the set of embedding functions” (model running environment is…a virtual execution unit, and the embedding running environment is…a virtual execution unit)).
The combination at least implies wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (see mapping above), however Wu teaches wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit (paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (model running environment is…a virtual execution unit) and “nodes associated with embedding” (embedding running environment is…a virtual execution unit) that are executed on different machines including different server devices or “GPUs or CPUs” (model running environment is a physical execution unit/embedding running environment is a physical execution unit) based on certain “computational power” or “memory requirements” of the node).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include embedding operations of inputs either on the local machine or a remote machine as taught by Wu in order to reduce calculation time since a “transfer of the large dense feature input from the local machine to the remote machine may slow down the computer network”, or more efficiently manage calculations due to memory constraints since “[u]se of embedding tables may require higher amounts of memory. Therefore, optionally, nodes associated with embedding tables may be designated for evaluation on the remote machine” (Wu, paragraphs 0145 and 0148).

Regarding claim 16, the combination of Shaked, Wu, and Yu teach all the claim limitations of claim 14 above, and further teach wherein the sending, from the model running environment to the embedding running environment, a request for converting the input includes: 
sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node (Wu, paragraphs 0143-0145 and Figs. 14-15 teach “a trained (NN) ML model” executed on a “local machine” (model running environment); and “since dense feature inputs may be large, the transfer of the large dense feature input from the local machine to the remote machine may slow down the computer network.  In this case, the node that applies embedding (embedding running environment) to the dense input may be designated for execution on the local machine (sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node), and the local machine may be configured to hold trained matrices needed for generating low-dimensional representations of dense features, such as by embedding”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.); or 
sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes (Wu, paragraphs 0143-0145, 0148, 0153, and Figs. 14-15 teach “a trained (NN) ML model” executed on a “local machine” (model running environment); and “nodes associated with embedding tables (embedding running environment) may be designated for evaluation on the remote machine (different physical nodes).  That is, (select) embedding tables may be kept on the remote machine (e.g., remote predictor) (different physical nodes), and nodes whose operations require access to these embedding tables may be designated for remote execution (sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes)”. Further paragraphs 0005, 0143, and 0153 teach the machines executing different operational portions can be computing devices including “servers”, and/or “GPUs or CPUs” of the computing devices.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include embedding operations of inputs either on the local machine or a remote machine as taught by Wu in order to reduce calculation time since a “transfer of the large dense feature input from the local machine to the remote machine may slow down the computer network”, or more efficiently manage calculations due to memory constraints since “[u]se of embedding tables may require higher amounts of memory. Therefore, optionally, nodes associated with embedding tables may be designated for evaluation on the remote machine” (Wu, paragraphs 0145 and 0148).

Regarding claim 17, the combination of Shaked, Wu, and Yu teaches all the claim limitations of claim 14 above, and further teaches wherein the machine learning system includes a plurality of model running environments and different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes (machine learning system includes a plurality of model running environments) that are executed on different machines including different server devices (hardware resources) or “GPUs or CPUs” within the same device (hardware resources) based on certain “computational power” or “memory requirements” of the node (different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify an embedding layer transforming input features for further processing by a DNN, as taught by Shaked as modified by utilizing an “embedding table” for converting an input “one-hot vector” into a “dense representation 221 in a lower dimensional space” as taught by Yu, to include machine learning operations executed on different hardware machines as taught by Wu in order to increase efficiency of resource management through expediting the operations to specific hardware resources based on certain “computational power” or “memory requirements” of the operations (Wu, paragraphs 0061 and 0143).

Regarding claim 18, the combination of Shaked, Wu, and Yu teaches all the claim limitations of claim 17 above, and further teaches wherein the hardware resource includes at least one of a central processing unit or a hardware accelerator (Wu, paragraphs 0005, 0061, 0143-0145, and Figs. 14-15 teach “a trained (NN) ML model” operational nodes that are executed on different machines including different server devices or “GPUs (hardware resource includes at least one of a…hardware accelerator) or CPUs (hardware resource includes at least one of a central processing unit)” based on certain “computational power” or “memory requirements” of the node).
Shaked, Yu, and Wu are combinable for the same rationale as set forth above with respect to claim 17.

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Boufounos et al. (US Pub 20130114811) teaches different clients obtaining “a copy of the embedding parameters” for encryption operations.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/C.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123