Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
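For illustration only, the three-prong test and the two rebuttable presumptions described above can be sketched as a small decision function. This is a minimal Python sketch, not part of the MPEP: the names `ClaimLimitation`, `presumed_112f`, and `invokes_112f` are hypothetical, and each boolean field stands in for a determination that in practice requires legal judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimLimitation:
    """Hypothetical summary of one claim limitation (illustrative only)."""
    uses_means_or_step: bool        # literal word "means" or "step"
    uses_generic_placeholder: bool  # nonce term substituting for "means"
    has_functional_language: bool   # e.g. linked by "for" or "configured to"
    has_sufficient_structure: bool  # structure, material, or acts for the function


def presumed_112f(lim: ClaimLimitation) -> bool:
    """Starting presumption: raised by "means"/"step" used with functional language."""
    return lim.uses_means_or_step and lim.has_functional_language


def invokes_112f(lim: ClaimLimitation) -> bool:
    """Apply prongs (A), (B), and (C); the result may rebut the presumption."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.has_sufficient_structure
    return prong_a and prong_b and prong_c
```

Under these assumptions, a limitation that uses "means" but recites sufficient structure starts with the presumption in favor of 35 U.S.C. 112(f) yet fails prong (C), while a limitation that uses only a generic placeholder with functional language and no sufficient structure starts with the opposite presumption yet satisfies all three prongs.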
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“the computing device and the remote device are configured to” in claims 18 and 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.  The computing device is given structure in at least [¶0223] of the published instant specification: "The computing device (e.g., 201 or 203) has an operating system (e.g., 213 or 233), including instructions, which when executed by the at least one microprocessor (e.g., 215 or 235), cause the computing device to borrow an amount of memory from a lender device (e.g., 203 or 245) over a network connection (205) using a communication device (e.g., 217 or 237)." The remote device is given structure in at least [¶0313] of the published instant specification, in combination with FIG. 2 showing the lender device: "The computing device and the remote device can be the device (621) and the appliance/server (623) illustrated in FIGS. 26-31. For example, the computing device and the remote device can be the borrower device (201) and the lender device (203)".
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 10-17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 10, "computing device" in "storing, for the remote device, a second portion of the artificial neural network in the memory of computing device" lacks antecedent basis.  It is unclear whether this computing device is "the computing device" previously recited or any other computing device.  "The computing device" is recommended.

Claims 11-17 are rejected based on their dependence on rejected claim 10.

Claim Rejections - 35 USC § 101
101 Rejection
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-21 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1:  Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 1 recites a computer-implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the limitations in the context of this claim encompass neural network processing, including the following:
executing an application in the computing device to generate an output of the first portion of the artificial neural network, wherein the second portion of the artificial neural network is configured to receive the output of the first portion of the artificial neural network as input to generate an output of the second portion of the artificial neural network (mathematical calculation)
generating, in the computing device, a result corresponding to the output of the second portion of the artificial neural network (mathematical calculation)
Therefore, claim 1 recites an abstract idea which is a judicial exception.
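Purely as an illustration of the data flow described in the two limitations quoted above (the output of the first portion serving as the input of the second portion), the arrangement can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class names, the single-layer "portions", the ReLU activation, and the weights are all hypothetical, and the remote device is simulated by a local object rather than reached over a network. It is not drawn from the application or from any cited reference.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu_layer(weights: Matrix, inputs: Vector) -> Vector:
    """One dense layer with ReLU activation: max(0, W @ x) per output row."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]


class RemoteDevice:
    """Stand-in for the remote device; holds the second portion in its memory."""

    def __init__(self, second_portion: Matrix) -> None:
        self.memory = {"second_portion": second_portion}


class ComputingDevice:
    """Holds the first portion locally and reads the second portion remotely."""

    def __init__(self, first_portion: Matrix, remote: RemoteDevice) -> None:
        self.local_memory = {"first_portion": first_portion}
        self.remote = remote  # assumption: a network connection in a real system

    def run(self, inputs: Vector) -> Vector:
        # Output of the first portion, computed from local memory.
        hidden = relu_layer(self.local_memory["first_portion"], inputs)
        # Access a portion of the remote device's memory.
        second = self.remote.memory["second_portion"]
        # The first portion's output is the second portion's input.
        return relu_layer(second, hidden)
```

For example, with a hypothetical first portion `[[1.0, 0.0], [0.0, -1.0]]`, second portion `[[2.0, 3.0]]`, and input `[1.0, 2.0]`, the first portion yields `[1.0, 0.0]` and the second portion yields `[2.0]`.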
Step 2A Prong Two Analysis:  Claim 1 recites additional elements “a computing device”, “a remote device”, and “memory”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 1 also recites additional elements “performing learning by an autoencoder” which amounts to generally linking the judicial exception to a particular technology or field of use.  Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis:  Claim 1 recites additional elements “storing a first portion of an artificial neural network in local memory of the computing device, wherein a second portion of the artificial neural network in the computing device is stored in memory of a remote device, and the remote device and the computing device are connected via a wired or wireless network connection” and “accessing, by the computing device, at least a portion of the memory of the remote device” which are well-understood, routine, and conventional (See MPEP 2106.05(d)(II)(iv), storing and retrieving information in memory; Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).
As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claim 10, which recites a method, as well as to dependent claims 2-9 and 11-17. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 2 recites additional accessing of memory “wherein the accessing of the portion of the memory of the remote device includes accessing the second portion of the artificial neural network stored in the portion of the memory of the remote device;” which is well-understood, routine, and conventional.  Claim 2 also recites additional mathematical calculations “applying, by the computing device, the input to the second portion of the artificial neural network to generate the output of the second portion of the artificial neural network.”
Dependent claim 3 recites additional memory accessing “storing, by the computing device, the first portion of the artificial neural network in the memory of the remote device”, “mapping, in the computing device, a first virtual address region for accessing the first portion of the artificial neural network to the remote device”, and “mapping, in the computing device, a second virtual address region for accessing the first portion of the artificial neural network to the local memory of the computing device”, as well as additional general data gathering “receiving, in the computing device from the remote device, the second portion of the artificial neural network” which is insignificant extra-solution activity (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015)).
Dependent claim 4 recites additional accessing of memory “wherein the accessing of the portion of the memory of the remote device includes accessing an alternative module as a substitute of the second portion of the artificial neural network;” which is well-understood, routine, and conventional.  Claim 4 also recites additional mathematical calculations “the result is generated by applying the input to the alternative module.”
Dependent claim 5 recites additional insignificant extra-solution activity “the alternative module is configured to receive a user input in assisting computation of the result.” which amounts to gathering data (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015)).
Dependent claims 6 and 13 recite additional mathematical relationships “the alternative module includes a simplified model of the second portion of the artificial neural network.”
Dependent claim 7 recites additional insignificant extra-solution activity “receiving the alternative module in response to a prediction of degradation in the wired or wireless network connection.” which amounts to gathering data (See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015)).
Dependent claim 8 recites mental processes “requesting, by the computing device, the remote device to apply the input to the second portion of the artificial neural network” and additional memory accessing “wherein accessing, by the computing device, at least a portion of the memory of the remote device includes accessing the output of the second portion of the artificial neural network generated on the remote device.”
Dependent claim 9 recites additional mental processes “detecting degradation in the wired or wireless network connection” and “in response to the degradation, skipping in the application computation involving the second portion of the artificial neural network” which amounts to observation, evaluation, and judgement.
Dependent claim 11 recites additional memory accessing “detecting degradation in the wired or wireless network connection”, and “wherein the remote device is configured to access the second portion of the artificial neural network stored in the memory of the computing device in applying the input to the second portion of the artificial neural network.”
Dependent claim 12 recites additional insignificant extra-solution activity “transmitting to the remote device an alternative module as a substitute of the second portion of the artificial neural network, wherein the remote device is configured to replace processing by the second portion of the artificial neural network with processing by the alternative module when the connection degrades.” which amounts to outputting data.
Dependent claim 14 recites additional mathematical relationships and calculations “generating the simplified model of the second portion of the artificial neural network based on a usage pattern of the artificial neural network in the application.”
Dependent claim 15 recites additional mental processes “predicting the usage pattern for a time period during which the connection is predicted to degrade.” which amounts to observation, evaluation, and judgement.
Dependent claim 16 recites additional insignificant extra-solution activity “receiving, in the computing device, a request from the remote device to apply the input to the second portion of the artificial neural network” and “providing the output of the second portion of the artificial neural network to the remote device.” which amounts to gathering and outputting data.  Claim 16 also recites additional mathematical calculations “performing, by the computing device, computation of the second portion of the artificial neural network to generate the output of the second portion of the artificial neural network.”
Dependent claim 17 recites additional mathematical calculations “updating the second portion of the artificial neural network using a machine learning technique in a time period during which the first portion of the artificial neural network is being used in the application in the remote device.”

Regarding Claim 18:  Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis:  Claim 18 recites a computer-implemented method of processing neural networks, which, under its broadest reasonable interpretation is a series of mental processes.  For example, but for the generic computer components language, the limitations in the context of this claim encompass neural network processing.
Therefore, claim 18 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis:  Claim 18 recites additional elements “a computing device”, “a remote device”, “a communication device” and “memory”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component.  An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application.  Claim 18 also recites additional elements “performing learning by an autoencoder” which amounts to generally linking the judicial exception to a particular technology or field of use.  Therefore, claim 18 is directed to a judicial exception.
Step 2B Analysis:  Claim 18 recites additional elements “wherein the remote device is configured to provide, to the computing device, access to the local memory of the remote device” and “the computing device and the remote device are configured to collaborate in storing the artificial neural network and in processing based on the artificial neural network” which are well-understood, routine, and conventional (See MPEP 2106.05(d)(II)(iv), storing and retrieving information in memory; Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).  Connecting two computing devices through a communication device is also well-known (See Murai “Design and Implementation of the S & T Net Software” (1986) [p. 1 §1] “The Keio S & T net [4] is a computer network constructed by a local area network using Acknowledging Ethernet [8, 9] and Unix… Well-known connection achieves communication for the entrance of each node pre-initialized by the datagram and is the communication format for the network management application program”).
As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 18 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
For the reasons above, claim 18 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 19-21. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 19 recites additional insignificant extra-solution activity “the computing device is configured to store the first portion of the artificial neural network and process input to the artificial neural network using the first portion of the artificial neural network;” and “the remote device is configured to store the second portion of the artificial neural network and process output from the first portion of the artificial neural network using the second portion of the artificial neural network” which amounts to gathering and outputting data.  Claim 19 also recites additional mathematical relationships and calculations “the artificial neural network is partitioned into a first portion and a second portion.”
Dependent claim 20 recites additional mathematical relationships and calculations “an alternative module is configured as a substitute of the second portion of the artificial neural network when a connection between the computing device and the remote device degrades.”
Dependent claim 21 recites additional insignificant extra-solution activity “wherein the communication device of the computing device and the communication device of the remote device are connected via a fifth generation cellular network.” which amounts to selection of a data-type (See MPEP 2106.05(g), for example Ameranth, 842 F.3d at 1241-43, 120 USPQ2d at 1854-55).

Therefore, when considering the elements separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-21 are rejected under 35 U.S.C. § 101.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


	Claims 1-2, 4-6, 8, 10-11, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US10764125B2) in view of Burger (US20200265301A1).

	Regarding claim 1, Zhang teaches A method implemented in a computing device, the method comprising: storing a first portion of an [artificial neural network] in local memory of the computing device, ([Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." See FIG. 1A.  Slave node 1 interpreted as first slave node containing a first portion (sub-model) of the artificial neural network in local memory.)
	wherein a second portion of the [artificial neural network] in the computing device is stored in memory of a remote device, ([Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." [Col. 11 l. 50-59] "If the trained master model 1 needs to be sent to a user, the first parameter server and the second parameter server may respectively send the sub-model 1 and the sub-model 2 to a master node 101, and the master node 101 integrates the sub-model 1 and the sub-model 2, and sends a model obtained through integration to the user." See FIG. 1A.  Slave node 2 interpreted as remote device containing a second portion (sub-model) of the artificial neural network in local memory.)
	executing an application in the computing device to generate an output of the first portion of the [artificial neural network], wherein the second portion of the artificial neural network is configured to receive the output of the first portion of the artificial neural network as input to generate an output of the second portion of the [artificial neural network]; ([Col. 1 l. 46-63] " The method includes: receiving, by a parameter server in a first slave node, a training result sent by a parameter client in at least one slave node in the distributed system; and updating, by the parameter server in the first slave node based on the received training result, a sub-model stored on the parameter server in the first slave node." Parameter server interpreted as an application in the computing device. Training result interpreted as synonymous with output of a first portion of the ANN. Updating based on received results interpreted as synonymous with applying the input to the second portion.)
	accessing, by the computing device, at least a portion of the memory of the remote device; and ([Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." Obtaining data from memory interpreted as synonymous with accessing at least a portion of the memory.)
	generating, in the computing device, a result corresponding to the output of the second portion of the artificial neural network. ([Col. 1 l. 46-63] "a parameter client in each slave node obtains a training result by executing a training task corresponding to a sub-model stored on a parameter server in the slave node.").
	However, Zhang does not explicitly teach that the model is an artificial neural network, or that the remote device and the computing device are connected via a wired or wireless network connection.

Burger, in the same field of endeavor, teaches that the machine learning model is specifically directed towards artificial neural networks and further teaches the remote device and the computing device are connected via a wired or wireless network connection; ([¶0031] "The client device 120 can include a server interface 150 which can be used to communicate with the server computer 110 using the API or other communication protocol. The network 140 can include a local area network (LAN), a Wide Area Network (WAN), the Internet, an intranet, a wired network, a wireless network, a cellular network, combinations thereof, or any network suitable for providing a channel for communication between the server computer 110 and the client device 120."). 

	Zhang and Burger are both directed towards distributed machine learning systems.  Therefore, Zhang and Burger are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang with the teachings of Burger by using an artificial neural network as the specific machine learning model in the system described by Zhang, and further ensuring that the connections between nodes as described by Zhang were wired or wireless. It would have been obvious to one of ordinary skill in the art that a connection would necessarily have to be either wired or wireless, which is reinforced by Burger.  It would also have been obvious to one of ordinary skill in the art that artificial neural networks are a well-known and commonly used form of machine learning, and it would be obvious to use artificial neural networks in a distributed machine learning system, which is also reinforced by Burger.  Burger provides as further motivation for combination ([¶0024] “This approach can potentially: improve an overall accuracy of the deployed DNN model; reduce a communication workload between edge devices and the server computer; reduce a cost of data labeling by reducing or minimizing redundancy and/or repetition in the training data; reduce a DNN retraining cost by only processing more informative samples; and recycle unlabeled data collected on the edge devices by looking for informative samples.”).  This motivation for combination also applies to the remaining claims which are dependent on this combination.  

	Regarding claim 2, the combination of Zhang and Burger teaches The method of claim 1, wherein the accessing of the portion of the memory of the remote device includes accessing the second portion of the artificial neural network stored in the portion of the memory of the remote device; and the method further comprises: (Zhang [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training. The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system.")
	applying, by the computing device, the input to the second portion of the artificial neural network to generate the output of the second portion of the artificial neural network. (Zhang [Col. 1 l. 46-63] " The method includes: receiving, by a parameter server in a first slave node, a training result sent by a parameter client in at least one slave node in the distributed system; and updating, by the parameter server in the first slave node based on the received training result, a sub-model stored on the parameter server in the first slave node." Parameter server interpreted as an application in the computing device. Training result interpreted as synonymous with output of a first portion of the ANN.  Updating based on received results interpreted as synonymous with applying the input to the second portion.). 

	Regarding claim 4, the combination of Zhang, and Burger teaches The method of claim 1, wherein the accessing of the portion of the memory of the remote device includes accessing an alternative module as a substitute of the second portion of the artificial neural network; and the result is generated by applying the input to the alternative module. (Burger [¶0049] "A network accelerator (such as the ML accelerators 424 and 524 in FIGS. 4 and 5, respectively) can be used to accelerate the computations of the DNN 200. As one example, the DNN 200 can be partitioned into different subgraphs that can be individually accelerated. As a specific example, each of the layers 210, 220, 230, and 240 can be a subgraph that is accelerated. The computationally expensive calculations of the layer can be performed using quantized floating-point and the less expensive calculations of the layer can be performed using normal-precision floating-point." Quantized accelerator of Burger interpreted as synonymous with alternative module as opposed to normal-precision second portion.). 

	Regarding claim 5, the combination of Zhang and Burger teaches The method of claim 4, wherein the alternative module is configured to receive a user input in assisting computation of the result. (Burger [¶0080] "The user may indicate that the classification is incorrect by responding using the input device 502 (such as by correcting the classification using a keyboard or touchscreen). The quality analyzer 535 can mark the input data as data that was misclassified and the misclassified input data can be uploaded (with or without a correct label) using the upload logic 540 and server interface 542 to the server computer 110. The uploaded input data can be used to incrementally train the machine learning tool so that the model parameters 534 can be adjusted based on the new training data." Alternative module interpreted as synonymous with machine learning tool.). 

	Regarding claim 6, the combination of Zhang and Burger teaches The method of claim 4, wherein the alternative module includes a simplified model of the second portion of the artificial neural network. (Burger [¶0049] "A network accelerator (such as the ML accelerators 424 and 524 in FIGS. 4 and 5, respectively) can be used to accelerate the computations of the DNN 200. As one example, the DNN 200 can be partitioned into different subgraphs that can be individually accelerated. As a specific example, each of the layers 210, 220, 230, and 240 can be a subgraph that is accelerated. The computationally expensive calculations of the layer can be performed using quantized floating-point and the less expensive calculations of the layer can be performed using normal-precision floating-point." Quantized model interpreted as synonymous with simplified model of the second portion of the ANN.). 

	Regarding claim 8, the combination of Zhang, and Burger teaches The method of claim 1, further comprising: requesting, by the computing device, the remote device to apply the input to the second portion of the artificial neural network; (Zhang [Col. 3 l. 55-67] "the method may further include: receiving, by the master node by using the task management process, a request message sent by a first atomic task, where the request message is used to request to execute the first atomic task; and assigning, by the master node by using the task management process, first execution time to the first atomic task, and assigning, by using the task management process, the first atomic task to a first slave node for processing, where the first atomic task is any one of the atomic task used to train the model, and the first execution time is pre-estimated by the master node based on execution time of the first atomic task.")
	wherein accessing, by the computing device, at least a portion of the memory of the remote device includes accessing the output of the second portion of the artificial neural network generated on the remote device. (Zhang [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training. The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system."). 

	Regarding claim 10, claim 10 is substantially similar to claim 1.  Therefore, the rejection applied to claim 1 also applies to claim 10.

	Regarding claim 11, the combination of Zhang, and Burger teaches The method of claim 10, wherein the remote device is configured to access the second portion of the artificial neural network stored in the memory of the computing device in applying the input to the second portion of the artificial neural network. (Zhang [Col. 1 l. 46-63] " The method includes: receiving, by a parameter server in a first slave node, a training result sent by a parameter client in at least one slave node in the distributed system; and updating, by the parameter server in the first slave node based on the received training result, a sub-model stored on the parameter server in the first slave node." Parameter server interpreted as an application in the computing device. Training result interpreted as synonymous with output of a first portion of the ANN. Updating based on received results interpreted as synonymous with applying the input to the second portion.). 

	Regarding claim 16, the combination of Zhang, and Burger teaches The method of claim 10, further comprising: receiving, in the computing device, a request from the remote device to apply the input to the second portion of the artificial neural network; (Zhang [Col. 3 l. 55-67] "the method may further include: receiving, by the master node by using the task management process, a request message sent by a first atomic task, where the request message is used to request to execute the first atomic task; and assigning, by the master node by using the task management process, first execution time to the first atomic task, and assigning, by using the task management process, the first atomic task to a first slave node for processing, where the first atomic task is any one of the atomic task used to train the model, and the first execution time is pre-estimated by the master node based on execution time of the first atomic task.")
	performing, by the computing device, computation of the second portion of the artificial neural network to generate the output of the second portion of the artificial neural network; and providing the output of the second portion of the artificial neural network to the remote device. (Zhang [Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." [Col. 11 l. 50-59] "If the trained master model 1 needs to be sent to a user, the first parameter server and the second parameter server may respectively send the sub-model 1 and the sub-model 2 to a master node 101, and the master node 101 integrates the sub-model 1 and the sub-model 2, and sends a model obtained through integration to the user." See FIG. 1A.  Slave node 2 interpreted as remote device containing a second portion (sub-model) of the artificial neural network in local memory.). 

	Regarding claim 17, the combination of Zhang, and Burger teaches The method of claim 10, further comprising: updating the second portion of the artificial neural network using a machine learning technique in a time period during which the first portion of the artificial neural network is being used in the application in the remote device. (Zhang [Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." [Col. 11 l. 50-59] "If the trained master model 1 needs to be sent to a user, the first parameter server and the second parameter server may respectively send the sub-model 1 and the sub-model 2 to a master node 101, and the master node 101 integrates the sub-model 1 and the sub-model 2, and sends a model obtained through integration to the user." See FIG. 1A.  Slave node 2 interpreted as remote device containing a second portion (sub-model) of the artificial neural network in local memory.). 

	Regarding claim 18, Zhang teaches a computing device having local memory and a communication device; and ([Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" See FIG. 1A slave node 1 interpreted as computing device, slave node 2 interpreted as remote device.  See also FIG. 4 where computing device has local memory and I/O device (communication device).)
	a remote device having local memory and a communication device; ([Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" See FIG. 1A slave node 1 interpreted as computing device, slave node 2 interpreted as remote device.  See also FIG. 4 where computing device has local memory and I/O device (communication device).)
	wherein the remote device is configured to provide, to the computing device, access to the local memory of the remote device; ([Col. 1 l. 46-63] "The method includes: receiving, by a parameter server in a first slave node, a training result sent by a parameter client in at least one slave node in the distributed system; and updating, by the parameter server in the first slave node based on the received training result, a sub-model stored on the parameter server in the first slave node." [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system.")
	wherein the computing device and the remote device are configured to collaborate in storing the artificial neural network and in processing based on the artificial neural network. ([Col. 1 l. 46-63] "The method includes: receiving, by a parameter server in a first slave node, a training result sent by a parameter client in at least one slave node in the distributed system; and updating, by the parameter server in the first slave node based on the received training result, a sub-model stored on the parameter server in the first slave node." [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system.").
	However, Zhang does not explicitly teach A computing system having an artificial neural network, the system comprising: 
	wherein the communication device of the computing device and the communication device of the remote device are configured to be connected via a wired or wireless network;  

Burger, in the same field of endeavor, teaches A computing system having an artificial neural network, the system comprising: ([Abstract] "Technology related to incremental training of machine learning tools is disclosed. In one example of the disclosed technology, a method can include receiving operational parameters of a machine learning tool based on a primary set of training data. The machine learning tool can be a deep neural network")
	wherein the communication device of the computing device and the communication device of the remote device are configured to be connected via a wired or wireless network; ([¶0031] "The client device 120 can include a server interface 150 which can be used to communicate with the server computer 110 using the API or other communication protocol. The network 140 can include a local area network (LAN), a Wide Area Network (WAN), the Internet, an intranet, a wired network, a wireless network, a cellular network, combinations thereof, or any network suitable for providing a channel for communication between the server computer 110 and the client device 120."). 

	Zhang and Burger are both directed towards distributed machine learning systems.  Therefore, Zhang and Burger are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang with the teachings of Burger by using an artificial neural network as the specific machine learning model in the system described by Zhang, and further ensuring that the connections between nodes as described by Zhang were wired or wireless. It would have been obvious to one of ordinary skill in the art that a connection would necessarily have to be either wired or wireless, which is reinforced by Burger.  It would also have been obvious to one of ordinary skill in the art that artificial neural networks are a well-known and commonly used form of machine learning, and it would be obvious to use artificial neural networks in a distributed machine learning system, which is also reinforced by Burger.  Burger provides as further motivation for combination ([¶0024] “This approach can potentially: improve an overall accuracy of the deployed DNN model; reduce a communication workload between edge devices and the server computer; reduce a cost of data labeling by reducing or minimizing redundancy and/or repetition in the training data; reduce a DNN retraining cost by only processing more informative samples; and recycle unlabeled data collected on the edge devices by looking for informative samples.”).  This motivation for combination also applies to the remaining claims which are dependent on this combination.  

	Regarding claim 19, the combination of Zhang and Burger teaches The computing system of claim 18, wherein the artificial neural network is partitioned into a first portion and a second portion; (Zhang [Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" Splitting the model interpreted as synonymous with partitioning into a first and second portion.)
	the computing device is configured to store the first portion of the artificial neural network and process input to the artificial neural network using the first portion of the artificial neural network; (Zhang [Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." See FIG. 1A.  Slave node 1 interpreted as first slave node containing a first portion (sub-model) of the artificial neural network in local memory.)
	and the remote device is configured to store the second portion of the artificial neural network and process output from the first portion of the artificial neural network using the second portion of the artificial neural network. (Zhang [Col. 1 l. 46-63] "The distributed system includes a master node and a plurality of slave nodes, each slave node includes at least one parameter server and at least one parameter client, each parameter server stores a sub-model corresponding to the model, each parameter server stores a different sub-model, and the sub-model corresponding to the model is obtained by splitting the model" [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training.  The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system." [Col. 11 l. 50-59] "If the trained master model 1 needs to be sent to a user, the first parameter server and the second parameter server may respectively send the sub-model 1 and the sub-model 2 to a master node 101, and the master node 101 integrates the sub-model 1 and the sub-model 2, and sends a model obtained through integration to the user." See FIG. 1A.  Slave node 2 interpreted as remote device containing a second portion (sub-model) of the artificial neural network in local memory.). 

	Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Zhang and Burger and in further view of Anh (US20180349313A1).

	Regarding claim 3, the combination of Zhang and Burger teaches The method of claim 2, further comprising: storing, by the computing device, the first portion of the artificial neural network in the memory of the remote device; (Zhang [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training. The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system.")
	receiving, in the computing device from the remote device, the second portion of the artificial neural network; and (Zhang [Col. 8 l.57-65] "The memory 103 may be configured to store data required by the slave node 102 to perform training. The slave node 102 may directly obtain data from the corresponding memory 103 to perform training, and the data in the memory 103 may come from a file system.")
	However, the combination of Zhang and Burger does not explicitly teach mapping, in the computing device, a first virtual address region for accessing the first portion of the artificial neural network to the remote device; 
	mapping, in the computing device, a second virtual address region for accessing the first portion of the artificial neural network to the local memory of the computing device.  

Anh, in the same field of endeavor, teaches mapping, in the computing device, a first virtual address region for accessing the first portion of the artificial neural network to the remote device; ([¶0087] "As shown in FIG. 5, a master process 510 functions to create remote shared memory for a master parameter. Because it creates remote shared memory in a parameter server 530, the master process 510 may access all of the remote shared memory areas created by itself, and may enable worker processes 520 to access the master area by sending shared memory creation information thereto." [¶0088] "Meanwhile, each of the worker processes 520 may create a worker gradient parameter area for storing the result of training performed by itself, and may access the worker gradient parameter area created by itself. That is, the worker process 520 is not allowed to access the memory area of another worker process, but is allowed to access the master parameter area and a worker parameter area for storing the result of training performed by the corresponding worker process 520. For example, the X-th worker process 520_X may access the master parameter area and the X-th worker parameter area.").
	mapping, in the computing device, a second virtual address region for accessing the first portion of the artificial neural network to the local memory of the computing device. ([¶0087] "As shown in FIG. 5, a master process 510 functions to create remote shared memory for a master parameter. Because it creates remote shared memory in a parameter server 530, the master process 510 may access all of the remote shared memory areas created by itself, and may enable worker processes 520 to access the master area by sending shared memory creation information thereto." [¶0088] "Meanwhile, each of the worker processes 520 may create a worker gradient parameter area for storing the result of training performed by itself, and may access the worker gradient parameter area created by itself. That is, the worker process 520 is not allowed to access the memory area of another worker process, but is allowed to access the master parameter area and a worker parameter area for storing the result of training performed by the corresponding worker process 520. For example, the X-th worker process 520_X may access the master parameter area and the X-th worker parameter area."). 

	Zhang, Burger, and Anh are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Anh are all analogous art in the same field of endeavor. It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Anh by mapping the portions of the neural network to virtual memory addresses. Anh provides as motivation for combination ([¶0014] “A further object of the present invention is to improve communication performance, compared to the method in which the parameters are interchanged through a communication method using message transmission, and to maximize the utilization of computation resources, which are idle while parameters are being sent and received.”).   

Claims 7, 9, 12-15, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Zhang and Burger and in further view of Sundstrom (US20200401944A1).

	Regarding claim 7, the combination of Zhang and Burger teaches The method of claim 4.
	However, the combination of Zhang and Burger does not explicitly teach further comprising: receiving the alternative module in response to a prediction of degradation in the wired or wireless network connection.  

Sundstrom, in the same field of endeavor, teaches The method of claim 4, further comprising: receiving the alternative module in response to a prediction of degradation in the wired or wireless network connection. ([¶0069] "A model running on the cloud server 130 may be configured to make a final decision upon escalation. A control function 410 is connected to each distributed node system...a decision whether to escalate or not is determined by a configuration at each level, as provided by the control function... If the ML control function e.g. determines that too much LTE bandwidth is being used, the control function may adjust an escalation threshold value in the gateway node 110 to reduce bandwidth utilization" Determining too much LTE bandwidth is being used is interpreted as synonymous with predicting degradation in the wireless network connection.  Escalating the network task interpreted as synonymous with receiving the alternative module.). 

	Zhang, Burger, and Sundstrom are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Sundstrom are all analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Sundstrom by detecting network connectivity degradation and basing neural network computation decisions on the result of the detected degradation. Sundstrom provides as an additional motivation for combination ([¶0070] “the system, node and method as proposed herein will improve upon a state of the system by utilizing an overall cost function optimized in a control function, which takes input from all nodes of the system. This provides a benefit over the state of the art procedure in which decisions and threshold setting are done in a pure hierarchical manner between nearest nodes”).  This motivation for combination also applies to the remaining claims which depend on this combination.

	Regarding claim 9, the combination of Zhang and Burger teaches The method of claim 1.
	However, the combination of Zhang and Burger does not explicitly teach detecting degradation in the wired or wireless network connection; and 
	in response to the degradation, skipping in the application computation involving the second portion of the artificial neural network.  

Sundstrom, in the same field of endeavor, teaches The method of claim 1, further comprising: detecting degradation in the wired or wireless network connection; and ([¶0069] "A model running on the cloud server 130 may be configured to make a final decision upon escalation. A control function 410 is connected to each distributed node system...a decision whether to escalate or not is determined by a configuration at each level, as provided by the control function... If the ML control function e.g. determines that too much LTE bandwidth is being used, the control function may adjust an escalation threshold value in the gateway node 110 to reduce bandwidth utilization").
	in response to the degradation, skipping in the application computation involving the second portion of the artificial neural network. ([¶0006] "FIG. 1 illustrates such a concept for enhancing computation resources, where each box indicates a compute node. The system allows for a node to carry out a compute task, or to escalate the task to a hierarchically higher node." Choosing to escalate a task on a higher node instead of performing it interpreted as synonymous with skipping in the application computation involving the second portion of the ANN.). 

	Zhang, Burger, and Sundstrom are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Sundstrom are all analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Sundstrom by detecting network connectivity degradation and basing neural network computation decisions on the result of the detected degradation. Sundstrom provides as an additional motivation for combination ([¶0070] “the system, node and method as proposed herein will improve upon a state of the system by utilizing an overall cost function optimized in a control function, which takes input from all nodes of the system. This provides a benefit over the state of the art procedure in which decisions and threshold setting are done in a pure hierarchical manner between nearest nodes”).  This motivation for combination also applies to the remaining claims which depend on this combination.

	Regarding claim 12, the combination of Zhang and Burger teaches The method of claim 10, further comprising: transmitting to the remote device an alternative module as a substitute of the second portion of the artificial neural network, (Burger [¶0049] "A network accelerator (such as the ML accelerators 424 and 524 in FIGS. 4 and 5, respectively) can be used to accelerate the computations of the DNN 200. As one example, the DNN 200 can be partitioned into different subgraphs that can be individually accelerated. As a specific example, each of the layers 210, 220, 230, and 240 can be a subgraph that is accelerated. The computationally expensive calculations of the layer can be performed using quantized floating-point and the less expensive calculations of the layer can be performed using normal-precision floating-point. " ML accelerator 524 in Burger is part of client (remote) device 500.  Quantized floating-point precision version of model interpreted as alternative module.). 
	However, the combination of Zhang and Burger does not explicitly teach the remote device is configured to replace processing by the second portion of the artificial neural network with processing by the alternative module when the connection degrades.  

Sundstrom, in the same field of endeavor, teaches wherein the remote device is configured to replace processing by the second portion of the artificial neural network with processing by the alternative module when the connection degrades. ([¶0069] "A model running on the cloud server 130 may be configured to make a final decision upon escalation. A control function 410 is connected to each distributed node system...a decision whether to escalate or not is determined by a configuration at each level, as provided by the control function... If the ML control function e.g. determines that too much LTE bandwidth is being used, the control function may adjust an escalation threshold value in the gateway node 110 to reduce bandwidth utilization" [¶0006] "FIG. 1 illustrates such a concept for enhancing computation resources, where each box indicates a compute node. The system allows for a node to carry out a compute task, or to escalate the task to a hierarchically higher node." Determining too much LTE bandwidth is being used is interpreted as synonymous with predicting degradation in the wireless network connection.  Escalating the network task interpreted as synonymous with receiving the alternative module. Choosing to escalate a task on a higher node instead of performing it interpreted as synonymous with replacing processing by the second portion of the ANN with the alternative module.). 

	Zhang, Burger, and Sundstrom are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Sundstrom are all analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Sundstrom by detecting network connectivity degradation and basing neural network computation decisions on the result of the detected degradation. Sundstrom provides as an additional motivation for combination ([¶0070] “the system, node and method as proposed herein will improve upon a state of the system by utilizing an overall cost function optimized in a control function, which takes input from all nodes of the system. This provides a benefit over the state of the art procedure in which decisions and threshold setting are done in a pure hierarchical manner between nearest nodes”).  This motivation for combination also applies to the remaining claims which depend on this combination.

	Regarding claim 13, the combination of Zhang, Burger, and Sundstrom teaches The method of claim 12, wherein the alternative module includes a simplified model of the second portion of the artificial neural network. (Burger [¶0049] "A network accelerator (such as the ML accelerators 424 and 524 in FIGS. 4 and 5, respectively) can be used to accelerate the computations of the DNN 200. As one example, the DNN 200 can be partitioned into different subgraphs that can be individually accelerated. As a specific example, each of the layers 210, 220, 230, and 240 can be a subgraph that is accelerated. The computationally expensive calculations of the layer can be performed using quantized floating-point and the less expensive calculations of the layer can be performed using normal-precision floating-point." A quantized model is interpreted as synonymous with a simplified model of the second portion of the ANN.). 

	Regarding claim 14, the combination of Zhang, Burger, and Sundstrom teaches The method of claim 13, further comprising: generating the simplified model of the second portion of the artificial neural network based on a usage pattern of the artificial neural network in the application. (Burger [¶0049] "A network accelerator (such as the ML accelerators 424 and 524 in FIGS. 4 and 5, respectively) can be used to accelerate the computations of the DNN 200. As one example, the DNN 200 can be partitioned into different subgraphs that can be individually accelerated. As a specific example, each of the layers 210, 220, 230, and 240 can be a subgraph that is accelerated. The computationally expensive calculations of the layer can be performed using quantized floating-point and the less expensive calculations of the layer can be performed using normal-precision floating-point." Quantizing based on computational complexity is interpreted as synonymous with generating a simplified model based on a usage pattern of the ANN in the application.). 

	Regarding claim 15, the combination of Zhang, Burger, and Sundstrom teaches The method of claim 14, further comprising: predicting the usage pattern for a time period during which the connection is predicted to degrade. (Sundstrom [¶0069] "If the ML control function e.g. determines that too much LTE bandwidth is being used, the control function may adjust an escalation threshold value in the gateway node 110 to reduce bandwidth utilization."). 

	Regarding claim 20, the combination of Zhang and Burger teaches The computing system of claim 18.
	However, the combination of Zhang and Burger does not explicitly teach an alternative module is configured as a substitute of the second portion of the artificial neural network when a connection between the computing device and the remote device degrades.  

Sundstrom, in the same field of endeavor, teaches an alternative module is configured as a substitute of the second portion of the artificial neural network when a connection between the computing device and the remote device degrades. ([¶0069] "A model running on the cloud server 130 may be configured to make a final decision upon escalation. A control function 410 is connected to each distributed node system...a decision whether to escalate or not is determined by a configuration at each level, as provided by the control function... If the ML control function e.g. determines that too much LTE bandwidth is being used, the control function may adjust an escalation threshold value in the gateway node 110 to reduce bandwidth utilization" [¶0006] "FIG. 1 illustrates such a concept for enhancing computation resources, where each box indicates a compute node. The system allows for a node to carry out a compute task, or to escalate the task to a hierarchically higher node." Determining that too much LTE bandwidth is being used is interpreted as synonymous with predicting degradation in the wireless network connection.  Escalating the network task is interpreted as synonymous with receiving the alternative module. Choosing to escalate a task to a higher node instead of performing it is interpreted as synonymous with replacing processing by the second portion of the ANN with the alternative module.). 

Zhang, Burger, and Sundstrom are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Sundstrom are all analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Sundstrom by detecting network connectivity degradation and basing neural network computation decisions on the result of the detected degradation. Sundstrom provides as an additional motivation for combination ([¶0070] “the system, node and method as proposed herein will improve upon a state of the system by utilizing an overall cost function optimized in a control function, which takes input from all nodes of the system. This provides a benefit over the state of the art procedure in which decisions and threshold setting are done in a pure hierarchical manner between nearest nodes”).  This motivation for combination also applies to the remaining claims which depend on this combination.

	Regarding claim 21, the combination of Zhang and Burger teaches The computing system of claim 18.
	However, the combination of Zhang and Burger does not explicitly teach the communication device of the computing device and the communication device of the remote device are connected via a fifth generation cellular network.  

Sundstrom, in the same field of endeavor, teaches the communication device of the computing device and the communication device of the remote device are connected via a fifth generation cellular network. ([¶0055] "The network interfaces 306, 307 may also be different, configured to use different bearers of different communication technologies, such as ZigBee, BLE (Bluetooth Low Energy), WiFi, D2D LTE under 3GPP specifications, 3GPP LTE, MTC, NB-IoT, 5G New Radio (NR), and wired connection technologies."). 

Zhang, Burger, and Sundstrom are all directed towards distributed machine learning systems.  Therefore, Zhang, Burger, and Sundstrom are all analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Zhang and Burger with the teachings of Sundstrom by detecting network connectivity degradation and basing neural network computation decisions on the result of the detected degradation. Sundstrom provides as an additional motivation for combination ([¶0070] “the system, node and method as proposed herein will improve upon a state of the system by utilizing an overall cost function optimized in a control function, which takes input from all nodes of the system. This provides a benefit over the state of the art procedure in which decisions and threshold setting are done in a pure hierarchical manner between nearest nodes”).  This motivation for combination also applies to the remaining claims which depend on this combination.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Chilimbi (US20150324690A1) and Byers (US20200219007A1) are both directed towards distributed neural network systems.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/SB/Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124