DETAILED ACTION
This action is in response to application 16/051,792, which claims priority to provisional application 62/539,613, filed 08/01/2017. Claims 1-6, 9-22, and 25-28 are pending and have been considered.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 10/24/2018, 12/23/2019, 09/05/2020, and 09/09/2020 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 9-22, and 25-28 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, 
Step 1 Analysis: Claim 1 is directed to a process, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 1 recites, in part, applying the first data group to a first localized machine learning construct, determining a first set of convolutional layers within the first localized machine learning construct…, adjudicating similarity between the first localized machine learning construct and a second localized machine learning construct, and analyzing a second data group by the second localized machine learning construct. These limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance in the mind or with pen and paper, and therefore fall within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “first localized machine learning construct”, “second localized machine learning construct”, “first set of convolutional layers”, and “first data flow graph machine”. These elements are only generally linked to the judicial exception. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim further recites: obtaining a first data group in a first locality and sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold. These limitations are considered insignificant extra-solution activity, and thus the judicial exception is not integrated into a practical application. The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a first localized machine learning construct, second localized machine learning construct, first set of convolutional layers, and first data flow graph machine to perform the steps of the claimed process amount to no more than generally linking the elements to the judicial exception. Additionally, the limitation of obtaining a first data group in a first locality is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(iv), “Storing and retrieving information in memory”. Furthermore, the limitation of sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold, is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(i), “transmitting data over a network”. These limitations therefore remain insignificant extra-solution activity even upon reconsideration, and do not amount to significantly more. Even when considered in combination, these additional elements amount to generally linking the elements to the judicial exception and insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 2, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the similarity is adjudicated based on machine learning construct context for the first localized machine learning construct and the second localized machine learning construct. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional element of “machine learning construct context”; however, this element neither integrates the judicial exception into a practical application nor amounts to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 3, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the threshold is updated based on the analyzing a second group of data by the second localized machine learning construct. This limitation recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 4, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the first localized machine learning construct comprises a first retail establishment. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 5, the rejection of claim 4 is further incorporated, and further, the claim recites: wherein the second localized machine learning construct comprises a second retail establishment. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 6, the rejection of claim 4 is further incorporated, and further, the claim recites: wherein the analyzing comprises determining a sales recommendation for a retail establishment associated with the second localized machine learning construct. This limitation recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 9, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the first localized machine learning construct comprises a first vehicle. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 10, the rejection of claim 9 is further incorporated, and further, the claim recites: wherein the second localized machine learning construct comprises a second vehicle. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 11, the rejection of claim 10 is further incorporated, and further, the claim recites: further comprising transferring descriptors for the first set of convolutional layers using a mesh network comprising the first vehicle and the second vehicle. This limitation is an insignificant extra-solution activity and thus the judicial exception is not integrated into a practical application. The claim as a whole is directed to an abstract idea.
The claim does not include any additional elements that amount to significantly more than the judicial exception. This limitation is just a nominal or tangential addition to the claim, and is also well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(i), “transmitting data over a network”. This limitation therefore remains insignificant extra-solution activity even upon reconsideration, and does not amount to significantly more. Even when considered in combination, this additional element represents insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 12, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the second localized machine learning construct comprises a second data flow graph machine. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 13, the rejection of claim 12 is further incorporated, and further, the claim recites: further comprising augmenting learning from the first localized machine learning construct by the second localized machine learning construct. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 14, the rejection of claim 13 is further incorporated, and further, the claim recites: wherein the augmenting learning is accomplished using a second group of data obtained within the second localized machine learning construct. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 15, the rejection of claim 13 is further incorporated, and further, the claim recites: further comprising sending results of the augmenting learning to a third machine learning construct. This limitation is an insignificant extra-solution activity and thus the judicial exception is not integrated into a practical application. The claim as a whole is directed to an abstract idea.
The claim does not include any additional elements that amount to significantly more than the judicial exception. This limitation is just a nominal or tangential addition to the claim, and is also well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(i), “transmitting data over a network”. This limitation therefore remains insignificant extra-solution activity even upon reconsideration, and does not amount to significantly more. Even when considered in combination, this additional element represents insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 16, the rejection of claim 15 is further incorporated, and further, the claim recites: further comprising analyzing a third data group by the third machine learning construct using the results of the augmenting learning. This limitation recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 17, the rejection of claim 12 is further incorporated, and further, the claim recites: wherein the first localized machine learning construct comprises a convolutional neural net. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 18, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the determining the first set of convolutional layers comprises machine learning. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 19, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the determining further comprises determining a first set of max pooling layers. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 20, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the determining further comprises determining a first set of hidden layers. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 21, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the determining further comprises determining a first set of weights. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 22, the rejection of claim 21 is further incorporated, and further, the claim recites: wherein the determining the first set of weights is accomplished using forward propagation and backward propagation. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 25, the rejection of claim 1 is further incorporated, and further, the claim recites: further comprising applying a fourth data group to the second localized machine learning construct. This limitation recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 26, the rejection of claim 25 is further incorporated, and further, the claim recites: further comprising determining a second set of convolutional layers on the second localized machine learning construct using the fourth data group. This limitation recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 27, 
Step 1 Analysis: Claim 27 is directed to a computer program product embodied in a non-transitory computer readable medium (a manufacture), which falls within one of the four statutory categories.
Step 2A Prong 1 Analysis: Claim 27 recites, in part, applying the first data group to a first localized machine learning construct, determining a first set of convolutional layers within the first localized machine learning construct…, adjudicating similarity between the first localized machine learning construct and a second localized machine learning construct, and analyzing a second data group by the second localized machine learning construct. These limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance in the mind or with pen and paper, and therefore fall within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “first localized machine learning construct”, “second localized machine learning construct”, “first set of convolutional layers”, and “first data flow graph machine”. These elements are only generally linked to the judicial exception. Additionally, the claim recites “a computer program product”, “a non-transitory computer readable medium”, and “one or more processors”. These elements are recited at a high level of generality (i.e., as generic computer components performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim further recites: obtaining a first data group in a first locality and sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold. These limitations are considered insignificant extra-solution activity, and thus the judicial exception is not integrated into a practical application. The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a first localized machine learning construct, second localized machine learning construct, first set of convolutional layers, and first data flow graph machine to perform the claimed steps amount to no more than generally linking the elements to the judicial exception. Furthermore, the computer program product, non-transitory computer readable medium, and one or more processors amount to no more than mere instructions to apply the exception using a generic computer component. Additionally, the limitation of obtaining a first data group in a first locality is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(iv), “Storing and retrieving information in memory”. Furthermore, the limitation of sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold, is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(i), “transmitting data over a network”. These limitations therefore remain insignificant extra-solution activity even upon reconsideration, and do not amount to significantly more. Even when considered in combination, these additional elements amount to generally linking the elements to the judicial exception, mere instructions to apply the exception using a generic computer component, and insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 28, 
Step 1 Analysis: Claim 28 is directed to a computer system (a machine), which falls within one of the four statutory categories.
Step 2A Prong 1 Analysis: Claim 28 recites, in part, applying the first data group to a first localized machine learning construct, determining a first set of convolutional layers within the first localized machine learning construct…, adjudicating similarity between the first localized machine learning construct and a second localized machine learning construct, and analyzing a second data group by the second localized machine learning construct. These limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance in the mind or with pen and paper, and therefore fall within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “first localized machine learning construct”, “second localized machine learning construct”, “first set of convolutional layers”, and “first data flow graph machine”. These elements are only generally linked to the judicial exception. Additionally, the claim recites “a computer system”, “a memory”, and “one or more processors”. These elements are recited at a high level of generality (i.e., as generic computer components performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim further recites: obtaining a first data group in a first locality and sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold. These limitations are considered insignificant extra-solution activity, and thus the judicial exception is not integrated into a practical application. The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a first localized machine learning construct, second localized machine learning construct, first set of convolutional layers, and first data flow graph machine to perform the claimed steps amount to no more than generally linking the elements to the judicial exception. Furthermore, the computer system, memory, and one or more processors amount to no more than mere instructions to apply the exception using a generic computer component. Additionally, the limitation of obtaining a first data group in a first locality is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(iv), “Storing and retrieving information in memory”. Furthermore, the limitation of sending the first set of convolutional layers to the second localized machine learning construct, based on the similarity that was adjudicated meeting a threshold, is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(i), “transmitting data over a network”. These limitations therefore remain insignificant extra-solution activity even upon reconsideration, and do not amount to significantly more. Even when considered in combination, these additional elements amount to generally linking the elements to the judicial exception, mere instructions to apply the exception using a generic computer component, and insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 12-18, 20-22, and 25-28 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta et al. ("US 10338931 B2", hereinafter "Gupta") in view of Chilimbi et al. ("US 20150324690 A1" cited by Applicant in the IDS filed 10/24/2018, hereinafter "Chilimbi").


Regarding claim 1, Gupta teaches A computer-implemented method for data analysis comprising:
obtaining a first data group in a first locality (“For example, in one embodiment, processing components in a deep learning system (e.g., a machine learning system) can each receive a set of inputs and collectively generate an output based on the set of inputs.” [col 4, lines 38-41]; see further: “In yet another aspect, the assignment component 104 can form a group of processing components from the processing components 102.sub.1-N based on a location (e.g., a locality) of a particular processing component with respect to other processing components 102.sub.1-N in the system 100. A location of the processing component can be indicated, for example, based on an identity of a particular processor in which the processing component resides” [col 9, lines 58-65]);
applying the first data group to a first localized machine learning construct (“Each of processing components 102.sub.1-N (or, in some embodiments, one or more of the processing components 102.sub.1-N) can be or include a processing engine that processes data associated with a machine learning process (e.g., a deep learning process).” [col 6, lines 29-33]; see further: “A machine learning process associated with processing components 102.sub.1-N can include, but is not limited to, a deep Boltzmann machine algorithm, a deep belief network algorithm, a convolution neural network algorithm, a stacked auto-encoder algorithm, etc. In certain embodiments, one or more of the processing components 102.sub.1-N can be an artificial intelligence component.” [col 6, lines 40-46; note: Examiner is interpreting a localized machine learning construct to be equivalent to a processing component at a remote location.]);
adjudicating similarity between the first localized machine learning construct and a second localized machine learning construct (“Alternatively, the defined criterion associated with the third group can be different than (e.g., distinct from) the defined criterion associated with the first group and/or the second group. The defined criterion can be associated with the data 108, the updated output data generated by the processing component 102.sub.1, the processing component 102.sub.1 and/or the other processing component from the processing components 102.sub.1-N.” [col 11, lines 46-53]); 
sending communication data to the second localized machine learning construct (“In response to determination of a second group by the assignment component 104, the processing component 102.sub.1 and the processing component 102.sub.3 can exchange data. For example, the processing component 102.sub.1 can transmit the updated output data to the processing component 102.sub.3. Furthermore, the processing component 102.sub.3 can transmit the data (e.g., communication data) to processing component 102.sub.1.” [col 11, lines 23-30; note: Gupta discloses that the data that is transmitted can be data that is associated with a deep learning training process (col 18, lines 51-58).]), based on the similarity that was adjudicated meeting a threshold (“The assignment component 104 can repeatedly form groups within the processing components 102.sub.1-N until a defined criterion associated with the data 108 and/or the processing components 102.sub.1-N is satisfied. By way of example, but not limitation, the defined criterion can be a number of groups formed by the assignment component 104 reaching a defined value or an error value (e.g., a training error value, etc.) associated with the data 108 reaching a defined value. Accordingly, collaborative groups of processing components can be dynamically synchronized by the assignment component 104 for parallel learning during a deep learning process.” [col 10, lines 15-26]); and
analyzing a second data group by the second localized machine learning construct using the first set of convolutional layers (“A subsequent act for the deep learning process can be a subsequent processing act that is performed by a processing components 102.sub.1-N after a first process act associated with the data 108. For example, processing component 102.sub.1 can generate other output data based on the first portion of the data 108 and/or data received from at least one other processing component (e.g., during a subsequent act for the deep learning process), processing component 102.sub.2 can generate other output data based on the second portion of the data 108 and/or data received from at least one other processing component (e.g., during the subsequent act for the deep learning process), etc.” [col 11, lines 4-15; note: This process is equivalent to “analyzing a second data group”. A first set of convolutional layers is taught by Chilimbi as cited below.]).
	While Gupta teaches using a convolutional neural network and sending communication data to the second localized machine learning construct, Gupta does not describe in detail determining a first set of convolutional neural network layers or sending a first set of convolutional layers to a second localized machine learning construct.
	Chilimbi teaches determining a first set of convolutional layers within the first localized machine learning construct based on the first data group wherein the first set of convolutional layers comprises a first data flow graph machine (“Convolutional neural networks may represent a class of neural networks that are biologically inspired by early work on the visual cortex. Neurons in a layer may be connected to spatially local neurons in the next layer modeling local visual receptive fields. In addition, these connections may share weights which allows for feature detection regardless of position in the visual field. The weight sharing may also reduce the number of free parameters to be learned and consequently these models are easier to train compared to similar size networks where neurons in a layer are fully connected to every neuron in a neighboring layer… Visual tasks may leverage large scale neural networks for learning visual features. Recent work has demonstrated that DNNs comprised of convolutional layers (e.g., 5 convolutional layers) for learning visual features followed by fully connected layers (e.g., 3 fully connected layers) for combining these learned features to make a classification decision may achieve state-of-the-art performance on visual object recognition tasks.” [¶0030-¶0031; note: Examiner is interpreting a first data flow graph machine to be equivalent to a convolutional neural network.])
	sending a first set of convolutional layers to a second localized machine learning construct (“In at least one embodiment of model training, data values may be communicated across neuron layers. Since the model is partitioned across multiple machines (e.g., Machine 1, Machine 2, etc.) within each replica (e.g., 704A, 704N, etc.) some of this communication may be non-local.” [¶0059; see further ¶0064-¶0066, which disclose further details regarding sending convolutional layers to different machines.]).
	Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by substituting the communication data of Gupta with the convolutional layers taught by Chilimbi. One would have been motivated to send a first set of convolutional layers to another localized machine learning construct in order to receive and update weight values asynchronously to reduce communication traffic volume and offload additional computations. [¶0065-¶0066, Chilimbi]

Regarding claim 2, Gupta/Chilimbi teaches The method of claim 1 where Gupta teaches wherein the similarity is adjudicated based on machine learning construct context for the first localized machine learning construct and the second localized machine learning construct (“Thus, the first two processing components to finish processing their respective portions of data 108 can be grouped together in some embodiments. Additionally or alternatively, the first criterion and/or the second criterion can be associated with a processor location of the processing component 102.sub.2 with respect to the processing component 102.sub.3, associated with defined data (e.g., a defined group list) for the processing component 102.sub.2 and the processing component 102.sub.3 that is determined prior to processing of the data 108 by the processing component 102.sub.2 and the processing component 102.sub.3 or based on any number of factors.” [col 13, lines 55-65]).

Regarding claim 3, Gupta/Chilimbi teaches The method of claim 1 where Gupta teaches wherein the threshold is updated based on the analyzing a second group of data by the second localized machine learning construct (“Then, the assignment component 104 can assign at least the processing component 102.sub.1 and another processing component from the processing components 102.sub.1-N to a third group based on a defined criterion. The defined criterion associated with the third group can correspond to the defined criterion associated with the first group and/or the second group. Alternatively, the defined criterion associated with the third group can be different than (e.g., distinct from) the defined criterion associated with the first group and/or the second group. The defined criterion can be associated with the data 108, the updated output data generated by the processing component 102.sub.1, the processing component 102.sub.1 and/or the other processing component from the processing components 102.sub.1-N. In response to formation of the third group by the assignment component 104, the processing component 102.sub.1 and the other processing component from the processing components 102.sub.1-N can exchange data, etc.” [col 11, lines 40-56]).

Regarding claim 12, Gupta/Chilimbi teaches The method of claim 1 where Gupta teaches wherein the second localized machine learning construct comprises a second data flow graph machine (“A machine learning process associated with processing components 102.sub.1-N can include, but is not limited to, a deep Boltzmann machine algorithm, a deep belief network algorithm, a convolution neural network algorithm, a stacked auto-encoder algorithm, etc. In certain embodiments, one or more of the processing components 102.sub.1-N can be an artificial intelligence component.” [col 6, lines 40-46]).

Regarding claim 13, Gupta/Chilimbi teaches The method of claim 12 where Gupta teaches further comprising augmenting learning from the first localized machine learning construct by the second localized machine learning construct (“During a first act for a deep learning process, the processing components can generate respective output data based on processing the respective portions of the data 108 received by the processing components. After performing the first act for the deep learning process, one or more of the (or, in some embodiments, each of the) processing components 102.sub.1-N can store respective output data in one or more respective memories operatively coupled to the processing components 102.sub.1-N. In an example, the processing component 102.sub.1 can generate output data based on the first portion of the data 108 (e.g., during the first act for the deep learning process), the processing component 102.sub.2 can generate output data based on the second portion of the data 108 (e.g., during the first act for the deep learning process), etc.” [col 8, lines 55-67; Examiner is interpreting augmenting learning as equivalent to a deep learning process.]).

Regarding claim 14, Gupta/Chilimbi teaches The method of claim 13 where Gupta teaches wherein the augmenting learning is accomplished using a second group of data obtained within the second localized machine learning construct (“During a first act for a deep learning process, the processing components can generate respective output data based on processing the respective portions of the data 108 received by the processing components. After performing the first act for the deep learning process, one or more of the (or, in some embodiments, each of the) processing components 102.sub.1-N can store respective output data in one or more respective memories operatively coupled to the processing components 102.sub.1-N. In an example, the processing component 102.sub.1 can generate output data based on the first portion of the data 108 (e.g., during the first act for the deep learning process), the processing component 102.sub.2 can generate output data based on the second portion of the data 108 (e.g., during the first act for the deep learning process), etc.” [col 8, lines 55-67]).

Regarding claim 15, Gupta/Chilimbi teaches The method of claim 13 where Gupta teaches further comprising sending results of the augmenting learning to a third machine learning construct (“One or more of the processing components 102.sub.1-5 can receive portions of the data 108 (e.g., the portions of the data 108 can be the same or different in size and/or content) via the network 106 during a deep learning process associated with the data 108. For example, the processing component 102.sub.1 can receive a first portion of the data 108, the processing component 102.sub.2 can receive a second portion of the data 108, the processing component 102.sub.3 can receive a third portion of the data 108” [col 13, lines 5-13]).

Regarding claim 16, Gupta/Chilimbi teaches The method of claim 15 where Gupta teaches further comprising analyzing a third data group by the third machine learning construct using the results of the augmenting learning (“One or more of (or, in some embodiments, each of) the processing components 102.sub.1-5 can process respective portions of the data 108 during a first act for a deep learning process associated with the data 108.” [col 13, lines 22-26]).

Regarding claim 17, Gupta/Chilimbi teaches The method of claim 12 where Gupta teaches wherein the first localized machine learning construct comprises a convolutional neural net (“A machine learning process associated with processing components 102.sub.1-N can include, but is not limited to, a deep Boltzmann machine algorithm, a deep belief network algorithm, a convolution neural network algorithm, a stacked auto-encoder algorithm, etc.” [col 6, lines 40-44]).

Regarding claim 18, Gupta/Chilimbi teaches The method of claim 1; however, while Gupta teaches machine learning, Gupta does not describe in detail determining convolutional layers.
Chilimbi teaches wherein the determining the first set of convolutional layers comprises machine learning (“Visual tasks may leverage large scale neural networks for learning visual features. Recent work has demonstrated that DNNs comprised of convolutional layers (e.g., 5 convolutional layers) for learning visual features followed by fully connected layers (e.g., 3 fully connected layers) for combining these learned features to make a classification decision may achieve state-of-the-art performance on visual object recognition tasks. The DNNs may be used to train models on tasks such as speech recognition, text processing, and/or other tasks also.” [¶0031; see further ¶0002-¶0004, which disclose machine learning]).
Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by implementing a CNN and sending a first set of CNN layers to processing components as taught by Chilimbi. One would have been motivated to make this modification in order to include computation and communications optimizations to improve system efficiency. [Abstract, Chilimbi]

Regarding claim 20, Gupta/Chilimbi teaches The method of claim 1 where Chilimbi teaches wherein the determining further comprises determining a first set of hidden layers (“As shown in FIG. 2, computing machines called neurons (e.g., v.sub.1, v.sub.2, v.sub.3, etc.) associated with the first layer 202 receive an input 204. The first layer 202 represents the input layer. Each of the individual neurons in the first layer 202 outputs a single output to each of the neurons in the second layer 206 of neurons via connections between the neurons in each layer. The second layer 206 represents a layer for learning low-level features. Accordingly, each neuron in the second layer 206 receives multiple inputs and outputs a single output to each of the neurons in the third layer 208. The third layer 208 represents a layer for learning mid-level features. A same process happens for layer 210, which represents a layer for learning high-level features, and layer 212, which represents a layer for learning desired outputs. In layer 212, the output comprises a label 214 representative of the input 204.” [¶0004; note: a first set of hidden layers is implicit in the intermediate layers 206, 208, and 210.]).
Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by implementing a CNN and sending a first set of CNN layers to processing components as taught by Chilimbi. One would have been motivated to make this modification in order to include computation and communications optimizations to improve system efficiency. [Abstract, Chilimbi]

Regarding claim 21, Gupta/Chilimbi teaches The method of claim 1 where Gupta teaches wherein the determining further comprises determining a first set of weights (“In an aspect, model weights for a deep learning system can be communicated amongst a subset of processing components (e.g., a set of parallel processing components). One or more of the processing components can amalgamate the model weights to compute a composite model weight for the deep learning system.” [col 4, lines 49-54]).

Regarding claim 22, Gupta/Chilimbi teaches The method of claim 21 where Chilimbi teaches wherein the determining the first set of weights is accomplished using forward propagation and backward propagation (“In at least one embodiment, model training on a machine (e.g., Machine 1, Machine 2, etc.) may be multi-threaded with different data items assigned to threads that share the model weights. Each thread allocates a training context for feed-forward evaluation and back propagation, as described above. This training context may store the activations and weight update values computed during back-propagation for each layer.” [¶0057; see further ¶0033-¶0036, which disclose further details of forward/back propagation.]).
Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by implementing a CNN and sending a first set of CNN layers to processing components as taught by Chilimbi. One would have been motivated to make this modification in order to include computation and communications optimizations to improve system efficiency. [Abstract, Chilimbi]

Regarding claim 25, Gupta/Chilimbi teaches The method of claim 1 where Gupta teaches further comprising applying a fourth data group to the second localized machine learning construct (“The data 108 can be raw, compressed and/or processed data and can include, but is not limited to, any number of different types of data such as audio data, video data, textual data and/or numerical data. In different embodiments, the sizes of the respective portions of data 108 received by different ones of processing components 102.sub.1-N can be the same or different from time to time.” [col 7, lines 41-46; Gupta discloses the portions of data can be the same or different which would imply a fourth data group could be received by a second machine learning construct.]).

Regarding claim 26, Gupta/Chilimbi teaches The method of claim 25 further comprising determining a second set of convolutional layers on the second localized machine learning construct using the fourth data group (“The processing components 102.sub.1-N can perform a deep learning process associated with the data 108. For example, the processing components 102.sub.1-N can collectively determine output data based on the data 108. The processing components 102.sub.1-N can, for example, collectively train a model for a deep learning process based on the data 108. The processing components 102.sub.1-N can also generate respective models (e.g., a current models) for each iteration of a deep learning process. The respective models (e.g., the current models) can represent a current state of a neural network associated with the processing components 102.sub.1-N. In another example, the processing components 102.sub.1-N can collectively determine a solution for a task associated with the data 108. In yet another example, the processing components 102.sub.1-N can collectively determine features, classifications and/or patterns associated with the data 108. In yet another example, the processing components 102.sub.1-N can collectively perform a set of processing acts and/or a deep learning process associated with the data 108.” [col 8, lines 13-31; training in iterations would imply being able to use a fourth data group with a second machine learning construct.]).
While Gupta discloses using a convolutional neural network, Gupta does not describe in detail determining convolutional neural network layers.
	Chilimbi teaches determining a second set of convolutional layers (“Convolutional neural networks may represent a class of neural networks that are biologically inspired by early work on the visual cortex. Neurons in a layer may be connected to spatially local neurons in the next layer modeling local visual receptive fields. In addition, these connections may share weights which allows for feature detection regardless of position in the visual field. The weight sharing may also reduce the number of free parameters to be learned and consequently these models are easier to train compared to similar size networks where neurons in a layer are fully connected to every neuron in a neighboring layer… Visual tasks may leverage large scale neural networks for learning visual features. Recent work has demonstrated that DNNs comprised of convolutional layers (e.g., 5 convolutional layers) for learning visual features followed by fully connected layers (e.g., 3 fully connected layers) for combining these learned features to make a classification decision may achieve state-of-the-art performance on visual object recognition tasks.” [¶0030-¶0031])
	Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by implementing a CNN and sending a first set of CNN layers to processing components as taught by Chilimbi. One would have been motivated to make this modification in order to include computation and communications optimizations to improve system efficiency. [Abstract, Chilimbi]

	Regarding claim 27, Gupta teaches A computer program product embodied in a non-transitory computer readable medium for data analysis, the computer program product comprising code which causes one or more processors to perform operations of ([col 28, line 53 - col 29, line 16 discloses computer readable medium]):
obtaining a first data group in a first locality (“For example, in one embodiment, processing components in a deep learning system (e.g., a machine learning system) can each receive a set of inputs and collectively generate an output based on the set of inputs.” [col 4, lines 38-41; See further “In yet another aspect, the assignment component 104 can form a group of processing components from the processing components 102.sub.1-N based on a location (e.g., a locality) of a particular processing component with respect to other processing components 102.sub.1-N in the system 100. A location of the processing component can be indicated, for example, based on an identity of a particular processor in which the processing component resides” [col 9, lines 58-65]]);
applying the first data group to a first localized machine learning construct (“Each of processing components 102.sub.1-N (or, in some embodiments, one or more of the processing components 102.sub.1-N) can be or include a processing engine that processes data associated with a machine learning process (e.g., a deep learning process).” [col 6, lines 29-33; See further: “A machine learning process associated with processing components 102.sub.1-N can include, but is not limited to, a deep Boltzmann machine algorithm, a deep belief network algorithm, a convolution neural network algorithm, a stacked auto-encoder algorithm, etc. In certain embodiments, one or more of the processing components 102.sub.1-N can be an artificial intelligence component.” [col 6, lines 40-46; Examiner is interpreting a localized machine learning construct to be equivalent to a processing component at a remote location.]]);
adjudicating similarity between the first localized machine learning construct and a second localized machine learning construct (“Alternatively, the defined criterion associated with the third group can be different than (e.g., distinct from) the defined criterion associated with the first group and/or the second group. The defined criterion can be associated with the data 108, the updated output data generated by the processing component 102.sub.1, the processing component 102.sub.1 and/or the other processing component from the processing components 102.sub.1-N.” [col 11, lines 46-53]); 
sending communication data to the second localized machine learning construct (“In response to determination of a second group by the assignment component 104, the processing component 102.sub.1 and the processing component 102.sub.3 can exchange data. For example, the processing component 102.sub.1 can transmit the updated output data to the processing component 102.sub.3. Furthermore, the processing component 102.sub.3 can transmit the data (e.g., communication data) to processing component 102.sub.1.” [col 11, lines 23-30; note: Gupta discloses that the data that is transmitted can be data that is associated with a deep learning training process (col 18, lines 51-58).]), based on the similarity that was adjudicated meeting a threshold (“The assignment component 104 can repeatedly form groups within the processing components 102.sub.1-N until a defined criterion associated with the data 108 and/or the processing components 102.sub.1-N is satisfied. By way of example, but not limitation, the defined criterion can be a number of groups formed by the assignment component 104 reaching a defined value or an error value (e.g., a training error value, etc.) associated with the data 108 reaching a defined value. Accordingly, collaborative groups of processing components can be dynamically synchronized by the assignment component 104 for parallel learning during a deep learning process.” [col 10, lines 15-26]); and
analyzing a second data group by the second localized machine learning construct using the first set of convolutional layers (“A subsequent act for the deep learning process can be a subsequent processing act that is performed by a processing components 102.sub.1-N after a first process act associated with the data 108. For example, processing component 102.sub.1 can generate other output data based on the first portion of the data 108 and/or data received from at least one other processing component (e.g., during a subsequent act for the deep learning process), processing component 102.sub.2 can generate other output data based on the second portion of the data 108 and/or data received from at least one other processing component (e.g., during the subsequent act for the deep learning process), etc.” [col 11, lines 4-15; note: This process is equivalent to “analyzing a second data group”. A first set of convolutional layers is taught by Chilimbi as cited below.]).
	While Gupta teaches using a convolutional neural network and sending communication data to the second localized machine learning construct, Gupta does not describe in detail determining a first set of convolutional neural network layers or sending a first set of convolutional layers to a second localized machine learning construct.
Chilimbi teaches determining a first set of convolutional layers within the first localized machine learning construct based on the first data group wherein the first set of convolutional layers comprises a first data flow graph machine (“Convolutional neural networks may represent a class of neural networks that are biologically inspired by early work on the visual cortex. Neurons in a layer may be connected to spatially local neurons in the next layer modeling local visual receptive fields. In addition, these connections may share weights which allows for feature detection regardless of position in the visual field. The weight sharing may also reduce the number of free parameters to be learned and consequently these models are easier to train compared to similar size networks where neurons in a layer are fully connected to every neuron in a neighboring layer… Visual tasks may leverage large scale neural networks for learning visual features. Recent work has demonstrated that DNNs comprised of convolutional layers (e.g., 5 convolutional layers) for learning visual features followed by fully connected layers (e.g., 3 fully connected layers) for combining these learned features to make a classification decision may achieve state-of-the-art performance on visual object recognition tasks.” [¶0030-¶0031; note: Examiner is interpreting a first data flow graph machine to be equivalent to a convolutional neural network.])
	sending a first set of convolutional layers to a second localized machine learning construct (“In at least one embodiment of model training, data values may be communicated across neuron layers. Since the model is partitioned across multiple machines (e.g., Machine 1, Machine 2, etc.) within each replica (e.g., 704A, 704N, etc.) some of this communication may be non-local.” [¶0059; see further ¶0064-¶0066, which disclose further details regarding sending convolutional layers to different machines.]).
	Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by substituting the communication data of Gupta with the convolutional layers taught by Chilimbi. One would have been motivated to send a first set of convolutional layers to another localized machine learning construct in order to receive and update weight values asynchronously to reduce communication traffic volume and offload additional computations. [¶0065-¶0066, Chilimbi]


	Regarding claim 28, Gupta teaches A computer system for data analysis comprising: a memory which stores instructions; one or more processors attached to the memory wherein the one or more processors, when executing the instructions which are stored, are configured to ([col 1, lines 30-37]):
obtain a first data group in a first locality (“For example, in one embodiment, processing components in a deep learning system (e.g., a machine learning system) can each receive a set of inputs and collectively generate an output based on the set of inputs.” [col 4, lines 38-41; See further “In yet another aspect, the assignment component 104 can form a group of processing components from the processing components 102.sub.1-N based on a location (e.g., a locality) of a particular processing component with respect to other processing components 102.sub.1-N in the system 100. A location of the processing component can be indicated, for example, based on an identity of a particular processor in which the processing component resides” [col 9, lines 58-65]]);
apply the first data group to a first localized machine learning construct (“Each of processing components 102.sub.1-N (or, in some embodiments, one or more of the processing components 102.sub.1-N) can be or include a processing engine that processes data associated with a machine learning process (e.g., a deep learning process).” [col 6, lines 29-33; See further: “A machine learning process associated with processing components 102.sub.1-N can include, but is not limited to, a deep Boltzmann machine algorithm, a deep belief network algorithm, a convolution neural network algorithm, a stacked auto-encoder algorithm, etc. In certain embodiments, one or more of the processing components 102.sub.1-N can be an artificial intelligence component.” [col 6, lines 40-46; Examiner is interpreting a localized machine learning construct to be equivalent to a processing component at a remote location.]]);
adjudicate similarity between the first localized machine learning construct and a second localized machine learning construct (“Alternatively, the defined criterion associated with the third group can be different than (e.g., distinct from) the defined criterion associated with the first group and/or the second group. The defined criterion can be associated with the data 108, the updated output data generated by the processing component 102.sub.1, the processing component 102.sub.1 and/or the other processing component from the processing components 102.sub.1-N.” [col 11, lines 46-53]); 
send communication data to the second localized machine learning construct (“In response to determination of a second group by the assignment component 104, the processing component 102.sub.1 and the processing component 102.sub.3 can exchange data. For example, the processing component 102.sub.1 can transmit the updated output data to the processing component 102.sub.3. Furthermore, the processing component 102.sub.3 can transmit the data (e.g., communication data) to processing component 102.sub.1.” [col 11, lines 23-30; note: Gupta discloses that the data that is transmitted can be data that is associated with a deep learning training process (col 18, lines 51-58).]), based on the similarity that was adjudicated meeting a threshold (“The assignment component 104 can repeatedly form groups within the processing components 102.sub.1-N until a defined criterion associated with the data 108 and/or the processing components 102.sub.1-N is satisfied. By way of example, but not limitation, the defined criterion can be a number of groups formed by the assignment component 104 reaching a defined value or an error value (e.g., a training error value, etc.) associated with the data 108 reaching a defined value. Accordingly, collaborative groups of processing components can be dynamically synchronized by the assignment component 104 for parallel learning during a deep learning process.” [col 10, lines 15-26]); and
analyze a second data group by the second localized machine learning construct using the first set of convolutional layers (“A subsequent act for the deep learning process can be a subsequent processing act that is performed by a processing components 102.sub.1-N after a first process act associated with the data 108. For example, processing component 102.sub.1 can generate other output data based on the first portion of the data 108 and/or data received from at least one other processing component (e.g., during a subsequent act for the deep learning process), processing component 102.sub.2 can generate other output data based on the second portion of the data 108 and/or data received from at least one other processing component (e.g., during the subsequent act for the deep learning process), etc.” [col 11, lines 4-15; note: This process is equivalent to “analyzing a second data group”. A first set of convolutional layers is taught by Chilimbi as cited below.]).
	While Gupta teaches using a convolutional neural network and sending communication data to the second localized machine learning construct, the reference does not detail determining a first set of convolutional layers or sending a first set of convolutional layers to a second localized machine learning construct.
	Chilimbi teaches determine a first set of convolutional layers within the first localized machine learning construct based on the first data group wherein the first set of convolutional layers comprises a first data flow graph machine (“Convolutional neural networks may represent a class of neural networks that are biologically inspired by early work on the visual cortex. Neurons in a layer may be connected to spatially local neurons in the next layer modeling local visual receptive fields. In addition, these connections may share weights which allows for feature detection regardless of position in the visual field. The weight sharing may also reduce the number of free parameters to be learned and consequently these models are easier to train compared to similar size networks where neurons in a layer are fully connected to every neuron in a neighboring layer… Visual tasks may leverage large scale neural networks for learning visual features. Recent work has demonstrated that DNNs comprised of convolutional layers (e.g., 5 convolutional layers) for learning visual features followed by fully connected layers (e.g., 3 fully connected layers) for combining these learned features to make a classification decision may achieve state-of-the-art performance on visual object recognition tasks.” [¶0030-¶0031; note: Examiner is interpreting a first data flow graph machine to be equivalent to a convolutional neural network.])
	Chilimbi further teaches sending a first set of convolutional layers to a second localized machine learning construct (“In at least one embodiment of model training, data values may be communicated across neuron layers. Since the model is partitioned across multiple machines (e.g., Machine 1, Machine 2, etc.) within each replica (e.g., 704A, 704N, etc.) some of this communication may be non-local.” [¶0059; see further ¶0064-¶0066, which disclose further details regarding sending convolutional layers to different machines.]).
	Gupta and Chilimbi are both in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Gupta’s teachings by substituting the communication data of Gupta with the convolutional layers taught by Chilimbi. One would have been motivated to send a first set of convolutional layers to another localized machine learning construct in order to receive and update weight values asynchronously to reduce communication traffic volume and offload additional computations. [¶0065-¶0066, Chilimbi]


Claims 4-6 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Chilimbi and further in view of Sun et al. ("Big Data Based Retail Recommender System Of Non E-Commerce", hereinafter "Sun").

Regarding claim 4, Gupta/Chilimbi teaches the method of claim 1 but fails to explicitly teach wherein the first localized machine learning construct comprises a first retail establishment.
Sun teaches wherein the first localized machine learning construct comprises a first retail establishment (“In this system, retail stores are regarded as users with different personality, and the result values of a Sales→Rate transform of the product sales are regarded as the users’ evaluation of the items. Through this mechanism, the sales data of some mature retail stores could be used to train the collaborative filtering recommender model, and sales prediction of other stores would be made before practical product promotion. So that non e-commerce companies can get guidance for product promotion at different retail stores and design their sales management strategies more precisely, Thus get more efficient sales with lower cost” [pg. 2, § Introduction, ¶5; See further pg. 4, under Figure 2. “Let the sales of store u on the sold product i is sui, the estimated sales of store v on the unsold product j”]).
Gupta, Chilimbi and Sun are all in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Sun teaches a MapReduce computing model for a retail recommender system. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to integrate the remote retail establishments of Sun with the processor devices at remote locations. One would have been motivated to make this modification in order to implement a scalable data processing system to provide sales recommendations. [Abstract, Sun]

Regarding claim 5, Gupta/Chilimbi/Sun teaches the method of claim 4, where Sun teaches wherein the second localized machine learning construct comprises a second retail establishment (“In this system, retail stores are regarded as users with different personality, and the result values of a Sales→Rate transform of the product sales are regarded as the users’ evaluation of the items. Through this mechanism, the sales data of some mature retail stores could be used to train the collaborative filtering recommender model, and sales prediction of other stores would be made before practical product promotion. So that non e-commerce companies can get guidance for product promotion at different retail stores and design their sales management strategies more precisely, Thus get more efficient sales with lower cost” [pg. 2, § Introduction, ¶5; See further pg. 4, under Figure 2. “Let the sales of store u on the sold product i is sui, the estimated sales of store v on the unsold product j”]).
Gupta, Chilimbi and Sun are all in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Sun teaches a MapReduce computing model for a retail recommender system. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to integrate the remote retail establishments of Sun with the processor devices at remote locations. One would have been motivated to make this modification in order to implement a scalable data processing system to provide sales recommendations. [Abstract, Sun]

Regarding claim 6, Gupta/Chilimbi teaches the method of claim 1 but fails to explicitly teach wherein the analyzing comprises determining a sales recommendation for a retail establishment associated with the second localized machine learning construct.
Sun teaches wherein the analyzing comprises determining a sales recommendation for a retail establishment associated with the second localized machine learning construct (“Through this mechanism, the sales data of some mature retail stores could be used to train the collaborative filtering recommender model, and sales prediction of other stores would be made before practical product promotion. So that non e-commerce companies can get guidance for product promotion at different retail stores and design their sales management strategies more precisely…Experimental results show that the retail recommender system can effectively predict sales for specific retail store and product.” [pg. 2, § Introduction, ¶5]).
Gupta, Chilimbi and Sun are all in the same field of endeavor of distributed deep learning. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Sun teaches a MapReduce computing model for a retail recommender system. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to integrate the remote retail establishments of Sun with the processor devices at remote locations. One would have been motivated to make this modification in order to implement a scalable data processing system to provide sales recommendations. [Abstract, Sun]

Claims 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Chilimbi and further in view of Naranjo et al. ("Evaluation of V2V and V2I Mesh Prototypes based on a Wireless Sensor Network", hereinafter "Naranjo").

Regarding claim 9, Gupta/Chilimbi teaches the method of claim 1 but fails to explicitly teach wherein the first localized machine learning construct comprises a first vehicle.
Naranjo teaches wherein the first localized machine learning construct comprises a first vehicle (“Three testbed vehicles were used in the field experiments (Figure 1). The first one was an instrumented Peugeot 307, equipped with an acquisition computer with CAN bus access, one non-contact speed sensor, a gyroscopic platform, an Astech G-12 GPS receiver, UMTS Internet access and a touch panel accessible to the driver in order for them to be able to manage the acquisition system.” [pg. 2081, § B. Instrumented Vehicles, ¶2]).
Gupta, Chilimbi and Naranjo are all in the same field of endeavor of distributed computing environments. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Naranjo teaches V2V mesh prototypes. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to integrate the remote devices embedded on a vehicle over a mesh network as taught by Naranjo. One would have been motivated to make this modification in order to implement an intelligent vehicle-to-vehicle mesh network to improve communications between vehicles. [Abstract, Naranjo]

Regarding claim 10, Gupta/Chilimbi/Naranjo teaches the method of claim 9, where Naranjo teaches wherein the second localized machine learning construct comprises a second vehicle (“The second vehicle was a Peugeot 207 without additional instrumentation. This means that it only equips the minimum equipment for the mesh network tests. In this case, this car equipped a Trimble R4 GPS receiver” [pg. 2081, § B. Instrumented Vehicles, ¶3]).
Gupta, Chilimbi and Naranjo are all in the same field of endeavor of distributed computing environments. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Naranjo teaches V2V mesh prototypes. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to integrate the remote devices embedded on a vehicle over a mesh network as taught by Naranjo. One would have been motivated to make this modification in order to implement an intelligent vehicle-to-vehicle mesh network to improve communications between vehicles. [Abstract, Naranjo]

Regarding claim 11, Gupta/Chilimbi/Naranjo teaches the method of claim 10, where Naranjo teaches further comprising transferring descriptors for the first set of convolutional layers using a mesh network comprising the first vehicle and the second vehicle (“In this paper we present an evaluation of V2V and V2I prototypes based on wireless sensor mesh networks, tested in real vehicles with real communication hardware” [Abstract; see further “The third experiment (Figure 4) shows the mesh performance of the VANET in the transmission of a message with origin in node 13 and destination in the infrastructure node. In this case, both nodes are not in a direct line of sight and the network reconfigures the message routes to assure the data transmission” [pg. 2084, left col, bottom para]; note: Chilimbi teaches the first set of convolutional layers]).
Gupta, Chilimbi and Naranjo are all in the same field of endeavor of distributed computing environments. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Naranjo teaches V2V mesh prototypes. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to transfer data such as the convolutional layers taught by Chilimbi from a vehicle to another vehicle over a mesh network as taught by Naranjo. One would have been motivated to make this modification in order to implement an intelligent vehicle-to-vehicle mesh network to improve communications between vehicles. [Abstract, Naranjo]

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Chilimbi and further in view of Liu et al. ("Sparse Convolutional Neural Networks", hereinafter "Liu").

Regarding claim 19, Gupta/Chilimbi teaches the method of claim 1 but fails to explicitly teach wherein the determining further comprises determining a first set of max pooling layers.
Liu teaches wherein the determining further comprises determining a first set of max pooling layers (“The model consists of 5 convolutional layers and two fully connected layers, interlaced with subsampling layers, local normalizing layers, max pooling layers, rectified linear unit layers and dropout layers.” [pg. 811, § 6.1 Setup, ¶1]).
Gupta, Chilimbi and Liu all disclose the use of convolutional neural networks. Gupta teaches a method for parallel deep learning synchronization among processing components. Chilimbi teaches a method of training deep neural networks. Liu discloses the use of sparse convolutional neural networks to lower computation cost. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Gupta/Chilimbi to implement a first set of max pooling layers as taught by Liu. The use of max pooling layers in CNNs is well known, and thus one would have been motivated to make this modification in order to train a CNN model at reduced computational cost. [pg. 806, § Introduction, ¶1, Liu]

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Shen et al. ("Multi-vehicle Motion Coordination Using V2V Communication") discloses using V2V communication.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/M.H.H./Examiner, Art Unit 2122     

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122