DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The Office Action is in response to claims filed on 4/23/2018 where claims 1 - 20 are pending and ready for examination.

The information disclosure statement (IDS) submitted on 4/23/2018 is in compliance with the provisions of 37 CFR 1.97. The information disclosure statement is being considered by the examiner.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1, 3 – 4, 6 – 8, 11, and 13 – 15 are rejected under 35 USC 102(a)(1) as being anticipated by Chilimbi, “Project Adam: Building an Efficient and Scalable Deep Learning Training System”.
Regarding claim 1, Chilimbi discloses a method, comprising:
executing a distributed deep learning (DL) model training process to train model parameters of a DL model using a plurality of worker nodes executing on one or more server nodes of a computing system (Chilimbi;
see e.g. Page 574, Section 2.3 Distributed Deep Learning Training “ ... large models are partitioned across multiple model worker machines ... a common set of parameters ...”
see e.g. Page 573, Figure 5, Distributed Training System Architecture comprising model workers;
see e.g. Page 574, Section 3.2 Model Training “ ... model worker machines as shown in Fig. 6 ...”
see e.g. Page 575, Section 3.2.3 Reducing Memory copies “ ... network library on top of the Windows socket API ... output values need to be communicated across the network ...”
see e.g. Page 576, Section 3.3 Global Parameter Server “ ... speed up network, update, and disk IO processing ...”
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”); and
(Chilimbi; 
see e.g. Page 576, Section 3.3 Global Parameter Server “The parameter server is in constant communication with the model training machines received updates to model parameters ...”
see e.g. Fig. 7. Parameter Server Node Architecture
see e.g. Figure 5. Distributed Training System Architecture)

Regarding claim 3, Chilimbi discloses the method of claim 1, wherein executing the parameter server within the networking infrastructure of the computing system comprises executing a parameter server node on a physical network device of the networking infrastructure (Chilimbi;
see e.g. Fig. 7. Parameter Server Node Architecture illustrating the parameter server node on a physical network device).
Regarding claim 4, Chilimbi discloses the method of claim 3, wherein the physical network device comprises a network interface card installed in a server node (Chilimbi;
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).

Regarding claim 6, Chilimbi discloses the method of claim 1, wherein executing the parameter server within the networking infrastructure of the computing system comprises executing a parameter server node on a virtual network element connected to or executing on a server node in the computing system (Chilimbi; Chilimbi teaches the utilization of a virtual network interface card (i.e. virtual network element);
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).
Regarding claim 7, Chilimbi discloses the method of claim 6, wherein the virtual network element comprises one of a virtual network interface card and a virtual switch (Chilimbi; Chilimbi teaches the utilization of a virtual network interface card (i.e. virtual network element);
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).
Regarding claim 8, Chilimbi discloses the method of claim 1, wherein executing the parameter server within the networking infrastructure of the computing system comprises distributing the parameter server over a plurality of network elements within the networking infrastructure of the computing system (Chilimbi; Chilimbi teaches the utilization of multiple network interface cards;
see e.g. Page 577, Section 4.2 System Hardware
“ ... Each machine is a HP Proliant Server ... two 10Gb NICs and one 1 Gb NIC ...”).
Regarding claim 11, Chilimbi discloses an article of manufacture comprising a processor-readable storage medium having stored program code of one or more software programs, wherein the program code is executable by one or more processors to implement method steps comprising:
executing a distributed deep learning (DL) model training process to train model parameters of a DL model using a plurality of worker nodes executing on one or more server nodes of a computing system (Chilimbi;
see e.g. Page 574, Section 2.3 Distributed Deep Learning Training “ ... large models are partitioned across multiple model worker machines ... a common set of parameters ...”
see e.g. Page 573, Figure 5, Distributed Training System Architecture comprising model workers;
see e.g. Page 574, Section 3.2 Model Training “ ... model worker machines as shown in Fig. 6 ...”
see e.g. Page 575, Section 3.2.3 Reducing Memory copies “ ... network library on top of the Windows socket API ... output values need to be communicated across the network ...”
see e.g. Page 576, Section 3.3 Global Parameter Server “ ... speed up network, update, and disk IO processing ...”
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”); and
(Chilimbi; 
see e.g. Page 576, Section 3.3 Global Parameter Server “The parameter server is in constant communication with the model training machines received updates to model parameters ...”
see e.g. Fig. 7. Parameter Server Node Architecture
see e.g. Figure 5. Distributed Training System Architecture).
Regarding claim 13, Chilimbi discloses the article of manufacture of claim 11, wherein executing the parameter server within the networking infrastructure of the computing system comprises executing a parameter server node on a physical network device of the networking infrastructure (Chilimbi; see e.g. Fig. 7. Parameter Server Node Architecture illustrating the parameter server node on a physical network device), wherein the physical network device comprises at least one of a network interface card installed in a server node (see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”) and a computational switch device which is network connected to the one or more server nodes of the computing system.
The Examiner notes NIC is equivalent to a network interface card.
Regarding claim 14, Chilimbi discloses the article of manufacture of claim 11, wherein executing the parameter server within the networking infrastructure of the computing system comprises executing a parameter server node on a virtual network element connected to or executing on a server node in the computing system, wherein the virtual network element (Chilimbi; Chilimbi teaches the utilization of a virtual network interface card (i.e. virtual network element);
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).
Regarding claim 15, Chilimbi discloses the article of manufacture of claim 11, wherein executing the parameter server within the networking infrastructure of the computing system comprises distributing the parameter server over a plurality of network elements within the networking infrastructure of the computing system (Chilimbi; Chilimbi teaches the utilization of multiple network interface cards;
see e.g. Page 577, Section 4.2 System Hardware
“ ... Each machine is a HP Proliant Server ... two 10Gb NICs and one 1 Gb NIC ...”).


Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2, 12, 16, and 18 – 20 are rejected under 35 USC 103 as being unpatentable over Chilimbi in view of Bernat (US 2019/0042515).
Regarding claim 2, Chilimbi discloses the method of claim 1. Chilimbi suggests hardware accelerator devices (see e.g. Page 581, Section 5 Related Work “ ... hardware acceleration for neural network models ...”), but does not expressly disclose wherein the plurality of worker nodes comprise virtual worker nodes that execute on hardware accelerator devices.
However, in analogous art, Bernat discloses:
wherein the plurality of worker nodes comprise virtual worker nodes that execute on hardware accelerator devices (Bernat; Bernat teaches the utilization of accelerator hardware within the context of deep learning models;
see e.g. [0034] “ ... each server 146 may host a standalone operating system and provide a server function, or servers may be virtualized ...”
see e.g. [0082] “ ... accelerator hardware 420 may be, for example an FPGA, or a GPU, and may be programmed with a deep learning model ... accelerator hardware programmed to perform a training task may be referred to as a “deep learning module” ... accelerator hardware with the deep learning model ...”)
Therefore it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chilimbi with Bernat’s conventional 
Regarding claim 12, claim 12 comprises the same and/or similar subject matter as claim 2 and is considered an obvious variation; therefore it is rejected under the same rationale.
Regarding claim 16, Chilimbi discloses a computing system, comprising:
a server cluster comprising a plurality of server nodes (Chilimbi; see e.g. Page 577, Section 4.2 System Hardware “ ... a cluster of 120 identical machines ... two 10Gb NICs and one 1 Gb NIC ...”), configured to execute a plurality of worker nodes to perform a distributed deep learning (DL) model training process to train model parameters of a DL model (Chilimbi; see e.g. Page 574, Section 2.3 Distributed Deep Learning Training “ ... large models are partitioned across multiple model worker machines ... a common set of parameters ...”
see e.g. Page 573, Figure 5, Distributed Training System Architecture comprising model workers;
see e.g. Page 574, Section 3.2 Model Training “ ... model worker machines as shown in Fig. 6 ...”); and
networking infrastructure to network connect the plurality of server nodes within the server cluster (Chilimbi; see e.g. Page 577, Section 4.2 System Hardware “ ... a cluster of 120 identical machines ... two 10Gb NICs and one 1 Gb NIC ...”), wherein the networking infrastructure is configured to execute a parameter server which aggregates local model parameters computed by the plurality of worker nodes and distributes aggregated model (Chilimbi;
see e.g. Page 576, Section 3.3 Global Parameter Server “The parameter server is in constant communication with the model training machines received updates to model parameters ...”
see e.g. Fig. 7. Parameter Server Node Architecture
see e.g. Figure 5. Distributed Training System Architecture).

However, Chilimbi does not address the conventional utilization of accelerator devices and therefore does not expressly disclose:
wherein the server nodes comprise accelerator devices.
However, in analogous art, Bernat discloses:
wherein the server nodes comprise accelerator devices (Bernat; Bernat teaches the utilization of accelerator hardware within the context of deep learning models;
see e.g. [0034] “ ... each server 146 may host a standalone operating system and provide a server function, or servers may be virtualized ...”
see e.g. [0082] “ ... accelerator hardware 420 may be, for example an FPGA, or a GPU, and may be programmed with a deep learning model ... accelerator hardware programmed to perform a training task may be referred to as a “deep learning module” ... accelerator hardware with the deep learning model ...”)

Regarding claim 18, Chilimbi in view of Bernat discloses the computing system of claim 16, wherein the parameter server executes on a physical network device of the networking infrastructure, wherein the physical network device comprises at least one of a network interface card installed in a server node and a computational switch device which is network connected to the server node (Chilimbi;
see e.g. Fig. 7. Parameter Server Node Architecture illustrating the parameter server node on a physical network device;
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).
Regarding claim 19, Chilimbi in view of Bernat discloses the computing system of claim 16, wherein the parameter server executes on a virtual network element connected to or executing on a server node in the computing system, wherein the virtual network element comprises one of a virtual network interface card and a virtual switch (Chilimbi; Chilimbi teaches the utilization of a virtual network interface card (i.e. virtual network element);
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).
Regarding claim 20, Chilimbi in view of Bernat discloses the computing system of claim 16, wherein the parameter server is distributed over a plurality of network elements of the networking infrastructure of the computing system (Chilimbi; Chilimbi teaches the utilization of a virtual network interface card (i.e. virtual network element);
see e.g. Page 577, Section 3.3.4 Communication Isolation “Parameter server machines have two 10Gb NICs ... maximize network bandwidth ...”
The Examiner notes NIC is equivalent to a network interface card).



Claim 17 is rejected under 35 USC 103 as being unpatentable over Chilimbi in view of Bernat and in further view of Egi (US 2015/0026385).
Regarding claim 17, Chilimbi in view of Bernat discloses the computing system of claim 16, but does not expressly disclose wherein the plurality of worker nodes comprise virtual worker nodes that execute on the accelerator devices.
However, in analogous art, Egi discloses:
wherein the plurality of worker nodes comprise virtual worker nodes that execute on the accelerator devices (Egi; Egi teaches virtual worker nodes executing on GPU/FPGA acceleration hardware; see e.g. [0036] “ ... compute entity (i.e., Virtual Machine/Container/Application/Task/Job/etc.) is created on a worker processor ... resource etype ... GPU, FPGA ”)
Therefore it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chilimbi with Egi’s worker acceleration scheme. The motivation being the combined invention provides for increased efficiencies in achieving deep learning.

Claim 5 is rejected under 35 USC 103 as being unpatentable over Chilimbi in view of Tomasicchio (US 2019/0028185).
Regarding claim 5, Chilimbi discloses the method of claim 3, and discloses switches (see e.g. Page 577, Section 4.2 System Hardware “ ... switches ...”) but does not address their conventional computational abilities known to those of ordinary skill in the art, and therefore does not expressly disclose wherein the physical network device comprises a computational switch device which is network connected to the one or more server nodes of the computing system.

However, in analogous art, Tomasicchio discloses:
wherein the physical network device comprises a computational switch device which is network connected to the one or more server nodes of the computing system (Tomasicchio; Tomasicchio discloses a switch and its related computational abilities;
see e.g. [0166] “ ... fast core algorithm designed for switch configuration with direct mathematical matrix-based computation ...”
see e.g. [0160] “ ... BSP 11 is in charge of switching according to the routing map ...”
see e.g. [0095] “ ... Burst switching processor ...”)

Therefore it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chilimbi with Tomasicchio's conventional computational switching scheme. The motivation being the combined invention provides for increased efficiencies in transmittal and reception of data.



Claims 9 and 10 are rejected under 35 USC 103 as being unpatentable over Chilimbi in view of Volz, “Virtualization – divide and multiply our servers”, November 2002.
Regarding claim 9, Chilimbi discloses the method of claim 8, wherein distributing the parameter server over the plurality of network elements comprises:
logically dividing the parameter server into a plurality of local parameter server nodes (Chilimbi; Chilimbi teaches utilizing a multiplicity of parameter servers (i.e. logically dividing the parameter server);
see e.g. Section 3.2.6 Parameter Server Communication “ ... parameter server machines ...”); and
(Chilimbi; see e.g. Page 577, Section 4.2 System Hardware
“ ... Each machine is a HP Proliant Server ... two 10Gb NICs and one 1 Gb NIC ...”)
As evidence of logically dividing servers, Volz discloses:
logically dividing the parameter server into a plurality of local parameter server nodes (Volz; Volz teaches one of ordinary skill in the art the conventional dividing or splitting of servers;
“ ... each physical server can theoretically be split in to several dozens – or even hundreds – of virtual servers, performing whatever server function you need ...”)
Therefore it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chilimbi to include the system optimization of dividing and/or splitting servers logically. The motivation being the combined solution provides for increased efficiencies in processing model parameter data.
Regarding claim 10, Chilimbi discloses the method of claim 8, wherein distributing the parameter server over the plurality of network elements comprises:
logically dividing the parameter server into a plurality of local parameter server nodes (Chilimbi; Chilimbi teaches dividing parameter servers among a cluster of parameter nodes;
see e.g. Page 577, Section 4.2 Hardware “ ... a cluster of 120 identical machines ...”);
(Chilimbi; Chilimbi teaches the local parameter server nodes utilize a networking infrastructure comprising network interface cards;
see e.g. Page 577, Section 4.2 Hardware “ ... two 10Gb NICs and one 1 Gb NIC ...”);
designating one of the local parameter server nodes of the parameter server to be a master parameter server node (Chilimbi; Chilimbi teaches one of the parameter nodes is assigned the role of a global parameter node (i.e. master parameter server node);
see e.g. Section 3. ADAM System Architecture “ ... update a shared model via a global parameter server ...”); and
utilizing the master parameter server node to aggregate the local model parameters provided by other parameter server nodes of the parameter server, and to distribute the aggregated model parameter to the other parameter server nodes of the parameter server (Chilimbi; Chilimbi teaches the global parameter server aggregates and distributes model parameter data;
see e.g. Section 2.3 Distributed Deep Learning Training “ ... update a shared model via a global parameter server ... For speed of operating each model replica operates in parallel and asynchronously publishes model weights updates to and receives updated parameter weights ...”
see e.g. Section 3. ADAM System Architecture “ ... update a shared model via a global parameter server ...”
see e.g. Section 3.2.2 Fast Weight Updates
see e.g. Section 3.2.6 Parameter Server Communication “We have implemented two different protocols for updating parameter weights ...”).

As evidence of logically dividing servers, Volz discloses:
logically dividing the parameter server into a plurality of local parameter server nodes (Volz; Volz teaches one of ordinary skill in the art the conventional dividing or splitting of servers;
“ ... each physical server can theoretically be split in to several dozens – or even hundreds – of virtual servers, performing whatever server function you need ...”)
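
For illustration only, and not as a characterization of Chilimbi, Volz, or the applicant's implementation, the arrangement recited in claim 10 (local parameter server nodes, one of which is designated a master that aggregates and redistributes model parameters) may be sketched as follows. All class, function, and variable names are hypothetical, and simple averaging is assumed as the aggregation rule; neither the claim language nor the cited references specify it.

```python
from typing import Dict, List

Params = Dict[str, float]


def aggregate(param_sets: List[Params]) -> Params:
    """Average each named parameter across the given sets (assumed aggregation rule)."""
    if not param_sets:
        raise ValueError("no parameter sets to aggregate")
    keys = param_sets[0].keys()
    return {k: sum(p[k] for p in param_sets) / len(param_sets) for k in keys}


class LocalParameterServerNode:
    """Hypothetical local parameter server node serving a group of worker nodes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.params: Params = {}

    def collect(self, worker_params: List[Params]) -> Params:
        # Aggregate the local model parameters computed by this node's workers.
        self.params = aggregate(worker_params)
        return self.params

    def receive(self, aggregated: Params) -> None:
        # Accept the aggregated parameters distributed by the master node.
        self.params = dict(aggregated)


class MasterParameterServerNode(LocalParameterServerNode):
    """Hypothetical local node that is additionally designated as the master."""

    def synchronize(self, others: List[LocalParameterServerNode]) -> Params:
        # Aggregate the master's own parameters with those of the other nodes,
        # then distribute the result back to every other node.
        aggregated = aggregate([self.params] + [n.params for n in others])
        self.params = aggregated
        for node in others:
            node.receive(aggregated)
        return aggregated


master = MasterParameterServerNode("ps0")
other = LocalParameterServerNode("ps1")
master.collect([{"w": 1.0}, {"w": 3.0}])  # workers attached to ps0
other.collect([{"w": 4.0}])               # worker attached to ps1
result = master.synchronize([other])
```

In this sketch each local node first aggregates its own workers' parameters, and only the per-node results cross between nodes, which is one plausible reading of the "local" and "master" roles in the claim.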






Any inquiry concerning this communication or earlier communications from the Examiner should be directed to TODD L. BARKER whose telephone number is (571) 270 0257. The Examiner can normally be reached on Monday through Friday, 7:30am to 5:00pm.

If attempts to reach the Examiner by telephone are unsuccessful, the Examiner's supervisor Vivek Srivastava can be reached on (571) 272 7304.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TODD L BARKER/Primary Examiner, Art Unit 2449