DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 13-14, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Loffe et al (US 2016/0217368) in view of Hechtman et al (US 2020/0125949).
	For claim 1, Loffe et al teach a method comprising: 
receiving, at a batch normalization layer of a neural network associated with a first computing device, a first layer output from a first neural network layer of the neural network, wherein the first layer output is based on a local batch of training examples of a global batch, the global batch comprising training examples (e.g. abstract, paragraph 4, figure 1, “Batch Normalization layer 108 receives Layer A outputs 106”; paragraph 26 discloses “In particular, the neural network system 100 can be trained on multiple batches of training examples in order to determine trained values of the parameters of the neural network layers. A batch of training examples is a set of multiple training examples”, so “multiple batches of training examples” corresponds to the claimed “local batch and remote batch”); 
determining, based at least in part on a component of the first layer output, as a local batch normalization statistic, a first value based at least in part on local batch mean and a second value based at least in part on a local batch variance for the local batch (e.g. paragraph 39: “…the batch normalization layer computes…means and standard deviations to generate a respective normalized output for each of the training examples in the batch”); 
subsequent to the determining of the first value and the second value, transmitting the local batch normalization statistic to a second computing device training a copy of the neural network using the remote batch (e.g. figure 1, “Neural Network Layer B receives Batch Normalization layer outputs”); 
receiving, from the second computing device, a remote batch normalization statistic associated with the remote batch (e.g. paragraph 39: “…the batch normalization layer computes…means and standard deviations to generate a respective normalized output for each of the training examples in the batch”);
determining, based at least in part on the local batch normalization statistic, a global batch mean and a global batch variance (e.g. paragraph 39: “…the batch normalization layer computes…means and standard deviations to generate a respective normalized output for each of the training examples in the batch”); and 
generating a normalized component of a normalized output associated with the component of the first layer output based at least in part on the global batch mean and the global batch variance (e.g. figure 1, Neural Network output). 
	Loffe et al do not further specify a remote batch of training examples. Hechtman et al teach a remote batch of training examples (e.g. paragraph 5, “…The distributed means and variances that are used by a given device are determined based on per-replica means and variances computed by the given device and per-replica means and variances computed by other devices that are in the same sub-group as the given device.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Hechtman et al into the teaching of Loffe et al to improve the effectiveness of the training process (e.g. paragraph 8, Hechtman et al). 
	Claims 7 and 16 are rejected for the same reasons as discussed for claim 1 above, wherein Loffe et al teach a computer program executed by a data processing apparatus in paragraph 5. 
	For claims 13 and 20, Loffe et al do not specify that the first portion comprises a first number of training examples and the second portion comprises a second number of training examples different from the first number. Hechtman et al teach that the first portion comprises a first number of training examples and the second portion comprises a second number of training examples different from the first number (e.g. paragraph 8, “2, 4, or 8” “512 or 1024”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Hechtman et al into the teaching of Loffe et al to improve the effectiveness of the training process (e.g. paragraph 8, Hechtman et al).
	For claim 14, Loffe et al do not specify that the remote batch normalization statistic comprises, for the remote batch, a remote batch mean and a remote batch variance. Hechtman et al teach that the remote batch normalization statistic comprises, for the remote batch, a remote batch mean and a remote batch variance (e.g. paragraph 5, “…The distributed means and variances that are used by a given device are determined based on per-replica means and variances computed by the given device and per-replica means and variances computed by other devices that are in the same sub-group as the given device.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Hechtman et al into the teaching of Loffe et al to improve the effectiveness of the training process (e.g. paragraph 8, Hechtman et al).
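	For illustration only, the way a global batch mean and a global batch variance can be derived from a local batch normalization statistic and a remote batch normalization statistic may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Loffe et al, Hechtman et al, or the present application: the count-weighted pooling formula is a standard statistical identity chosen for the example, and the function names local_stats, combine_stats and normalize, the eps value, and the sample array shapes are all hypothetical.

```python
import numpy as np

def local_stats(x):
    """Per-feature mean, biased variance, and example count for one batch (hypothetical helper)."""
    return x.mean(axis=0), x.var(axis=0), x.shape[0]

def combine_stats(stats):
    """Merge (mean, variance, count) triples from several devices into global statistics."""
    total = sum(n for _, _, n in stats)
    # Global mean is the count-weighted average of the per-device means.
    global_mean = sum(n * m for m, _, n in stats) / total
    # Each device's second moment is var + mean^2; pool those, then subtract global_mean^2.
    global_second_moment = sum(n * (v + m ** 2) for m, v, n in stats) / total
    global_var = global_second_moment - global_mean ** 2
    return global_mean, global_var

def normalize(x, mean, var, eps=1e-5):
    """Normalize each component with the global statistics (eps is an assumed constant)."""
    return (x - mean) / np.sqrt(var + eps)

# Two batches of different sizes stand in for the local and remote batches.
rng = np.random.default_rng(0)
local = rng.normal(size=(4, 3))
remote = rng.normal(size=(6, 3))

g_mean, g_var = combine_stats([local_stats(local), local_stats(remote)])
normalized_local = normalize(local, g_mean, g_var)
```

Because the statistics are weighted by example count, the sketch also covers the case in which the two portions contain different numbers of training examples; the pooled values equal the mean and variance computed directly over the concatenated batch.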
	
Allowable Subject Matter
Claims 2-6, 8-12, 15 and 17-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAQUAN ZHAO whose telephone number is (571)270-1119 or email daquan.zhao1@uspto.gov.  If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tran Thai Q, can be reached on (571)272-7382.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAQUAN ZHAO/Primary Examiner, Art Unit 2484