Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5-17, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Chang (“Deep Competitive Pathway Networks”, 2017).

	Regarding claim 1, Chang teaches A computing system, comprising: one or more processors; and one or more non-transitory computer-readable media that store a convolutional neural network implemented by the one or more processors, the convolutional neural network comprising: ([p. 270 §4.1] "On ImageNet, we trained from scratch for 100 epochs. As shown in Table 1, we constructed several CoPaNets with 2 pathways for ImageNet. The learning rate began at 0.1 and was divided by 10 after every 30 epochs. The model was implemented using Torch7 from the Github repository fb.resnet.torch" The GitHub repository for Torch linked by Chang contains Lua code that must necessarily be stored on non-transitory computer-readable media and executed by one or more processors.)
	one or more convolutional blocks, each of the one or more convolutional blocks comprising: ([p. 268 §3.2] "CoPaNets can be simply constructed by stacking CoPa units. Let the opponent factor k denote the number of pathway in a CoPa unit and the widening factor m multiplies the number of features in convolutional layers" [p. 269 §3.3] " to reuse the features from previous CoPa block (stacked by many CoPa units)." CoPa block taught as stack of CoPa units which are taught as containing convolutional layers, therefore, CoPa block interpreted as synonymous with convolutional block.)
	a linear bottleneck layer; and ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual type unit comprising three convolutional layers (1×1, 3×3, 1×1)... we placed BN and ReLU after all but the last convolutional layer in every pathway").
	one or more convolutional layers. ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual type unit comprising three convolutional layers"). 
	
	While Chang does not explicitly teach using a computing system with one or more processors and one or more non-transitory computer-readable media that store a convolutional neural network implemented by the one or more processors, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, that the Torch code in the referenced GitHub repository must necessarily be stored on one or more non-transitory computer-readable media and executed by one or more processors in order to implement the neural network.
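	For illustration only, the layer ordering described in the quoted passage of Chang §3.2 can be sketched as follows. The sketch is not taken from Chang or from the fb.resnet.torch repository; the function name `pathway_ops` and the string labels are hypothetical. It assumes only what the quotation states: a pathway of convolutional layers (1×1, 3×3, 1×1 for the "bottleneck" unit) with BN and ReLU placed after all but the last convolutional layer.

```python
from typing import List, Sequence


def pathway_ops(kernel_sizes: Sequence[int] = (1, 3, 1)) -> List[str]:
    """List the operations of one pathway in order (hypothetical helper).

    BN and ReLU follow every convolution except the last one, so no
    non-linearity is applied after the final convolution.
    """
    ops: List[str] = []
    last = len(kernel_sizes) - 1
    for index, k in enumerate(kernel_sizes):
        ops.append(f"conv{k}x{k}")
        if index != last:
            ops.extend(["bn", "relu"])
    return ops


bottleneck = pathway_ops()       # "bottleneck" unit: (1x1, 3x3, 1x1)
basic = pathway_ops((3, 3))      # "basic" unit: (3x3, 3x3)
```

	In this sketch the final 1×1 convolution of the "bottleneck" unit is the only convolution not followed by BN and ReLU.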

	Regarding claim 2, Chang teaches The computing system of claim 1, wherein: the linear bottleneck layer is configured to perform a linear transformation in a first dimensional space; and ([p. 266 §2.1] "x_l denotes the input feature of the l-th residual unit, id(x_l) performs identity mapping, and f_l represents layers of the convolutional transformation of the l-th residual unit" [p. 269 §3.2] "we placed BN and ReLU after all but the last convolutional layer in every pathway" See also 1x1 bottom convolution layer in FIG. 2(a).  First dimensional space interpreted as 1x1.)
	the one or more convolutional layers comprise one or more expansion convolutional layers that are configured to perform one or more non-linear transformations in a second dimensional space, the second dimensional space comprising a larger number of dimensions than the first dimensional space. ([p. 266 §2.1] "x_l denotes the input feature of the l-th residual unit, id(x_l) performs identity mapping, and f_l represents layers of the convolutional transformation of the l-th residual unit" [p. 269 §3.2] "we placed BN and ReLU after all but the last convolutional layer in every pathway" See also 3x3 convolution layer in FIG. 2(a).  Second dimensional space interpreted as 3x3.).

	Regarding claim 3, Chang teaches The computing system of claim 2, wherein the one or more convolutional layers comprise one or more separable convolutional layers. ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual-type unit comprising three convolutional layers (1×1, 3×3, 1×1). Alternatively, we could select a “basic” residual-type unit comprising two convolutional layers (3×3, 3×3)." Three depthwise convolutional layers in the (1×1, 3×3, 1×1) system interpreted as separable.  Similarly, the two 3x3 layers in the alternative system interpreted as separable.  See also FIG. 2.).

	Regarding claim 5, Chang teaches The computing system of claim 1, wherein the one or more convolutional blocks comprise a plurality of convolutional blocks arranged in a stack one after the other. ([p. 268 §3.2] "CoPaNets can be simply constructed by stacking CoPa units. Let the opponent factor k denote the number of pathway in a CoPa unit and the widening factor m multiplies the number of features in convolutional layers" [p. 269 §3.3] " to reuse the features from previous CoPa block (stacked by many CoPa units)." CoPa block taught as stack of CoPa units which are taught as containing convolutional layers, therefore, CoPa block interpreted as synonymous with convolutional block. FIG. 2 shows the blocks are stacked one after another.  Stacking as used by Chang interpreted as synonymous with arranging in a stack one after the other.). 

	Regarding claim 6, Chang teaches The computing system of claim 1, wherein at least one of the one or more convolutional blocks comprises a residual shortcut connection between its linear bottleneck layer and a subsequent linear bottleneck layer of a subsequent convolutional block. ([p. 269 §3.3] "The cross-block shortcuts were motivated by DenseNet (Huang et al., 2016a) which reused features from all previous layers with matching feature map sizes. In contrast to DenseNet (Huang et al., 2016a), we propose a novel feature reuse strategy: to reuse the features from previous CoPa block (stacked by many CoPa units). This is accomplished by adding identity shortcuts after pooling layers and concatenate with the output of the next block" Cross-block shortcut interpreted as synonymous with residual shortcut connection.). 

	Regarding claim 7, Chang teaches The computing system of claim 1, wherein at least one of the one or more convolutional blocks comprises a residual shortcut connection between its linear bottleneck layer and a next sequential linear bottleneck layer of a next sequential convolutional block. ([p. 269 §3.3] "The cross-block shortcuts were motivated by DenseNet (Huang et al., 2016a) which reused features from all previous layers with matching feature map sizes. In contrast to DenseNet (Huang et al., 2016a), we propose a novel feature reuse strategy: to reuse the features from previous CoPa block (stacked by many CoPa units). This is accomplished by adding identity shortcuts after pooling layers and concatenate with the output of the next block" Cross-block shortcut interpreted as synonymous with residual shortcut connection.). 

	Regarding claim 8, Chang teaches The computing system of claim 1, wherein, for each of the one or more convolutional blocks, the linear bottleneck layer is structurally prior to the one or more convolutional layers. ([p. 269 §3.2] "In practice, a “bottleneck” residual-type unit is deeper than a “basic” one, providing higher dimensional features. In the proposed CoPaNet, we placed BN and ReLU after all but the last convolutional layer in every pathway" Chang teaches that the bottleneck layer is the stack of BN and ReLU elements in the CoPa unit.  See also FIG. 2(a).). 

	Regarding claim 9, Chang teaches The computing system of claim 1, wherein, for each of the one or more convolutional blocks, the linear bottleneck layer is structurally subsequent to the one or more convolutional layers. ([p. 269 §3.2] "In practice, a “bottleneck” residual-type unit is deeper than a “basic” one, providing higher dimensional features. In the proposed CoPaNet, we placed BN and ReLU after all but the last convolutional layer in every pathway" Chang teaches that the bottleneck layer is the stack of BN and ReLU elements in the CoPa unit.  See also FIG. 2(a) which shows that the stack also comes subsequent to a convolutional layer.). 

	Regarding claim 10, Chang teaches A computing system, comprising: one or more processors; and one or more non-transitory computer-readable media that store a convolutional neural network implemented by the one or more processors, the convolutional neural network comprising: ([p. 270 §4.1] "On ImageNet, we trained from scratch for 100 epochs. As shown in Table 1, we constructed several CoPaNets with 2 pathways for ImageNet. The learning rate began at 0.1 and was divided by 10 after every 30 epochs. The model was implemented using Torch7 from the Github repository fb.resnet.torch" The GitHub repository for Torch linked by Chang contains Lua code that must necessarily be stored on non-transitory computer-readable media and executed by one or more processors.)
	one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: (See FIG. 2(c) in Chang. With respect to FIG. 5B of the instant specification, an inverted residual block is interpreted as synonymous with the proposed CoPaNet-R unit x N described in FIG. 2(c) of Chang.)
	one or more convolutional layers configured to provide a first output; and ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual-type unit comprising three convolutional layers" See FIG. 2(a).  Output of convolutional layer is passed directly to bottleneck layer.)
	a linear bottleneck layer configured to receive the first output and generate a second output; ([p. 269 §3.2] "In practice, a “bottleneck” residual-type unit is deeper than a “basic” one, providing higher dimensional features. In the proposed CoPaNet, we placed BN and ReLU after all but the last convolutional layer in every pathway")
	wherein the linear bottleneck layer is further configured to receive a residual and add the residual to the second output to provide a third output. (X_l interpreted as synonymous with residual which is received by bottleneck layer. In FIG. 2(a) Output of bottom convolution layer is added to the residual to provide the third output.). 
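	For illustration only, the addition of the residual described above can be sketched as follows. The sketch assumes the usual residual form in which the unit output is id(x_l) + f_l(x_l), using the symbols of the passage quoted from Chang §2.1; the names `residual_unit` and `double`, and the stand-in transformation, are hypothetical and are not taken from Chang.

```python
from typing import Callable, List

Vector = List[float]


def residual_unit(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return id(x) + f(x), added element by element (hypothetical helper)."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("identity shortcut requires matching sizes")
    return [xi + fi for xi, fi in zip(x, fx)]


def double(v: Vector) -> Vector:
    """Stand-in for the convolutional transformation f_l (assumption)."""
    return [2.0 * e for e in v]


# x_l = [1, 2, 3]; f_l(x_l) = [2, 4, 6]; the sum is the unit output.
out = residual_unit([1.0, 2.0, 3.0], double)
```

	Here the input x_l plays the role of the residual, and `out` corresponds to the sum of the residual and the transformed features.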

	Regarding claim 11, Chang teaches The computing system of claim 10, wherein the linear bottleneck layer is further configured to provide a second residual to a subsequent linear bottleneck layer of a subsequent inverted residual block. (FIG. 2(c) shows three subsequent CoPa units each with two sets of two subsequent bottleneck layers such that the bottleneck layers in the third CoPaNet-R CoPa unit would be receiving a second residual.  Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units.).

	Regarding claim 12, Chang teaches The computing system of claim 10, wherein the linear bottleneck layer is further configured to provide a second residual to a next sequential linear bottleneck layer of a next sequential inverted residual block. (FIG. 2(c) shows three subsequent CoPa units each with two sets of two subsequent bottleneck layers such that the bottleneck layers in the third CoPaNet-R CoPa unit would be receiving a second residual.  Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units.).

	Regarding claim 13, Chang teaches The computing system of claim 10, wherein the linear bottleneck layer is further configured to receive the residual from a previous linear bottleneck layer of a previous inverted residual block. (FIG. 2(c) shows three subsequent CoPa units each with two sets of two subsequent bottleneck layers such that the bottleneck layers in the third CoPaNet-R CoPa unit would be receiving a second residual.  Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units.).

	Regarding claim 14, Chang teaches The computing system of claim 10, wherein the linear bottleneck layer is further configured to receive the residual from a previous sequential linear bottleneck layer of a previous sequential inverted residual block. (FIG. 2(c) shows three subsequent CoPa units each with two sets of two subsequent bottleneck layers such that the bottleneck layers in the third CoPaNet-R CoPa unit would be receiving a second residual.  Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units.).

	Regarding claim 15, Chang teaches The computing system of claim 10, wherein at least one of the one or more inverted residual blocks comprises an initial linear bottleneck layer structurally positioned prior to the one or more convolutional layers of such at least one inverted residual block. (Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units. FIG. 2(a) shows that the bottleneck layers in the CoPa unit are both positioned prior to convolutional layers.). 

	Regarding claim 16, Chang teaches The computing system of claim 10, wherein: the linear bottleneck layer is configured to operate in a first dimensional space; and ([p. 266 §2.1] "x_l denotes the input feature of the l-th residual unit, id(x_l) performs identity mapping, and f_l represents layers of the convolutional transformation of the l-th residual unit" [p. 269 §3.2] "we placed BN and ReLU after all but the last convolutional layer in every pathway" See also 1x1 bottom convolution layer in FIG. 2(a).  First dimensional space interpreted as 1x1.)
	the one or more convolutional layers comprise one or more expansion convolutional layers that are configured to operate in a second dimensional space, the second dimensional space comprising a larger number of dimensions than the first dimensional space. ([p. 266 §2.1] "x_l denotes the input feature of the l-th residual unit, id(x_l) performs identity mapping, and f_l represents layers of the convolutional transformation of the l-th residual unit" [p. 269 §3.2] "we placed BN and ReLU after all but the last convolutional layer in every pathway" See also 3x3 convolution layer in FIG. 2(a).  Second dimensional space interpreted as 3x3.).

	Regarding claim 17, Chang teaches The computing system of claim 10, wherein the one or more convolutional layers comprise one or more separable convolutional layers. ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual-type unit comprising three convolutional layers (1×1, 3×3, 1×1). Alternatively, we could select a “basic” residual-type unit comprising two convolutional layers (3×3, 3×3)." Three depthwise convolutional layers in the (1×1, 3×3, 1×1) system interpreted as separable.  Similarly, the two 3x3 layers in the alternative system interpreted as separable.  See also FIG. 2.).

	Regarding claim 19, Chang teaches The computing system of claim 10, wherein the one or more convolutional blocks comprise a plurality of convolutional blocks arranged in a stack one after the other. ([p. 268 §3.2] "CoPaNets can be simply constructed by stacking CoPa units. Let the opponent factor k denote the number of pathway in a CoPa unit and the widening factor m multiplies the number of features in convolutional layers" [p. 269 §3.3] " to reuse the features from previous CoPa block (stacked by many CoPa units)." CoPa block taught as stack of CoPa units which are taught as containing convolutional layers, therefore, CoPa block interpreted as synonymous with convolutional block. FIG. 2 shows the blocks are stacked one after another.  Stacking as used by Chang interpreted as synonymous with arranging in a stack one after the other.). 

	Regarding claim 20, Chang teaches A neural network system implemented by one or more computers, wherein the neural network system is configured to receive an input image and to generate an output for the input image, and wherein the neural network system comprises: ([p. 270 §4.1] "On ImageNet, we trained from scratch for 100 epochs. As shown in Table 1, we constructed several CoPaNets with 2 pathways for ImageNet. The learning rate began at 0.1 and was divided by 10 after every 30 epochs. The model was implemented using Torch7 from the Github repository fb.resnet.torch" The GitHub repository for Torch linked by Chang contains Lua code that must necessarily be stored on non-transitory computer-readable media and executed by one or more processors.)
	a convolutional subnetwork comprising: ([p. 269 §3.3] "we propose a novel feature reuse strategy: to reuse the features from previous CoPa block (stacked by many CoPa units). This is accomplished by adding identity shortcuts after pooling layers and concatenate with the output of the next block. We refer to our model with the cross-block shortcuts as CoPaNet-R" See also FIG. 2(c).  CoPaNet-R interpreted as synonymous with convolutional subnetwork.)
	a linear bottleneck layer; and ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual type unit comprising three convolutional layers (1×1, 3×3, 1×1)... we placed BN and ReLU after all but the last convolutional layer in every pathway").
	one or more convolutional layers. ([p. 269 §3.2] "For each pathway, we adopted a “bottleneck” residual type unit comprising three convolutional layers"). 

	Regarding claim 21, Chang teaches The neural network system of claim 20, wherein the convolutional subnetwork comprises an inverted residual subnetwork, the inverted residual subnetwork comprising a residual shortcut connection between the linear bottleneck layer and one or more of: a next linear bottleneck layer of a next inverted residual subnetwork or a previous linear bottleneck layer of a previous inverted residual subnetwork. (FIG. 2(c) shows three subsequent CoPa units each with two sets of two subsequent bottleneck layers such that the bottleneck layers in the third CoPaNet-R CoPa unit would be receiving a second residual.  Inverted residual block interpreted as synonymous with CoPaNet-R residual section such that each inverted residual block contains one of the sequential CoPa units.).

Claims 4 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chang in view of Huang (US 20190179674 A1).

	Regarding claim 4, Chang teaches The computing system of claim 3.
	However, Chang does not explicitly teach each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.  

Huang, in the same field of endeavor, teaches each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output. ([¶0038] "In some cases, a convolution layer may be a depthwise separable convolution. In such scenario, a convolution layer may be factorized into a depthwise convolution and a 1×1 pointwise convolution to combine the outputs of the depthwise convolution."). 

	Chang and Huang are both directed towards stacked convolutional neural networks with separable depthwise convolutional layers.  Therefore, Chang and Huang are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Chang with the teachings of Huang by using both depthwise and pointwise convolution operations. Huang teaches as a motivation for combination ([¶0034] “Systems and method provided herein may have the advantage of lower costs and power consumption, and higher performance over current technologies. An improved computation performance may be achieved at least by a computing unit capable of performing parallel operations. Data may be processed in parallel for efficient computation. The parallel operations may correspond to data processing in a layer of a convolutional neural network and feed to a next layer in a pipeline manner.”).  
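	For illustration only, the factorization described in Huang ¶0038 (a depthwise convolution followed by a 1×1 pointwise convolution that combines the depthwise outputs) can be sketched as follows. The sketch is not taken from Huang or Chang; the function names, the nested-list data layout, and the choice of stride 1 without padding are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def depthwise_conv(channels: List[Matrix], kernels: List[Matrix]) -> List[Matrix]:
    """Apply one k x k kernel to each input channel separately.

    Assumes stride 1 and no padding (illustrative choices).
    """
    out: List[Matrix] = []
    for chan, kern in zip(channels, kernels):
        k = len(kern)
        rows = len(chan) - k + 1
        cols = len(chan[0]) - k + 1
        out.append([
            [
                sum(chan[r + i][c + j] * kern[i][j]
                    for i in range(k) for j in range(k))
                for c in range(cols)
            ]
            for r in range(rows)
        ])
    return out


def pointwise_conv(channels: List[Matrix], weights: List[List[float]]) -> List[Matrix]:
    """Apply a 1 x 1 convolution: each output channel is a weighted
    sum of the input channels at the same spatial position."""
    rows, cols = len(channels[0]), len(channels[0][0])
    return [
        [
            [sum(w * chan[r][c] for w, chan in zip(w_row, channels))
             for c in range(cols)]
            for r in range(rows)
        ]
        for w_row in weights
    ]


def separable_conv(channels: List[Matrix], kernels: List[Matrix],
                   weights: List[List[float]]) -> List[Matrix]:
    """Depthwise step first, then pointwise step on its output."""
    return pointwise_conv(depthwise_conv(channels, kernels), weights)


# Two 2x2 input channels, one 2x2 kernel per channel, two output channels.
inputs = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
ones = [[1.0, 1.0], [1.0, 1.0]]
depthwise_out = depthwise_conv(inputs, [ones, ones])
result = separable_conv(inputs, [ones, ones], [[1.0, 1.0], [2.0, 0.0]])
```

	In this sketch the depthwise step never mixes channels, and the pointwise step is the only place where information from different channels is combined.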

	Regarding claim 18, Chang teaches The computing system of claim 17.
	However, Chang does not explicitly teach each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.

Huang, in the same field of endeavor, teaches each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output. ([¶0038] "In some cases, a convolution layer may be a depthwise separable convolution. In such scenario, a convolution layer may be factorized into a depthwise convolution and a 1×1 pointwise convolution to combine the outputs of the depthwise convolution."). 

	Chang and Huang are both directed towards stacked convolutional neural networks with separable depthwise convolutional layers.  Therefore, Chang and Huang are analogous art in the same field of endeavor.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Chang with the teachings of Huang by using both depthwise and pointwise convolution operations. Huang teaches as a motivation for combination ([¶0034] “Systems and method provided herein may have the advantage of lower costs and power consumption, and higher performance over current technologies. An improved computation performance may be achieved at least by a computing unit capable of performing parallel operations. Data may be processed in parallel for efficient computation. The parallel operations may correspond to data processing in a layer of a convolutional neural network and feed to a next layer in a pipeline manner.”).  

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Szegedy (“Going deeper with convolutions”, 2015) is considered to be relevant to the claimed invention because it shares many similar convolutional neural network features with the claimed invention.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/SB/Examiner, Art Unit 2124
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126