Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


DETAILED OFFICE ACTION

Status of Claims

Claims 1-26 are pending in this Office Action.



Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


1.	Claims 1, 13, 15, 17, 23, 24, 25 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Das et al. (USPUB 20180322390) in view of ALOIMONOS et al. (USPUB 20160221190).


As per claim 1, Das et al. teaches A method of training a multi-layer neural network model (multiple layer neural network shown within FIG. 11A and Paragraph [0197]), comprising:
determining a first network model and a second network model (Paragraph [0190]-“…Convolution is a specialized kind of mathematical operation performed by two functions to produce a third function that is a modified version of one of the two original functions.  In convolutional network terminology, the first function to the convolution can be referred to as the input, while the second function can be referred to as the convolution kernel.  The output may be referred to as the feature map….” And Paragraphs [0192] and [0194]), 
the first network model providing information for training the second network model (Paragraphs [0197-0198] - “…The kernels associated with the convolutional layers perform convolution operations, the output of which is sent to the next layer.  The dimensionality reduction performed within the convolutional layers is one aspect that enables the CNN to scale to process large images. …”); setting a downscaling layer for at least one layer in the first network model (Paragraph [0162] - “…a downscaled value can be output and consumers of the downscale value can store the scale factor from the scale factor register in conjunction with the output value for use when performing subsequent operations using scaled values….”), and transmitting filter parameters of the downscaling layer to the second network model as training information (Paragraph [0191] - “…The nodes in the CNN input layer are organized into a set of "filters" (feature detectors inspired by the receptive fields found in the retina), and the output of each set of filters is propagated to nodes in successive layers of the network.  The computations for a CNN include applying the convolution mathematical operation to each filter to produce the output of that filter….”).

	Das et al. does not explicitly teach wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model;
	However within analogous art, ALOIMONOS et al. teaches wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model (The similarity of the network layers taught within Paragraph [0048] - “The first convolution layer can have 32 filters of size 5x5, the second convolution layer can have 32 filters of size 5x5…”);
	One of ordinary skill in the art would have been motivated to combine the teaching of ALOIMONOS et al. with the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. because the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. provides a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. within the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. in order to provide a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
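
For illustration only, the structural relationship recited in the disputed limitation (a downscaling layer whose number of filters and filter kernel are identical to those of a layer to be trained) can be sketched as a simple comparison. The class name ConvLayerSpec, the function same_filter_shape, and the example values are hypothetical; the 32 filters of size 5x5 are taken from the passage of ALOIMONOS et al. quoted above, and the sketch is not drawn from the claims or from either reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayerSpec:
    """Hypothetical description of a convolution layer (not from any cited reference)."""
    num_filters: int   # number of filters (output channels)
    kernel_size: int   # side length of a square filter kernel


def same_filter_shape(downscaling: ConvLayerSpec, to_train: ConvLayerSpec) -> bool:
    """Return True when both layers have the same filter count and kernel size."""
    return (downscaling.num_filters == to_train.num_filters
            and downscaling.kernel_size == to_train.kernel_size)


# Example values: two layers with 32 filters of size 5x5, and one that differs.
first_layer = ConvLayerSpec(num_filters=32, kernel_size=5)
second_layer = ConvLayerSpec(num_filters=32, kernel_size=5)
other_layer = ConvLayerSpec(num_filters=64, kernel_size=3)

identical = same_filter_shape(first_layer, second_layer)
different = same_filter_shape(first_layer, other_layer)
```

The sketch compares layer shapes only; it does not address whether the compared layers belong to different network models, which is a separate element of the limitation.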


As per claim 13, Das et al. teaches A system for training a multi-layer neural network model (multiple layer neural network shown within FIG. 11A and Paragraph [0197]), comprising:
a server which stores at least one first network model( Paragraph [0190]- “…Convolution is a specialized kind of mathematical operation performed by two functions to produce a third function that is a modified version of one of the two original functions.  In convolutional network terminology, the first function to the convolution can be referred to as the input, while the second function can be referred to as the convolution kernel.  The output may be referred to as the feature map….” And Paragraphs [0192] and [0194]), the first network model providing information for training the second network model (Paragraphs [0197-0198] - “…The kernels associated with the convolutional layers perform convolution operations, the output of which is sent to the next layer.  The dimensionality reduction performed within the convolutional layers is one aspect that enables the CNN to scale to process large images. …”) ;  wherein the server sets a downscaling layer for at least one layer of the first network model and outputs filter parameters of the downscaling layer as training information ( Paragraph [0162]- “…a downscaled value can be output and consumers of the downscale value can store the scale factor from the scale factor register in conjunction with the output value for use when performing subsequent operations using scaled values….”), and a terminal which stores the second network model, the terminal being used to train layers in the second network model by using training information output by the server ( Paragraph [0191]- “…The nodes in the CNN input layer are organized into a set of "filters" (feature detectors inspired by the receptive fields found in the retina), and the output of each set of filters is propagated to nodes in successive layers of the network.  The computations for a CNN include applying the convolution mathematical operation to each filter to produce the output of that filter….”).  
	Das et al. does not explicitly teach wherein the number of filters and filter kernels of the downscaling layer are identical to those of layers to be trained in the second network model; 
	However within analogous art, ALOIMONOS et al. teaches wherein the number of filters and filter kernels of the downscaling layer are identical to those of layers to be trained in the second network model (The similarity of the network layers taught within Paragraph [0048] - “The first convolution layer can have 32 filters of size 5x5, the second convolution layer can have 32 filters of size 5x5…”);
	One of ordinary skill in the art would have been motivated to combine the teaching of ALOIMONOS et al. with the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. because the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. provides a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. within the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. in order to provide a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).

As per claim 15, Combination of Das et al. and ALOIMONOS et al. teach claim 13,
Das et al. teaches wherein the terminal initiates a picture processing request to the server (Paragraphs [0139] and [0177-0178]), the picture processing request including a terminal identity and pictures requested to be processed (Table 2 shows identification and Paragraphs [0195-0196]); the server further determines the terminal initiating the picture processing request and the second network model stored in the terminal according to the terminal identity in the received picture processing request (Paragraphs [0213-0215] and [0270]).

As per claim 17, Das et al. teaches An apparatus for training a multi-layer neural network model (multiple layer neural network shown within FIG. 11A and Paragraph [0197]), comprising: a storage configured to store at least one network model (Paragraph [0190] - “…Convolution is a specialized kind of mathematical operation performed by two functions to produce a third function that is a modified version of one of the two original functions.  In convolutional network terminology, the first function to the convolution can be referred to as the input, while the second function can be referred to as the convolution kernel.  The output may be referred to as the feature map….” And Paragraphs [0192] and [0194]), the network model providing information for training a network model in other apparatus (Paragraphs [0197-0198] - “…The kernels associated with the convolutional layers perform convolution operations, the output of which is sent to the next layer.  The dimensionality reduction performed within the convolutional layers is one aspect that enables the CNN to scale to process large images. …”); one or more processors that are configured to set a downscaling layer for at least one layer of the network model stored in the storage (Paragraph [0162] - “…a downscaled value can be output and consumers of the downscale value can store the scale factor from the scale factor register in conjunction with the output value for use when performing subsequent operations using scaled values….”), an output module configured to output filter parameters of the downscaling layer as training information to the other apparatus (Paragraph [0191] - “…The nodes in the CNN input layer are organized into a set of "filters" (feature detectors inspired by the receptive fields found in the retina), and the output of each set of filters is propagated to nodes in successive layers of the network.  The computations for a CNN include applying the convolution mathematical operation to each filter to produce the output of that filter….”).
	Das et al. does not explicitly teach wherein the number of filters and the filter kernel of the downscaling layer are identical to those of the layers to be trained in the network model in the other apparatus;
	However within analogous art, ALOIMONOS et al. teaches wherein the number of filters and the filter kernel of the downscaling layer are identical to those of the layers to be trained in the network model in the other apparatus (The similarity of the network layers taught within Paragraph [0048] - “The first convolution layer can have 32 filters of size 5x5, the second convolution layer can have 32 filters of size 5x5…”);
	One of ordinary skill in the art would have been motivated to combine the teaching of ALOIMONOS et al. with the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. because the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. provides a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. within the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. in order to provide a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).


As per claim 23, Das et al. teaches An application method of a multi-layer neural network model ( multiple layer neural network shown within FIG. 11A and Paragraph [0197]) comprising: storing a second network model trained based on a training method which comprises: determining a first network model and the second network model( Paragraph [0190]- “…Convolution is a specialized kind of mathematical operation performed by two functions to produce a third function that is a modified version of one of the two original functions.  In convolutional network terminology, the first function to the convolution can be referred to as the input, while the second function can be referred to as the convolution kernel.  The output may be referred to as the feature map….” And Paragraphs [0192] and [0194]), the first network model providing information for training the second network model (Paragraphs [0197-0198] - “…The kernels associated with the convolutional layers perform convolution operations, the output of which is sent to the next layer.  The dimensionality reduction performed within the convolutional layers is one aspect that enables the CNN to scale to process large images. …”); setting a downscaling layer for at least one layer in the first network model( Paragraph [0162]- “…a downscaled value can be output and consumers of the downscale value can store the scale factor from the scale factor register in conjunction with the output value for use when performing subsequent operations using scaled values….”), 
 transmitting filter parameters of the downscaling layer to the second network model as training information (Paragraph [0191] - “…The nodes in the CNN input layer are organized into a set of "filters" (feature detectors inspired by the receptive fields found in the retina), and the output of each set of filters is propagated to nodes in successive layers of the network.  The computations for a CNN include applying the convolution mathematical operation to each filter to produce the output of that filter….”); receiving a data set corresponding to task requirements that can be executed by the stored second network model (Paragraphs [0176-0178]); computing the data set in each of layers from top to bottom in the stored second network model, and outputting the results (Paragraphs [0264] and [0294]).  
	Das et al. does not explicitly teach wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model;
	However within analogous art, ALOIMONOS et al. teaches wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model (The similarity of the network layers taught within Paragraph [0048] - “The first convolution layer can have 32 filters of size 5x5, the second convolution layer can have 32 filters of size 5x5…”);
	One of ordinary skill in the art would have been motivated to combine the teaching of ALOIMONOS et al. with the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. because the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. provides a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. within the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. in order to provide a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).

As per claim 24, the limitations of claim 24 are similar to the limitations of claim 23; therefore, the prior art of record applied to claim 23 also teaches the limitations of claim 24.

As per claim 25, Combination of Das et al. and ALOIMONOS et al. teach claim 24,
Das et al. teaches further comprising: a post-processing module configured to execute a post-processing on the results output by the processing module (post processing taught within Paragraphs [0245] and [0301]).  


As per claim 26, Das et al. teaches A non-transitory computer-readable storage medium storing instructions (non-transitory machine readable storage media taught within Paragraph [0317]) for causing a computer to perform a training method of a multi-layer neural network model according to a training method when executed by the computer (multiple layer neural network shown within FIG. 11A and Paragraph [0197]), the training method comprises:  
determining a first network model and a second network model (Paragraph [0190] - “…Convolution is a specialized kind of mathematical operation performed by two functions to produce a third function that is a modified version of one of the two original functions.  In convolutional network terminology, the first function to the convolution can be referred to as the input, while the second function can be referred to as the convolution kernel.  The output may be referred to as the feature map….” And Paragraphs [0192] and [0194]), the first network model providing information for training the second network model (Paragraphs [0197-0198] - “…The kernels associated with the convolutional layers perform convolution operations, the output of which is sent to the next layer.  The dimensionality reduction performed within the convolutional layers is one aspect that enables the CNN to scale to process large images. …”); setting a downscaling layer for at least one layer in the first network model (Paragraph [0162] - “…a downscaled value can be output and consumers of the downscale value can store the scale factor from the scale factor register in conjunction with the output value for use when performing subsequent operations using scaled values….”), transmitting filter parameters of the downscaling layer to the second network model as training information (Paragraph [0191] - “…The nodes in the CNN input layer are organized into a set of "filters" (feature detectors inspired by the receptive fields found in the retina), and the output of each set of filters is propagated to nodes in successive layers of the network.  The computations for a CNN include applying the convolution mathematical operation to each filter to produce the output of that filter….”).
	Das et al. does not explicitly teach wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model;
	However within analogous art, ALOIMONOS et al. teaches wherein the number of filters and filter kernel of the downscaling layer are identical to those of layers to be trained in the second network model (The similarity of the network layers taught within Paragraph [0048] - “The first convolution layer can have 32 filters of size 5x5, the second convolution layer can have 32 filters of size 5x5…”);
	One of ordinary skill in the art would have been motivated to combine the teaching of ALOIMONOS et al. with the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. because the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. provides a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. within the teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. in order to provide a system and method for processing a set of video images to obtain a collection of semantic entities with computer-implemented machine learning (Paragraph [0008]).

2.	Claims 2, 3, 8, 14, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Das et al. (USPUB 20180322390) in view of ALOIMONOS et al. (USPUB 20160221190) in further view of Jian Cheng (NPL doc: “Recent advances in efficient computation of deep convolutional neural networks”, Front Inform Technol Electron Eng 2018 19(1), Crosschecked Jan 26, 2018, Pages 64-72).



As per claim 2, Combination of  Das et al. and ALOIMONOS et al.  teach claim 1, 
Within analogous art, Jian Cheng teaches further comprising: dividing layers in the first network model into groups( page 66, col. 1- 3.3 –group-level pruning ) , wherein each group includes at least one layer and corresponds to one layer to be trained in the second network model(first and second network model taught within page 71 – col. 1 – 6-teacher-student network ) ; and wherein setting a downscaling layer for at least one layer in the first network model( page 67,-col. 1 ) , comprises: setting a downscaling layer for each group in the first network model respectively(page 66, col. 1 – vector-level and kernel-level prunings) , wherein the number of filters and the filter kernel of the downscaling layer set for the group are identical to those of the layers to be trained corresponding to the group( page 67- col. 1- 2- 3.4-filter-level pruning).
	One of ordinary skill in the art would have been motivated to combine the teaching of Jian Cheng with the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. because the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng provides a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng within the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. in order to provide a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).


As per claim 3, Combination of  Das et al. and ALOIMONOS et al.  and Jian Cheng teach claim 2, 
Within analogous art, Jian Cheng teaches further comprising: training each of layers of the second network model by using the filter parameters of each downscaling layer as the training information(Page 67, Col. 1- “…Filter-level pruning methods prune the convolutional filters or channels which make the deep networks thinner. After the filter pruning for one layer, the number of input channels of the next layer is also reduced. Thus, filter-level pruning is more efficient for accelerating deep networks. Luo et al.(2017) proposed a filter-level pruning method named ThiNet. They used the next layer’s feature map to guide the filter pruning in the current layer….”) , the output results of the first network model and the output results of the second network model(output of the teacher and student network taught within Page 71-Col. 1 -6.Teacher-student network) .  
	One of ordinary skill in the art would have been motivated to combine the teaching of Jian Cheng with the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. because the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng provides a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng within the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. in order to provide a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).
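
For illustration only, the effect described in the passage of Jian Cheng quoted for claim 3 (after filter pruning for one layer, the number of input channels of the next layer is also reduced) can be sketched on layer weight shapes. The function names prune_filters and weight_count, the shape convention (output channels, input channels, kernel height, kernel width), and the example network are assumptions made for this sketch; it is not the ThiNet method and does not select which filters to remove.

```python
def prune_filters(shapes, layer_index, keep):
    """Return new weight shapes after keeping `keep` filters in one layer.

    `shapes` is a list of (out_channels, in_channels, kernel_h, kernel_w)
    tuples, one per convolution layer, in network order (assumed convention).
    """
    out_c, in_c, k_h, k_w = shapes[layer_index]
    if not 0 < keep <= out_c:
        raise ValueError("keep must be between 1 and the current filter count")

    pruned = list(shapes)
    # Fewer filters in the pruned layer means fewer output channels.
    pruned[layer_index] = (keep, in_c, k_h, k_w)

    # The next layer consumes those channels, so its input depth shrinks too.
    if layer_index + 1 < len(pruned):
        n_out, _, n_h, n_w = pruned[layer_index + 1]
        pruned[layer_index + 1] = (n_out, keep, n_h, n_w)
    return pruned


def weight_count(shapes):
    """Total number of convolution weights across all layers."""
    return sum(o * i * h * w for o, i, h, w in shapes)


# Hypothetical two-layer network with 32 filters of size 5x5 per layer.
original = [(32, 3, 5, 5), (32, 32, 5, 5)]
pruned = prune_filters(original, layer_index=0, keep=16)
```

In this hypothetical example, halving the filters of the first layer also halves the input depth of the second layer, which is why the quoted passage describes filter-level pruning as making the network thinner.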

As per claim 8,  Combination of  Das et al. and ALOIMONOS et al.  and Jian Cheng teach claim 2, 
Das et al. teaches wherein layers of which the type is the same as that of the corresponding layer to be trained is included in the group( training of the layers of neural network taught within Paragraphs  [0205] and [0213]) .

As per claim 14, Combination of  Das et al. and ALOIMONOS et al. teach claim 13,
Within analogous art, Jian Cheng teaches wherein the server further outputs the output results of the first network model as training information ( CPU and GPU which are elements of servers for training within CNN network model taught within Page 65, Col. 1 – 2.Background-“…For the CNN training phase, the computational complexity is not a critical problem, thanks to the high-performance GPUs or CPU clouds….”) ;  the terminal trains each of layers to be trained of the second network model by using the filter parameters of the downscaling layer (Page 67, Col. 1- “…Filter-level pruning methods prune the convolutional filters or channels which make the deep networks thinner. After the filter pruning for one layer, the number of input channels of the next layer is also reduced. Thus, filter-level pruning is more efficient for accelerating deep networks. Luo et al.(2017) proposed a filter-level pruning method named ThiNet. They used the next layer’s feature map to guide the filter pruning in the current layer….”), the output results of the first network model and the output results of the second network model(output of the teacher and student network taught within Page 71-Col. 1 -6.Teacher-student network) .  
	One of ordinary skill in the art would have been motivated to combine the teaching of Jian Cheng with the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. because the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng provides a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng within the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. in order to provide a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).


As per claim 18,  Combination of  Das et al. and ALOIMONOS et al. teach claim 17,
Within analogous art, Jian Cheng teaches further comprising: a grouping module configured to dividing the layers in the network model stored in the storage into groups( page 66, col. 1- 3.3 –group-level pruning ), wherein each group includes at least one layer and corresponds to one layer to be trained in the network model in the other apparatus(first and second network model taught within page 71 – col. 1 – 6-teacher-student network );  wherein the processors are further used to set a downscaling layer for each group(page 66, col. 1 – vector-level and kernel-level prunings), wherein the number of filters and the filter kernel of the downscaling layer set for the group are identical to those of the layers to be trained corresponding to the group( page 67- col. 1- 2- 3.4-filter-level pruning)  .
	One of ordinary skill in the art would have been motivated to combine the teaching of Jian Cheng with the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. because the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng provides a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the Recent advances in efficient computation of deep convolutional neural networks mentioned by Jian Cheng within the combined teaching of the Optimized compute hardware for machine learning operations mentioned by Das et al. and the Learning manipulation actions from unconstrained videos mentioned by ALOIMONOS et al. in order to provide a system and method for decreasing the computational complexity of the convolutional layers (Page 65, col. 2, lines 19-23).

As per claim 19,  Combination of  Das et al. and ALOIMONOS et al.  and Jian Cheng teach claim 18, 
Das et al. teaches wherein the output module is further used to output the output results of the network model stored in the storage as training information to the other apparatus (output circuitry and processing taught within Paragraphs [0063], [0203] and [0266-0267]).


3.	Claims 4, 11, 12, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Das et al. (USPUB 20180322390) in view of ALOIMONOS et al. (USPUB 20160221190), in further view of Jian Cheng (NPL doc: “Recent advances in efficient computation of deep convolutional neural networks”, Front Inform Technol Electron Eng 2018 19(1), Crosschecked Jan 26, 2018, Pages 64-72) and Wang et al. (USPUB 20170345130).

As per claim 4, the combination of Das et al., ALOIMONOS et al., and Jian Cheng teaches claim 2.
Within analogous art, Wang et al. teaches wherein the downscaling layer sequentially includes a basis matrix layer and an identity mapping layer (mapping layers taught within Paragraphs [0219-0220] and [0342]), wherein the number of filters and the filter kernel of the basis matrix layer are identical to those of the corresponding layer to be trained and size of an output feature map of the identity mapping layer is identical to that of the last layer in the group (Paragraphs [0454-0455] and feedforwarding of the neural network taught within Paragraphs [0558-0560]).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wang et al. with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), and Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks) because Wang et al. (Enhancing Visual Data Using And Augmenting Model Libraries) provides a system and method for implementing a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Wang et al. within the combined teachings of Das et al., ALOIMONOS et al., and Jian Cheng in order to provide a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).


As per claim 11, the combination of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. teaches claim 4.
Within analogous art, Wang et al. teaches wherein the last layer in the group and the identity mapping layer have the same output precision (mapping layers taught within Paragraphs [0219-0220], [0425], and [0428]), and the basis matrix layer and the layer to be trained corresponding to the group have the same output precision (Paragraph [00520]: “…reproduced as a set of representations of the frame such as a matrix of vectors with generalized information for portions of the frame or groups of one or more pixels….” and Paragraph [0560]: “…the library can therefore be thought of as a "matrix" of example based models, with its rows corresponding to the scene content type, and its columns corresponding to the image artefact severity.”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wang et al. with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), and Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks) because Wang et al. (Enhancing Visual Data Using And Augmenting Model Libraries) provides a system and method for implementing a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Wang et al. within the combined teachings of Das et al., ALOIMONOS et al., and Jian Cheng in order to provide a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).

As per claim 12, the combination of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. teaches claim 11.
Within analogous art, Wang et al. teaches wherein the output precision is equal to or less than 32 bits (Paragraph [0093]: “…a reduction in pixel data precision (e.g. from 32-bit to 16-bit) and quantization of visual data are methods for producing lower-quality visual data from higher-quality visual data…”).

As per claim 20, the combination of Das et al., ALOIMONOS et al., and Jian Cheng teaches claim 18.
Within analogous art, Wang et al. teaches wherein the downscaling layer sequentially includes a basis matrix layer and an identity mapping layer (mapping layers taught within Paragraphs [0219-0220] and [0342]), wherein the number of filters and the filter kernel of the basis matrix layer are identical to those of the corresponding layer to be trained and size of an output feature map of the identity mapping layer is identical to that of the last layer in the group (Paragraphs [0454-0455] and feedforwarding of the neural network taught within Paragraphs [0558-0560]).
	One of ordinary skill in the art would have been motivated to combine the teaching of Wang et al. with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), and Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks) because Wang et al. (Enhancing Visual Data Using And Augmenting Model Libraries) provides a system and method for implementing a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Wang et al. within the combined teachings of Das et al., ALOIMONOS et al., and Jian Cheng in order to provide a hierarchical algorithm developed during the down-sampling process (Paragraph [0010]).


4.	Claims 5 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Das et al. (USPUB 20180322390) in view of ALOIMONOS et al. (USPUB 20160221190), in further view of Jian Cheng (NPL doc: “Recent advances in efficient computation of deep convolutional neural networks”, Front Inform Technol Electron Eng 2018 19(1), Crosschecked Jan 26, 2018, Pages 64-72), Wang et al. (USPUB 20170345130), and Junho Yim (NPL doc: “Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, Pages 4134-4139).


As per claim 5, the combination of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. teaches claim 4.
Within analogous art, Junho Yim teaches wherein the similarity between information saved in the output feature map of the basis matrix layer and information saved in the output feature map of the identity mapping layer is higher than a threshold (Page 4134, col. 1: “…The key difference between the Gramian matrix in [6] and ours is that we compute the Gramian matrix across layers, whereas the Gramian matrix in [6] computes the inner products between features within a layer. Figure 1 shows the concept diagram of our proposed method of transferring distilled knowledge. The extracted feature maps from two layers are used to generate the flow of solution procedure (FSP) matrix. The student DNN is trained to make its FSP matrix similar to that of the teacher DNN….”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Junho Yim with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks), and Wang et al. (Enhancing Visual Data Using And Augmenting Model Libraries) because Junho Yim (Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning) provides a system and method in which a pretrained deep neural network (DNN) is refined and transferred to another DNN (Abstract).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Junho Yim within the combined teachings of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. in order to provide a system and method in which a pretrained deep neural network (DNN) is refined and transferred to another DNN (Abstract).


As per claim 21, the combination of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. teaches claim 20.
Within analogous art, Junho Yim teaches wherein the similarity between information saved in the output feature map of the basis matrix layer and information saved in the output feature map of the identity mapping layer is higher than a threshold (Page 4134, col. 1: “…The key difference between the Gramian matrix in [6] and ours is that we compute the Gramian matrix across layers, whereas the Gramian matrix in [6] computes the inner products between features within a layer. Figure 1 shows the concept diagram of our proposed method of transferring distilled knowledge. The extracted feature maps from two layers are used to generate the flow of solution procedure (FSP) matrix. The student DNN is trained to make its FSP matrix similar to that of the teacher DNN….”).
	One of ordinary skill in the art would have been motivated to combine the teaching of Junho Yim with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks), and Wang et al. (Enhancing Visual Data Using And Augmenting Model Libraries) because Junho Yim (Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning) provides a system and method in which a pretrained deep neural network (DNN) is refined and transferred to another DNN (Abstract).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Junho Yim within the combined teachings of Das et al., ALOIMONOS et al., Jian Cheng, and Wang et al. in order to provide a system and method in which a pretrained deep neural network (DNN) is refined and transferred to another DNN (Abstract).



5.	Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Das et al. (USPUB 20180322390) in view of ALOIMONOS et al. (USPUB 20160221190), in further view of Jian Cheng (NPL doc: “Recent advances in efficient computation of deep convolutional neural networks”, Front Inform Technol Electron Eng 2018 19(1), Crosschecked Jan 26, 2018, Pages 64-72) and Liu et al. (USPUB 20200012940).

  
As per claim 9, the combination of Das et al., ALOIMONOS et al., and Jian Cheng teaches claim 2.
Within analogous art, Liu et al. teaches wherein a standardization layer is included in the group (Paragraph [0057]: “As shown in Table 1, the convolutional neural network may include several convolutional layers as well as down-convolutions as alternatives to max-pooling layers.  Rectified Linear Units may be used as activation functions and Batch Normalization may be used for regularization.  No further techniques may be used for regularization since the neural network can be trained end to end using widely available video data,...” and Paragraph [0150]).
	One of ordinary skill in the art would have been motivated to combine the teaching of Liu et al. with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), and Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks) because Liu et al. (Frame interpolation via adaptive convolution and adaptive separable convolution) provides a system and method for implementing frame interpolation via adaptive convolution (Paragraph [0003]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Liu et al. within the combined teachings of Das et al., ALOIMONOS et al., and Jian Cheng in order to provide frame interpolation via adaptive convolution (Paragraph [0003]).

As per claim 10, the combination of Das et al., ALOIMONOS et al., and Jian Cheng teaches claim 2.
Within analogous art, Liu et al. teaches wherein the number of layers contained in the group is determined according to the depth of the first network model (Paragraphs [0038-0039]).
	One of ordinary skill in the art would have been motivated to combine the teaching of Liu et al. with the combined teachings of Das et al. (Optimized compute hardware for machine learning operations), ALOIMONOS et al. (Learning manipulation actions from unconstrained videos), and Jian Cheng (Recent advances in efficient computation of deep convolutional neural networks) because Liu et al. (Frame interpolation via adaptive convolution and adaptive separable convolution) provides a system and method for implementing frame interpolation via adaptive convolution (Paragraph [0003]).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teaching of Liu et al. within the combined teachings of Das et al., ALOIMONOS et al., and Jian Cheng in order to provide frame interpolation via adaptive convolution (Paragraph [0003]).



It is noted that any citations to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.


Allowable Subject Matter

6.          Claims 6, 7, 16, and 22 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

7.         The following is an examiner’s statement of reasons for the indication of allowable subject matter:

As to claim 6, the prior art of record does not teach or suggest the limitation recited in claim 6: “training the basis matrix layer and the identity mapping layer, such that the residual error between the output feature maps in a set of output feature maps of the last layer in the group and the output feature maps in a set of output feature maps of the identity mapping layer is less than a set value when the input feature map of the first layer in the group is identical to the input feature map of the basis matrix layer.”

As to claim 7, claim 7 depends on claim 6, which is objected to but contains allowable subject matter; therefore, claim 7 is likewise objected to as containing allowable subject matter.

As to claim 16, the prior art of record does not teach or suggest the limitation recited in claim 16: “wherein the server further divides layers in the first network model into groups, wherein each group includes at least one layer and corresponds to one layer to be trained in the second network model, and sets a downscaling layer for each group, wherein the number of filters and the filter kernel of the downscaling layer set for the group are identical to those of the layers to be trained corresponding to the group.”

As to claim 22, the prior art of record does not teach or suggest the limitation recited in claim 22: “…an internal training module configured to train the basis matrix layer and the identity mapping layer, such that the residual error between the output feature map of the last layer in the group and the output feature map of the identity mapping layer is less than a set value when the input feature map of the first layer in the group is identical to the input feature map of the basis matrix layer.”


Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Examiner’s Notes

8.	The Examiner acknowledges the following prior art as pertinent to the claim limitations and inventive concept of the current application. Although the references listed below were not relied upon to address the claim limitations, they are analogous art addressing key points of the inventive concept (teacher/student models within neural networks, deep neural networks (DNN), multiple layers and groups, and filtering and kernel filters within image processing, etc.).

1) 	USPUB- 20190378006	
2)	USPUB- 20190205748
3)	USPUB- 20200110982
4) 	USPUB- 20190122077
5) 	USPUB- 20180060649
6) 	USPAT- 10832133

7)	Yu Cheng, "Model Compression and Acceleration for Deep Neural Networks," IEEE Signal Processing Magazine, 9th January 2018, Pages 127-134.

8) 	Yoon Kim, "Sequence-Level Knowledge Distillation," arXiv:1606.07947v4, 22 Sep 2016, Pages 1-6.

9)	Doyeob Yeo, "SEQUENTIAL KNOWLEDGE TRANSFER IN TEACHER–STUDENT FRAMEWORK USING DENSELY DISTILLED FLOW-BASED INFORMATION," 2018 25th IEEE International Conference on Image Processing (ICIP), 06 September 2018, Pages 674-677.


Conclusion

9. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to OMAR S. ISMAIL whose telephone number is (571)272-9799 and Fax # (571)273-9799. The examiner can normally be reached on M-F: 9:00 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http:/ If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, David C. Payne can be reached on (571)272-3024. The fax phone number for the organization where this application or proceeding is assigned is (571)273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/OMAR S ISMAIL/Examiner, Art Unit 2637