DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This communication is responsive to the amendment filed 11/03/2022.
Claims 11, 12, 14, 17, 18, and 20 have been amended, and claims 21-25 have been added.
Claims 11-25 are pending, with claims 11 and 20 as independent claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 11-25 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US 2018/0130177, filed 02/19/2016, hereinafter Wang) in view of Zhang et al. (US 2016/0379352, filed 11/03/2015, hereinafter Zhang).

Claim 11: A method comprising:
a) implementing a convolutional neural network in a processing circuit, the convolutional neural network configured to 
receive an input data structure comprising a group of values corresponding to signal samples and to generate a corresponding classification output indicative of a selected one among a plurality of predefined classes, (Wang discloses in [0377-0384] “The original video data 70 is provided as input visual data… The original video data 70 is then split into single full-resolution frames at step 80 (or step 190), i.e. into a sequence of images at the full resolution and/or quality of the original video data 70… the full-resolution frames can be grouped into scenes or sections of frames having common features… use a range of predefined scene classes and then automatically classifying scenes using the predefined classes.” EX.: video data may be received as input data structure and the input video data may be processed to scenes that can be classified into predefined classes)
wherein the convolutional neural network comprises an ordered sequence of layers, each layer of the sequence configured to receive a corresponding layer input data structure comprising a group of input values, and generate a corresponding layer output data structure comprising a group of output values by convolving the layer input data structure with at least one corresponding filter comprising a corresponding group of weights, the layer input data structure of the first layer of the sequence corresponding to the input data structure, and the layer input data structure of a generic layer of the sequence different from the first layer corresponding to the layer output data structure generated by the previous layer in the sequence; (Wang discloses in [0216, 0371, 0420-0421 and 0435] “the hierarchical algorithm comprises a plurality of connected layers, and the connected layers may be sequential… convolutional neural network models (or hierarchical algorithms) can be transmitted along with the low-resolution frames of video data because the convolutional neural network models reduce the data transmitted in comparison with learned dictionaries being transmitting along with the low-resolution images… sub-pixel convolution layer, produces a high resolution image from the low resolution feature maps directly, with a more distinguishable filter for each feature map… convolution with a stride of 1/r in the low-resolution space is performed with a filter W.sub.d of size k.sub.s with a weight spacing 1/r to activate different parts of W.sub.d for the convolution. The weights that fall between the pixels are not activated and therefore not calculated… The first layer of the network is described as obtaining low resolution features whilst the high resolution features are only learnt in the last layer. It is not necessary to learn the low resolution features in a high resolution space. 
By keeping the resolution low for the first couple of layers of convolutions, the number of operations required can be reduced.” EX.: the first layer may receive input visual data. A second layer may downscale the input visual data. See fig. 22) and
b) training the convolutional neural network to update the weights of the filters of the layers by exploiting a training set of training input data structures belonging to known predefined classes, the training comprising:
b1) generating a modified convolutional neural network by downscaling, for at least one layer of the sequence of layers of the convolutional neural network, the at least one corresponding filter to obtain a downscaled filter comprising a reduced number of weights; (Wang discloses in [0417-0418 and 0512-0514] the steps 1010-1070 of fig. 10 represent model training and optimization process that down-samples or downscales parameters/weights of each image of the input 1010 and save the downscaled parameters/weights as one or more models in a library. See fig. 10)
b2) downscaling the training input data structures to obtain corresponding downscaled training input data structures comprising a reduced number of values; (Wang discloses in [0417-0418 and 0512-0514] “An initial model is selected based on a metric of the selected scene that can be compared to metrics associated with the reconstruction models stored in the library. The selection of the initial model may be based on only these metrics, or alternatively multiple initial models may be applied independently to the downsampled video to produce an enhanced lower-resolution frame or recreate a frame, the quality of the enhanced or recreated frame being compared with the original scene to select the most appropriate initial model from the group…The lower-resolution frame may be 33% to 50% of the data size relative to the data size of the original-resolution frame, while the representations of the frame can be anything from 1% to 50% of the data size of the original-resolution frame.” EX. The downsampled data/frames/samples have reduced dimension values that are 33% to 50% of the data size of the original resolution frame)
b3) for each downscaled training input data structure of at least a subset of the training set, providing such downscaled training input data structure to the modified convolutional neural network to generate a corresponding classification output, and comparing the classification output with the predefined class to which the training input data structure corresponding to the downscaled training input data structure belongs; and b4) updating the weights of the filters of the layers based on the comparisons; (Wang discloses in [0417-0418, 0483, 0500-0507, and 0512-0514] “The quality of the recreated or enhanced scene is compared with the original using objective metrics such as error rate, PSNR and SSIM and/or subjective measures. An appropriate model is then selected at step 1040 based on these quality comparisons as well as whether to use solely a model, or a set of representations of the frame, or the lower-resolution frame. The library from which the models are selected comprises a set of pre-trained models which have been generated from example, or training, videos and which are associated with metrics to enable comparison of the video from which the models were generated with the selected scene being enhanced…Along with the reconstruction models, in some embodiments data needs to be stored relating to the example or training video for each reconstruction model in the library to enable each model to be matched to a scene that is being up-scaled… the data stored relating to the example or training video can be metadata or metrics related to the video data, or it can be samples or features of the example or training video.” EX.: The metrics/metadata represent training samples or predefined classes that an appropriately selected model would utilize to compare downscaled frames/data to the training samples or the predefined classes in order to classify or recreate high-resolution frames.
Thus, the super resolution techniques would output representations that can be used to enhance the high-resolution images created from the lower-resolution images or the downscaled images)
Wang discloses in [0589-0590] “the dimension of the full resolution images 2110 extracted may be reduced based on at least one predetermined factor at step 1810. In these embodiments, at step 1810, the at least one predetermined factor may be used to select individual pixels from the extracted full resolution images 2110 to form a lower resolution representation 2120 of the extracted full resolution images 2110… a predetermined factor of 2 may indicate every other pixel in both the horizontal and vertical dimensions is selected, as shown in FIG. 21… the reduced dimension visual data may be concatenated in step 1820, to form a lower resolution representation 2120 of the extracted full resolution images 2110.” EX.: if the original size of a frame resolution is 6×6 pixels and the predetermined factor is 2, then the frame may be downscaled to 3×3 pixels, i.e., 50% of the original in each dimension. See figs. 20 and 21.
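The pixel selection quoted above from Wang [0589-0590] can be illustrated with a minimal sketch. The helper name `subsample` and the toy 6×6 frame are assumptions made for illustration only; they do not appear in Wang, and the sketch is not asserted to be Wang's implementation.

```python
def subsample(frame, factor_rows=2, factor_cols=2):
    """Keep every factor-th pixel in each dimension (hypothetical helper)."""
    return [row[::factor_cols] for row in frame[::factor_rows]]

# Toy 6x6 frame; each pixel value encodes its position (row * 6 + col).
frame = [[r * 6 + c for c in range(6)] for r in range(6)]

# A predetermined factor of 2 selects every other pixel horizontally and vertically.
low_res = subsample(frame, 2, 2)

original_count = sum(len(row) for row in frame)
kept_count = sum(len(row) for row in low_res)
print(len(low_res), len(low_res[0]))   # rows and columns of the reduced frame
print(kept_count, original_count)      # pixel values kept vs. original
```

With a factor of 2 in both dimensions, each side of the frame is halved, so one quarter of the pixel values remain. Passing different values for `factor_rows` and `factor_cols` corresponds to the quoted alternative in which each dimension has its own predetermined factor.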
Wang does not explicitly disclose a reduced number of weights. However, Zhang, in an analogous art, discloses in ([0073] “A plurality of filters 211 are used to filter the input feature map 209. Each filter 211 is characterized by dimensions of k×k×D, where the additional variable, k, represents height and width of each filter 211… the optional pooling layer 216 is defined by parameters p and s, where p×p defined the region for the pooling operation, and s represents the stride for the filter 211.” EX.: the weights of filter 211 have been reduced from k×k dimension values to p×p dimension values due to the down-sampling process of the pooling layer. See fig. 3)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Wang with the teaching of Zhang to provide “a method for training a neural network to perform assessments of image quality.” See Zhang [0007]. 
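For illustration of limitation b1), the following minimal sketch shows one way a filter could be downscaled so that it comprises a reduced number of weights. The averaging scheme, the helper name `downscale_filter`, and the sample weights are assumptions; neither the claims nor the cited references are confirmed to use this particular scheme.

```python
def downscale_filter(weights, block=2):
    """Average non-overlapping block x block groups of weights.

    Assumed scheme for illustration only; the record does not fix a method.
    """
    size = len(weights)
    if size % block != 0:
        raise ValueError("filter size must be a multiple of the block size")
    reduced = []
    for r in range(0, size, block):
        row = []
        for c in range(0, size, block):
            cells = [weights[r + i][c + j]
                     for i in range(block) for j in range(block)]
            row.append(sum(cells) / len(cells))
        reduced.append(row)
    return reduced

# Hypothetical 4x4 filter (16 weights).
full_filter = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 7.0, 4.0, 6.0],
    [2.0, 2.0, 1.0, 1.0],
    [2.0, 2.0, 3.0, 3.0],
]
small_filter = downscale_filter(full_filter)  # 2x2 filter (4 weights)

full_count = sum(len(row) for row in full_filter)
small_count = sum(len(row) for row in small_filter)
print(small_filter, full_count, small_count)
```

Under these assumptions the 4×4 filter with 16 weights becomes a 2×2 filter with 4 weights, which is the kind of reduction in weight count that limitation b1) recites.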

Claim 12: The rejection of the method of claim 11 is incorporated, wherein the b) training the convolutional neural network further comprises:
b5) Wang does not explicitly disclose reiterating for a first number of times the sequence of b3) and b4); b6) generating a further modified convolutional neural network by upscaling the downscaled filters to obtain upscaled filters comprising an increased number of weights; b7) for each training input data structure of at least a subset of the training set, providing such training input data structure to the further modified convolutional neural network to generate a corresponding classification output, and comparing the classification output with the predefined class to which the training input data structure belongs; b8) updating the weights of the filters of the layers based on the comparisons. However, Zhang discloses in ([0079 and 0091-0093] “layer 1 includes 64 filters, where the kernel for each filter is 5×5. Each kernel is used to swipe each image and produce a response for the filter. This process is completed for each of the filters within the layer and a final representation is produced. The final representations are provided to at least one pooling layer. The process may be repeated any number of times that is deemed appropriate, with the final representation being fed to the comparative layer 401.” EX.: the loop in fig. 7 illustrates that comparator 500 classifies input data from image signal processing 701 as either adequate or inadequate. Weights of filters that produce adequate images may be stored in database 704, and weights of the filters that produce inadequate images may be adjusted/updated and returned to database 705 to be applied to new image input)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Wang with the teaching of Zhang to provide “a method for training a neural network to perform assessments of image quality.” See Zhang [0007].
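The ordering of steps b5) through b8), together with the second repetition count of claim 13, can be sketched as a two-phase schedule. Everything named below (`upscale_filter`, `two_phase_schedule`, `identity_step`, nearest-neighbour upscaling, and the counts 2 and 3) is a hypothetical placeholder chosen for illustration; the sketch shows only the claimed sequence, not a method taken from Wang or Zhang.

```python
def upscale_filter(weights, factor=2):
    """Nearest-neighbour upscaling: repeat each weight `factor` times per axis.

    Assumed scheme; the claim only requires an increased number of weights.
    """
    upscaled = []
    for row in weights:
        wide = [w for w in row for _ in range(factor)]
        upscaled.extend(list(wide) for _ in range(factor))
    return upscaled

def two_phase_schedule(small_filter, first_count, second_count, train_step):
    """Train at low resolution, upscale the filter, then train at full resolution."""
    log = []
    current = small_filter
    for _ in range(first_count):           # b5): repeat b3) and b4)
        current = train_step(current, "downscaled")
        log.append("downscaled")
    current = upscale_filter(current)      # b6): upscale the downscaled filter
    for _ in range(second_count):          # repeat b7) and b8), as in claim 13
        current = train_step(current, "full")
        log.append("full")
    return current, log

def identity_step(weights, resolution):
    """Placeholder: a real step would classify, compare with the class, and update."""
    return weights

final_filter, log = two_phase_schedule(
    [[1.0, 2.0], [3.0, 4.0]], first_count=2, second_count=3,
    train_step=identity_step)
print(final_filter)
print(log)
```

The example counts are chosen so that the second number of times differs from, and exceeds, the first, which is the relationship recited in claims 24 and 25.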

Claim 13: The rejection of the method of claim 12 is incorporated, wherein the b) training the convolutional neural network further comprises b9) Wang does not explicitly disclose reiterating for a second number of times the sequence of b7) and b8). However, Zhang discloses in ([0079 and 0091-0093] “The process may be repeated any number of times that is deemed appropriate, with the final representation being fed to the comparative layer 401. As a matter of convention, embodiments of neural networks that includes the comparative layer 401 are generally referred to herein as a “NRIQA neural network.””).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Wang with the teaching of Zhang to provide “a method for training a neural network to perform assessments of image quality.” See Zhang [0007].

Claim 14: The rejection of the method of claim 11 is incorporated, wherein:
the input data structure and each training input data structure comprise at least one respective data structure channel, each data structure channel comprising a corresponding matrix arrangement of a first number of values; (Wang discloses in [0415-0421] “both the low and high resolution image have C colour channels, thus can be represented as real-valued tensors of size H×W×C and rH×rW×C respectively… The final convolution filter W.sub.L is of size n.sub.L−1×r.sup.2C×k.sub.L×k.sub.L, where C is the number of colour channels in the original image and r the resampling ratio.”)
each filter of a layer comprises a set of filter channels, each filter channel of the set being associated with a corresponding data structure channel of the corresponding layer input data structure, each filter channel comprising a corresponding matrix arrangement of a first number of weights, (Wang discloses in [0415-0422] “convolution with a stride of 1/r in the low-resolution space is performed with a filter W.sub.d of size k.sub.s with a weight spacing 1/r to activate different parts of W.sub.d for the convolution.”)
the b2) downscaling the at least one filter of a layer to obtain a downscaled filter comprises generating a reduced matrix arrangement of weights comprising a second number of weights lower than the first number of weights; (Wang discloses in [0583-0590] “at step 1810, the at least one predetermined factor may be used to select individual pixels from the extracted full resolution images 2110 to form a lower resolution representation 2120 of the extracted full resolution images 2110. For example, in some embodiments, a predetermined factor of 2 may indicate every other pixel in both the horizontal and vertical dimensions is selected, as shown in FIG. 21. It will be appreciated that other values of the predetermined factor may be used in other embodiments, furthermore in some of these other embodiments both the horizontal and vertical dimensions may each have a different predetermined factor applied.” EX.: representation 2120 may be a reduced arrangement of weights comprising a second number of weights that is lower than the number of weights of the original data represented by matrix 2110. See fig. 21)
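The tensor and filter sizes quoted from Wang [0415-0421] in this rejection can be checked with simple arithmetic. The sketch below uses hypothetical values for H, W, C, r, n_(L-1), and k_L; only the size expressions H×W×C, rH×rW×C, and n_(L-1)×r²C×k_L×k_L come from the quoted passages.

```python
def tensor_size(height, width, channels):
    """Number of values in a height x width x channels tensor."""
    return height * width * channels

def final_filter_weights(n_prev, r, channels, k):
    """Number of weights in a filter of size n_prev x (r^2 * C) x k x k."""
    return n_prev * (r * r * channels) * k * k

# Hypothetical dimensions chosen only to make the arithmetic concrete.
H, W, C, r = 4, 6, 3, 2

low_res_values = tensor_size(H, W, C)            # H x W x C
high_res_values = tensor_size(r * H, r * W, C)   # rH x rW x C
ratio = high_res_values // low_res_values        # equals r * r

weights = final_filter_weights(n_prev=32, r=r, channels=C, k=3)
print(low_res_values, high_res_values, ratio, weights)
```

For a resampling ratio r, the high-resolution tensor holds r² times as many values per channel as the low-resolution tensor, which is the sense in which a "second number" of values is lower than a "first number".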

Claim 15: The rejection of the method of claim 14 is incorporated, wherein the b2) downscaling a training input data structure to obtain a corresponding downscaled training input data structure comprises generating for each data structure channel a reduced matrix arrangement of values comprising a second number of values lower than the first number of values; (rejected based on rationale used in rejection of claim 14)

Claim 16: The rejection of the method of claim 11 is incorporated, wherein the input data structure and the training input data structures are digital images comprising a plurality of pixels, each value of the group of values depending on a corresponding pixel of the plurality; (Wang discloses in [0387-0389] “each frame is down-sampled into lower resolution frames at a suitably lower resolution… By employing a deep learning approach to generating the model in embodiments, a non-linear hierarchical model can be created in some of these embodiments to reconstruct a higher-resolution frame from the lower-resolution frame.” EX.: the low-resolution frame and the high-resolution image correspond to low-quality image and high-quality image, respectively, which depend on the number of pixels).

Claim 17: The rejection of the method of claim 11 is incorporated, further comprising, after the convolutional neural network has been trained:
c) storing last updated weights in a weight database; (Wang discloses in [0092 and 0502] “the selection of the hierarchical algorithm from the library of learned hierarchical algorithms is determined by metric data associated with the lower-quality visual data.” EX.: updated weights are saved/stored as models in a library to be associated with the downscaled or lower-quality visual data. See fig. 9)
d) at a user device, sending an input data structure to a classification server; e) at the classification server, retrieving the last updated weights from the weight database and setting the convolutional neural network with the retrieved weights; f) at the classification server, providing the input data structure received from the user device to the convolutional neural network to obtain a corresponding classification output; g) at the classification network, sending the obtained classification output to the user device; (Wang discloses in [0080-0081 and 0532] “By lowering the quality of visual data (for example by lowering the resolution of video data) in some embodiments, less data can be sent across a network from a first node to a second node in order for the second node to display the visual data from the first node. In some embodiments, the lower quality visual data together with a model to be used for reconstruction can allow for less data to be transmitted than if the original higher-quality version of the same visual data is transmitted between nodes… the scene type can then be classified into a particular category depending on its content using a metric.” EX.: a high-resolution image may be downscaled to a lower-resolution image. Then, the lower-resolution image with parameters, e.g. weights/filters to aid constructing a high-resolution image from the lower-resolution image, may be sent to another node/computer/device that would apply the parameters/weights to reconstruct the low-resolution image to a higher-resolution image utilizing a classifier).

Claim 18: The rejection of the method of claim 11 is incorporated, wherein the convolutional neural network further comprises a further ordered sequence of fully-connected layers, each fully-connected layer of the further sequence being configured to receive a corresponding further layer input data structure comprising a group of further input values, and generate a corresponding further layer output data structure comprising a group of further output values, wherein each further output value of the further layer output data structure is a function of all the input values of the further layer input data structure; (Wang discloses in [0216] “the hierarchical algorithm comprises a plurality of connected layers, and the connected layers may be sequential.” See fig. 7).
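The claim 18 property that each further output value is a function of all the input values can be illustrated by a minimal sketch of two chained fully-connected layers. The helper name `fully_connected` and all numeric values are assumptions for illustration; the sketch is not drawn from Wang.

```python
def fully_connected(inputs, weight_rows, biases):
    """Each output is a weighted sum over ALL inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weight_rows, biases)]

# Hypothetical input values and weights.
layer_input = [1.0, 2.0, 3.0]
first_weights = [[1.0, 0.0, -1.0],
                 [0.5, 0.5, 0.5]]
first_biases = [0.0, 1.0]

# The first layer's output becomes the second layer's input (ordered sequence).
hidden = fully_connected(layer_input, first_weights, first_biases)
output = fully_connected(hidden, [[1.0, 1.0]], [0.0])
print(hidden, output)
```

Because every row of weights spans the whole input list, changing any single input value can change every output value, which distinguishes this arrangement from a convolution that only sees a local window.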

Claim 19: The rejection of the method of claim 11 is incorporated, wherein at least one layer of the sequence is followed by a corresponding still further layer, the still further layer being configured to generate a subsampled version of the layer output data structure generated by the at least one layer; (Wang discloses in [0216, 0235, and 0359] “the hierarchical algorithm comprises a plurality of connected layers, and the connected layers may be sequential… An example layered neural network is shown in FIG. 2a having three layers 10, 20, 30, each layer 10, 20, 30 formed of a plurality of neurons 25, but where no sparsity constraints have been applied so all neurons 25 in each layer 10, 20, 30 are networked to all neurons 25 in any neighbouring layers 10, 20, 30.” See fig. 7).
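A layer that generates a subsampled version of a layer output, as recited in claim 19, is commonly realized by pooling. The minimal sketch below assumes max pooling and borrows the parameter names p (region size) and s (stride) from the Zhang [0073] notation quoted in the rejection of claim 11; the helper name `max_pool` and the sample feature map are hypothetical.

```python
def max_pool(feature_map, p=2, s=2):
    """Take the maximum over each p x p region, moving by stride s."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - p + 1, s):
        pooled.append([
            max(feature_map[r + i][c + j]
                for i in range(p) for j in range(p))
            for c in range(0, cols - p + 1, s)
        ])
    return pooled

# Hypothetical 4x4 layer output.
layer_output = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
subsampled = max_pool(layer_output, p=2, s=2)
print(subsampled)
```

Note that pooling of this kind reduces the number of values in the layer output; it leaves the number of weights in the preceding filter unchanged.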

Claim 20: A convolutional neural network training system, comprising:
a training device configured to implement a convolutional neural network configured to receive an input data structure comprising a group of values corresponding to signal samples and to generate a corresponding classification output indicative of a selected one among a plurality of predefined classes, wherein the convolutional neural network comprises an ordered sequence of layers, each layer of the sequence configured to receive a corresponding layer input data structure comprising a group of input values, and generate a corresponding layer output data structure comprising a group of output values by convolving the layer input data structure with at least one corresponding filter comprising a corresponding group of weights, the layer input data structure of the first layer of the sequence corresponding to the input data structure, and the layer input data structure of a generic layer of the sequence different from the first layer corresponding to the layer output data structure generated by the previous layer in the sequence, (rejected based on rationale used in rejection of claim 11)
generate a modified convolutional neural network by downscaling, for at least one layer of the sequence of layers of the convolutional neural network, the at least one corresponding filter to obtain a downscaled filter comprising a reduced number of the weights; (rejected based on rationale used in rejection of claim 11)
a training database storing a training set of training input data structures belonging to known predefined classes, wherein the training device is further configured to downscale the training input data structures to obtain corresponding downscaled training input data structures comprising a reduced number of values; (rejected based on rationale used in rejection of claim 11)
a calculation device configured to provide, for each downscaled training input data structure of at least a subset of the training set, such downscaled training input data structure to the modified convolutional neural network to generate a corresponding classification output, and comparing the classification output with the predefined class to which the training input data structure corresponding to the downscaled training input data structure belongs; (rejected based on rationale used in rejection of claim 11) and
a weight database adapted to store the weights of the filters of the layers, the training device being further configured to update the weights of the filters of the layers stored in the weight database based on the comparisons; (rejected based on rationale used in rejection of claim 11).

Claim 21: The rejection of the method of claim 11 is incorporated, further wherein the at least one corresponding filter is downscaled based upon a scaling factor; (Wang discloses in [0589-0590] “the dimension of the full resolution images 2110 extracted may be reduced based on at least one predetermined factor at step 1810… the at least one predetermined factor may be used to select individual pixels from the extracted full resolution images 2110 to form a lower resolution representation 2120 of the extracted full resolution images 2110… a predetermined factor of 2 may indicate every other pixel in both the horizontal and vertical dimensions is selected, as shown in FIG. 21. It will be appreciated that other values of the predetermined factor may be used in other embodiments, furthermore in some of these other embodiments both the horizontal and vertical dimensions may each have a different predetermined factor applied.” EX.: the predetermined factor may be a scaling factor. See fig. 21).

Claim 22: The rejection of the method of claim 11 is incorporated, further wherein the training input data structures are downscaled based upon a scaling factor; (rejected based on rationale used in rejection of claim 21).

Claim 23: The rejection of the method of claim 11 is incorporated, further wherein the at least one corresponding filter is downscaled and the training input data structures are downscaled based upon a common scaling factor; (rejected based on rationale used in rejection of claim 21).

Claim 24: The rejection of the method of claim 13 is incorporated, further wherein the second number of times is different from the first number of times; (Wang discloses in [0589-0590] “the dimension of the full resolution images 2110 extracted may be reduced based on at least one predetermined factor at step 1810… the at least one predetermined factor may be used to select individual pixels from the extracted full resolution images 2110 to form a lower resolution representation 2120 of the extracted full resolution images 2110… a predetermined factor of 2 may indicate every other pixel in both the horizontal and vertical dimensions is selected, as shown in FIG. 21. It will be appreciated that other values of the predetermined factor may be used in other embodiments, furthermore in some of these other embodiments both the horizontal and vertical dimensions may each have a different predetermined factor applied.” EX.: the first number of times may correspond to the number of weights in the representation 2110, which is double the number of weights in the downscaled frame representation, to which the second number of times corresponds. See fig. 21).

Claim 25: The rejection of the method of claim 24 is incorporated, further wherein the second number of times is greater than the first number of times; (rejected based on rationale used in rejection of claim 24).

Response to Arguments
Applicant's arguments filed 11/03/2022 have been fully considered but they are not persuasive.
Argument: applicant argues that “the applied references fail to disclose and would not have rendered obvious [A] generating a modified convolutional neural network by downscaling, for at least one layer of the sequence of layers of the convolutional neural network, the at least one corresponding filter to obtain a downscaled filter comprising a reduced number of weights; and [B] comparing the classification output with the predefined class to which the training input data structure corresponding to the downscaled training input data structure belongs, as recited by independent claim 11, and similarly recited by independent claim 20.”
Response: Wang discloses in [0500-0507] the generation of modified convolutional neural network models, using scaled-down frames to generate low-resolution frames, to be saved in a library. See fig. 9. The downscaled low-resolution frames and one or more selected pre-trained convolutional neural network models may be packaged and sent to a target node. See fig. 10. In [0513], Wang discloses that a reconstruction model from the library may be utilized to recreate high-resolution frames from the downscaled frames by comparing the generated frames to the original downscaled frames. In other words, the reconstruction model/classifier utilizes the original downscaled frames as reference frames. See fig. 11.
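The comparison against reference frames described in this response relies on quality metrics; the passages of Wang quoted in the rejection of claim 11 name error rate, PSNR, and SSIM. A minimal sketch of a PSNR-based selection follows. The helper names `mse`, `psnr`, and `pick_best`, the peak value of 255, and the sample frames are assumptions for illustration and are not asserted to be Wang's implementation.

```python
import math

def mse(reference, candidate):
    """Mean squared error between two equally sized frames."""
    pairs = [(x, y) for ref_row, cand_row in zip(reference, candidate)
             for x, y in zip(ref_row, cand_row)]
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)

def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    error = mse(reference, candidate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)

def pick_best(reference, candidates):
    """Return the name of the candidate frame with the highest PSNR."""
    return max(candidates, key=lambda name: psnr(reference, candidates[name]))

# Hypothetical reference frame and two reconstructions.
reference = [[10, 20], [30, 40]]
candidates = {
    "model_a": [[10, 20], [30, 44]],   # one pixel off by 4
    "model_b": [[14, 24], [34, 44]],   # every pixel off by 4
}
best = pick_best(reference, candidates)
print(best)
```

A higher PSNR indicates a reconstruction closer to the reference, so the model whose output deviates least from the reference frame is the one selected.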

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See form 892.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHAMED I NAZAR whose telephone number is (571)270-3174. The examiner can normally be reached 10 am to 7 pm Mon-Fri.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong can be reached on 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AHAMED I NAZAR/Examiner, Art Unit 2178    11/30/2022

/SHAHID K KHAN/Examiner, Art Unit 2178