DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Original claims 1-13, filed October 11, 2019, are pending in the instant application.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on October 11, 2019; March 3, 2020; April 9, 2020; August 25, 2020; March 10, 2021; and September 3, 2021, are being considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-13 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “deep” in claims 1-13 is a relative term which renders the claims indefinite.  The term “deep” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The claims recite a “deep neural network (DNN)”.  Within the context of neural networks, “depth” generally refers to the number of layers.  A deeper network has more layers.  However, neither the claims nor the specification provides a standard for determining which neural networks qualify as “deep” and which do not.  Is a neural network with 3 layers “deep”?  What about neural networks with 10 layers?  Or 100 layers?  This ambiguity renders the scope of the claims indefinite because it would be unclear to one of ordinary skill in the art whether a given neural network is included in the scope of the claimed “deep neural network”.
For purposes of examination with respect to the prior art, the term “deep neural network” is construed to refer to a neural network with any number of layers.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-3, 5-8, and 10-12 is/are rejected under 35 U.S.C. 103 as being unpatentable over ‘Kim’ (“Dynamic frame resizing with convolutional neural network for efficient video compression,” 2017) in view of ‘Leng’ (“Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM,” 2017).
Regarding claim 1, Kim teaches an artificial intelligence (AI) decoding apparatus (e.g. Figure 5, decoder) comprising:
a memory storing one or more instructions (see Note Regarding Computer below); and
a processor configured to execute the stored one or more instructions to (see Note Regarding Computer below):
obtain image data corresponding to a first image (Figure 5, bitstream is received at decoder) that is downscaled from an original image by using first parameters of a first filter kernel comprised in a first deep neural network (DNN) (Figure 5, bitstream is created by encoder, which downscales an input video using DSCNN; Section 4.1, Figure 6, DSCNN has three convolution layers, with 9x9, 5x5, and 3x3 filter kernels whose parameters are learned through training – see Sections 4.2-4.3);
reconstruct a second image corresponding to the first image, based on the obtained image data (Figure 5, decoding procedure accepts obtained bitstream of image data as input and outputs a reconstructed second image); and
obtain a third image that is upscaled from the reconstructed second image (Figure 5, upscaled – i.e. up-sampled – video image is output from USCNN), by performing an operation between the reconstructed second image and second parameters of a second filter kernel comprised in a second DNN (Section 4.1, Figure 6, USCNN performs upscaling using a series of convolution operations using parameterized filter kernels of various sizes) corresponding to the first DNN (The DSCNN and USCNN correspond to one another at least because they are part of the same compression method and the USCNN reverses the downscaling performed by the DSCNN),
wherein each of the second parameters is represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer (see Note Regarding Quantization below).


Note Regarding Computer.  Kim teaches various computational algorithms (e.g. Figure 5), and presents results from implementing them (Section 5), but does not explicitly describe the hardware used to implement them.  Accordingly, Kim does not explicitly teach that an algorithm is implemented as an apparatus comprising: a memory storing one or more instructions; and a processor configured to execute the stored one or more instructions to: perform the algorithm.
However, Examiner takes Official Notice that it is old and well known in the art of image analysis to implement a computational algorithm as an apparatus, such as a computer, comprising: a memory storing one or more instructions (and any parameters required to execute the instructions); and a processor configured to execute the stored one or more instructions to: perform the algorithm.  This advantageously allows the algorithm to be performed quickly and efficiently.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to implement the algorithm of Kim as an apparatus comprising: a memory storing one or more instructions (and any parameters required to execute the instructions); and a processor configured to execute the stored one or more instructions to: perform the algorithm in order to advantageously allow the algorithm to be performed quickly and efficiently.

Note Regarding Quantization.  Kim does not teach that its neural networks are quantized.  In particular, Kim does not teach that each of the second parameters is represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer.
However, Leng does teach techniques for quantizing a neural network (Section 3), where weight parameters of a filter kernel are represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer (Section 3.1, weights of a given layer $i$, such as a convolutional filter kernel layer, are selected from $C_i = \{0, \pm\alpha_i, \pm 2\alpha_i, \ldots, \pm 2^{N}\alpha_i\}$; these values are products of scale factor $\alpha_i$ and integers that are zero or powers of two, i.e. $\pm 2^{n}$, where n is an integer).
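As an illustration only, the weight representation attributed to Leng above can be sketched in a few lines of Python. The function name `quantize_to_power_of_two`, the nearest-level rounding rule, the non-negative exponent range, and the sample values are all assumptions made for this sketch; it is not Leng's ADMM-based training procedure and is not drawn from Kim or the instant application.

```python
import math

def quantize_to_power_of_two(weights, alpha, max_exp):
    """Snap each weight to the nearest value in {0, +/-alpha * 2**n : 0 <= n <= max_exp}.

    Returns (integers, alpha), where every integer is 0 or +/-2**n.
    Nearest-level rounding is an assumption of this sketch.
    """
    levels = [0] + [2 ** n for n in range(max_exp + 1)]
    integers = []
    for w in weights:
        magnitude = abs(w) / alpha
        # Pick the allowed magnitude closest to |w| / alpha.
        nearest = min(levels, key=lambda q: abs(magnitude - q))
        integers.append(int(math.copysign(nearest, w)) if nearest else 0)
    return integers, alpha

# Hypothetical full-precision weights and scale factor.
ints, alpha = quantize_to_power_of_two([0.05, -0.26, 0.9, 0.0, -0.5], 0.25, 2)
reconstructed = [q * alpha for q in ints]  # each weight = scale factor * integer
```

Each reconstructed weight is the product of the single scale factor and an integer that is 0 or a signed power of two, which is the form recited in the claim limitation.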
Leng teaches that convolutional neural networks (CNNs) have had success in a wide range of computer vision tasks, but disadvantageously require high computational and storage costs, which have become impediments to their popularization (Section 1, first paragraph).  Leng further teaches that a desire to deploy deep learning systems to low-end devices motivates research in compressing deep models to have smaller computation cost and memory footprints (Section 1, first paragraph), that compressing a neural network model by restricting its weights to low precision with a few bits advantageously reduces computation cost and memory footprint (Page 2, first paragraph), and that its techniques focus on such compression and acceleration using extremely low-bit weights (Page 2, second paragraph).
Leng further teaches that its quantization techniques provide better performance with lower memory requirements than prior un-quantized, “old fashion” neural network models (Table 1, Section 4.1.2, second paragraph, quantized network performs better than VGG-16 benchmark, while using only 3 bits for weights rather than full precision) and provide better performance than prior state-of-the-art quantized networks (Table 1, Section 4.1.2, last paragraph).

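Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the neural networks of Kim with the quantization of Leng in order to improve the neural networks with the reasonable expectation that this would result in neural networks that provided high performance with advantageously lower computation cost and memory footprint.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim and Leng to obtain the invention as specified in claim 1.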

Regarding claim 2, Kim in view of Leng teaches the AI decoding apparatus of claim 1, and Kim further teaches that the second DNN is trained in connection with the first DNN and trained based on a training image that is obtained by training the first DNN (e.g. Figure 7, second/USCNN is trained in connection with first/DSCNN based on down-sampled output training image that is obtained by training the first/DSCNN).

Regarding claim 3, Kim in view of Leng teaches the AI decoding apparatus of claim 1, and Leng further teaches that 
a first parameter matrix representing the second parameters is represented by a product of the scale factor and a second parameter matrix comprising the integer values (Section 3.1, weight matrix of an ith layer $W_i$ is product of scale factor $\alpha_i$ and matrix of integer values selected from $C_i$; also see sentences below Equation 3, scale factor is multiplied after convolution with integer weight filter kernel matrix – i.e. second parameter matrix – has occurred, which indicates that they are kept separately),
the memory stores the scale factor and the second parameter matrix (see Note Regarding Computer given above with respect to claim 1), and
the processor is further configured to execute the stored one or more instructions to (see Note Regarding Computer given above with respect to claim 1) obtain the third image by performing a convolution operation between the reconstructed second image and the second parameter matrix and then multiplying a result of the performed convolution operation by the scale factor (Page 5, below Equation 3, scale factor is multiplied by result of convolution with integer values).
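The ordering mapped above (convolve with the integer matrix first, then multiply the result by the scale factor) can be illustrated with a minimal pure-Python sketch. The helper names `conv2d_valid` and `scaled_conv`, the 3x3 input, the 2x2 integer kernel, and the scale factor value are hypothetical and chosen only so the arithmetic is easy to check; they do not come from Kim, Leng, or the instant application.

```python
def conv2d_valid(image, kernel):
    """Plain 2-D 'valid' cross-correlation on nested lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def scaled_conv(image, int_kernel, alpha):
    """Convolve with the integer kernel, then apply the scale factor once per output."""
    return [[alpha * v for v in row] for row in conv2d_valid(image, int_kernel)]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical pixel values
int_kernel = [[1, 0], [-2, 4]]              # entries are 0 or +/-2**n
alpha = 0.5                                 # hypothetical scale factor

# Same result as convolving with the full-precision kernel alpha * int_kernel.
full_kernel = [[alpha * q for q in row] for row in int_kernel]
assert scaled_conv(image, int_kernel, alpha) == conv2d_valid(image, full_kernel)
```

Because convolution is linear, factoring the scale out gives the same output while deferring the only non-integer multiplication to one step per output element, which is consistent with the scale factor and the integer matrix being stored separately.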

Regarding claim 5, Kim teaches an artificial intelligence (AI) encoding apparatus (e.g. Figure 5, encoder) comprising:
a memory storing one or more instructions (see Note Regarding Computer below); and
a processor configured to execute the stored one or more instructions to (see Note Regarding Computer below):
obtain a first image that is downscaled from an original image (Figure 5, input/original image is downscaled by DSCNN to output first image), by performing an operation between the original image and first parameters of a filter kernel comprised in a first deep neural network (Figure 5, DSCNN performs the downscaling; Figure 6, Section 4.1, first conv layer of DSCNN performs convolution operation between original/input image and 9x9 filter kernel whose parameters have been set via training); and
encode the obtained first image (Figure 5, encoding procedure),
wherein each of the first parameters is represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer (see Note Regarding Quantization below), and
wherein the first DNN corresponds to a second DNN (Figure 5, first/DSCNN corresponds to second/USCNN) comprising a second filter kernel of which second parameters (Section 4.1, Figure 6, USCNN includes several conv layers with convolutional filter kernels of various sizes whose parameters have been set by training) are used to upscale a second image corresponding to the first image (Figure 5, USCNN upscales a second image, which is the first image after it has been encoded and decoded).

Note Regarding Computer.  Kim teaches various computational algorithms (e.g. Figure 5), and presents results from implementing them (Section 5), but does not explicitly describe the hardware used to implement them.  Accordingly, Kim does not explicitly teach that an algorithm is implemented as an apparatus comprising: a memory storing one or more instructions; and a processor configured to execute the stored one or more instructions to: perform the algorithm.
However, Examiner takes Official Notice that it is old and well known in the art of image analysis to implement a computational algorithm as an apparatus, such as a computer, comprising: a memory storing one or more instructions (and any parameters required to execute the instructions); and a processor configured to execute the stored one or more instructions to: perform the algorithm.  This advantageously allows the algorithm to be performed quickly and efficiently.

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to implement the algorithm of Kim as an apparatus comprising: a memory storing one or more instructions (and any parameters required to execute the instructions); and a processor configured to execute the stored one or more instructions to: perform the algorithm in order to advantageously allow the algorithm to be performed quickly and efficiently.

Note Regarding Quantization.  Kim does not teach that its neural networks are quantized.  In particular, Kim does not teach that each of the first parameters is represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer.
However, Leng does teach techniques for quantizing a neural network (Section 3), where weight parameters of a filter kernel are represented by a product of a scale factor and one among integer values, and each of the integer values is 0 or $\pm 2^{n}$, where n is an integer (Section 3.1, weights of a given layer $i$, such as a convolutional filter kernel layer, are selected from $C_i = \{0, \pm\alpha_i, \pm 2\alpha_i, \ldots, \pm 2^{N}\alpha_i\}$; these values are products of scale factor $\alpha_i$ and integers that are zero or powers of two, i.e. $\pm 2^{n}$, where n is an integer).
Leng teaches that convolutional neural networks (CNNs) have had success in a wide range of computer vision tasks, but disadvantageously require high computational and storage costs, which have become impediments to their popularization (Section 1, first paragraph).  Leng further teaches that a desire to deploy deep learning systems to low-end devices motivates research in compressing deep models to have smaller computation cost and memory footprints (Section 1, first paragraph), that compressing a neural network model by restricting its weights to low precision with a few bits advantageously reduces computation cost and memory footprint (Page 2, first paragraph), and that its techniques focus on such compression and acceleration using extremely low-bit weights (Page 2, second paragraph).
Leng further teaches that its quantization techniques provide better performance with lower memory requirements than prior un-quantized, “old fashion” neural network models (Table 1, Section 4.1.2, second paragraph, quantized network performs better than VGG-16 benchmark, while using only 3 bits for weights rather than full precision) and provide better performance than prior state-of-the-art quantized networks (Table 1, Section 4.1.2, last paragraph).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the neural networks of Kim with the quantization of Leng in order to improve the neural networks with the reasonable expectation that this would result in neural networks that provided high performance with advantageously lower computation cost and memory footprint.  This technique for improving the neural networks of Kim was within the ordinary ability of one of ordinary skill in the art based on the teachings of Leng.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim and Leng to obtain the invention as specified in claim 5.

Regarding claim 6, Kim in view of Leng teaches the AI encoding apparatus of claim 5, and Kim further teaches that the first DNN is trained in connection with the second DNN (Figure 7, first/DSCNN and second/USCNN are trained in connection with one another) and trained based on loss information that is obtained by training the second DNN (Figure 7, Section 4.2, especially Equation 2, first/DSCNN is trained based on loss that includes L2 loss, which is obtained from the training output of second/USCNN).

Regarding claim 7, Kim in view of Leng teaches the AI encoding apparatus of claim 6, and Kim further teaches that the first DNN is trained based on first loss information that is generated by upscaling in the training of the second DNN (Figure 7, Section 4.2, especially Equation 2, first/DSCNN loss includes L2 loss, which is based on output from training second/USCNN), and based on second loss information that is generated by downscaling in training the first DNN (Figure 7, Section 4.2, first/DSCNN loss also includes $loss_{var}$ and $loss_{structure}$, which are both second losses generated by downscaling in training the first/DSCNN).

Regarding claim 8, Kim in view of Leng teaches the AI encoding apparatus of claim 5, and Leng further teaches that 
a first parameter matrix representing the first parameters is represented by a product of the scale factor and a second parameter matrix comprising the integer values (Section 3.1, weight matrix of an ith layer $W_i$ is product of scale factor $\alpha_i$ and matrix of integer values selected from $C_i$; also see sentences below Equation 3, scale factor is multiplied after convolution with integer weight filter kernel matrix – i.e. second parameter matrix – has occurred, which indicates that they are kept separately),
the memory stores the scale factor and the second parameter matrix (see Note Regarding Computer given above with respect to claim 1), and
the processor is further configured to execute the stored one or more instructions to (see Note Regarding Computer given above with respect to claim 1) obtain the first image by performing a convolution operation between the original image and the second parameter matrix and then multiplying a result of the performed convolution operation by the scale factor (Page 5, below Equation 3, scale factor is multiplied by result of convolution with integer values).

Regarding claim 10, Examiner notes that the claim is directed to a method that is substantially the same as the method performed by the apparatus of claim 1.  Kim in view of Leng teaches the apparatus of claim 1.  Accordingly, claim 10 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Leng for substantially the same reasons as claim 1.
Regarding claim 11, Examiner notes that the claim is directed to a method that is substantially the same as the method performed by the apparatus of claim 2.  Kim in view of Leng teaches the apparatus of claim 2.  Accordingly, claim 11 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Leng for substantially the same reasons as claim 2.

Regarding claim 12, Examiner notes that the claim is directed to a method that is substantially the same as the method performed by the apparatus of claim 3.  Kim in view of Leng teaches the apparatus of claim 3.  Accordingly, claim 12 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Leng for substantially the same reasons as claim 3.


Claims 4, 9 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Leng as applied above, and further in view of ‘Xu’ (“Efficient Deep Convolutional Neural Networks Accelerator Without Multiplication and Retraining,” 2018).
Regarding claim 4, Kim in view of Leng teaches the AI decoding apparatus of claim 3.
Leng teaches that “the weights of the network are restricted to be either zero or powers of two so that the expensive floating-point multiplication operation can be replaced by cheaper and faster bit shift operation” (Sentence spanning Pages 4-5) and refers to “efficient convolution with” powers of two (Page 5, below Equation 3).
Nevertheless, Leng does not explicitly teach that the convolution operation is performed by performing a shift operation and an addition operation between a pixel value comprised in the second image and the second parameter matrix.
However, Xu does teach that performing convolution includes performing multiplication (Section 2, Equation 1, w * x, which is multiplication of weight and input image pixel values), but that for weights that are powers of two (Section 3.1, third paragraph; Section 3.3), the multiplication can be replaced by a shift operation and an addition operation (Section 3.3), such that the convolution can be performed by performing a shift operation and an addition operation between a pixel value in the input image and weight values.
Xu teaches that this replacement of multiplication with shift and add operations advantageously reduces computation complexity (Section 3.3 and Table 1).
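
For illustration only, the general idea of replacing multiplication by a power-of-two weight with a shift and an addition can be sketched as follows. This is a minimal sketch under stated assumptions, not Xu's accelerator design: it assumes integer pixel values, weights of the form 0 or ±2^k with non-negative integer k, and a hypothetical encoding of each weight as a (sign, exponent) pair.

```python
def shift_add_dot(pixels, signs, exponents):
    """Accumulate sum(sign * pixel * 2**exponent) using only shifts and additions."""
    acc = 0
    for x, s, k in zip(pixels, signs, exponents):
        if s == 0:
            continue              # zero weight contributes nothing
        term = x << k             # x * 2**k as a left shift
        acc += term if s > 0 else -term
    return acc


# Hypothetical window of four pixel values and the weights 4, -2, 0, 8.
pixels = [3, 5, 7, 2]
signs = [1, -1, 0, 1]
exponents = [2, 1, 0, 3]

shifted = shift_add_dot(pixels, signs, exponents)

# Reference computed with ordinary multiplication for comparison.
weights = [4, -2, 0, 8]
multiplied = sum(x * w for x, w in zip(pixels, weights))
```

One output value of a convolution is such a sum over the kernel window, so a kernel restricted to zero or powers of two needs no multiplier in this sketch.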
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the apparatus of Kim in view of Leng as applied above with the shift-add multiplication replacement of Xu in order to improve the apparatus with the reasonable expectation that this would result in an apparatus that advantageously had lower computational complexity.  This technique for improving the apparatus of Kim in view of Leng was within the ordinary ability of one of ordinary skill in the art based on the teachings of Leng and Xu.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim, Leng and Xu to obtain the invention as specified in claim 4.	

Regarding claim 9, Kim in view of Leng teaches the AI encoding apparatus of claim 8.
Leng teaches that “the weights of the network are restricted to be either zero or powers of two so that the expensive floating-point multiplication operation can be replaced by cheaper and faster bit shift operation” (Sentence spanning Pages 4-5) and refers to “efficient convolution with” powers of two (Page 5, below Equation 3).
Nevertheless, Leng does not explicitly teach that the convolution operation is performed by performing a shift operation and an addition operation between a pixel value comprised in the second image and the second parameter matrix.
However, Xu does teach that performing convolution includes performing multiplication (Section 2, Equation 1, w * x, which is multiplication of weight and input image pixel values), but that for weights that are powers of two (Section 3.1, third paragraph; Section 3.3), the multiplication can be replaced by a shift operation and an addition operation (Section 3.3), such that the convolution can be performed by performing a shift operation and an addition operation between a pixel value in the input image and weight values.
Xu teaches that this replacement of multiplication with shift and add operations advantageously reduces computation complexity (Section 3.3 and Table 1).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the apparatus of Kim in view of Leng as applied above with the shift-add multiplication replacement of Xu in order to improve the apparatus with the reasonable expectation that this would result in an apparatus that advantageously had lower computational complexity.  This technique for improving the apparatus of Kim in view of Leng was within the ordinary ability of one of ordinary skill in the art based on the teachings of Leng and Xu.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim, Leng and Xu to obtain the invention as specified in claim 9.	

Regarding claim 13, Examiner notes that the claim is directed to a method that is substantially the same as the method performed by the apparatus of claim 4.  Kim in view of Leng and Xu teaches the apparatus of claim 4.  Accordingly, claim 13 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Leng and Xu for substantially the same reasons as claim 4.

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
‘Yoshiyama’ (US 2021/0042453 A1)
Suggests replacing weight convolution with a power-of-2 convolution that can be performed using shift and addition operations – [0099]
‘Elhoushi’ (“DeepShift: Towards Multiplication-Less Neural Networks,” 2019)
Teaches techniques for using shifts to replace multiplication in neural networks
‘Guo’ (“A Survey on Methods and Theories of Quantized Neural Networks,” 2018)
‘Kenue’ (“Efficient Convolution Kernels for Computerized Tomography,” 1979)
An early example of using power-of-two kernels for efficient convolution – see Pages 235-236, Binary Kernels
‘Kwok’ (“Loss-aware Weight Quantization of Deep Networks,” 2018)
Teaches a ternary quantization that also reads on the scale factor and integer values of the claimed invention – see Section 3.1
‘Marchesi’ (“Fast Neural Networks Without Multipliers,” 1993)
An early example of restricting neural network weight values to powers of two for computational efficiency – see Section II.A.
‘Jiang’ (“An End-to-End Compression Framework Based on Convolutional Neural Networks,” 2017)
Example of another technique using neural networks to down- and up-scale an image in conjunction with coding – see Figure 2
‘Gorodilov’ (“Neural Networks for Image and Video Compression,” 2018)
Another example of a technique using neural networks to down- and up-scale an image in conjunction with coding – see Figure 1

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEOFFREY E SUMMERS whose telephone number is (571)272-9915. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is 





/GEOFFREY E SUMMERS/Examiner, Art Unit 2669