DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This communication is responsive to the petition decision on 6/11/21.
Claims 1-20 are presented for examination.

IDS Considerations

The information disclosure statements (IDS) submitted on 7/19/19 and 3/24/20 are being considered by the examiner, as the submissions are in compliance with the provisions of 37 CFR 1.97.

Examiner’s Note: The Examiner's Answer presented on 1/27/21 is incorporated here for any additional explanation, interpretation and motivation. The claim scope has not changed since the issuance of the Non-Final rejection. The Examiner is using the same references to reject extremely broad claims under different reasonable interpretations. On 3/29/21 applicant filed a petition arguing that a new ground of rejection was introduced in the Examiner's Answer dated 1/27/21. As a result of the petition decision, under which the Final rejection dated 7/2/20 was withdrawn, the Examiner is issuing this Final rejection to address that decision. Applicant was given the opportunity to amend the claims based on the Examiner's Answer; however, the claims were not amended. Accordingly, all responses to the arguments presented in the Examiner's Answer of 1/27/21 also apply to this Final rejection.

Response to Arguments

Applicant's arguments filed 6/15/20, as well as the appeal brief filed on 1/27/21, with respect to claims 1-20 have been considered but are not persuasive. Please refer to the Examiner's Answer presented on 1/27/21 for details.

Applicant argued on page 7, regarding claim 1, that the Office fails to identify any feature of either Laukien or Koker that "outputs the guided information and the side information for a decoder to reconstruct the source data".

Examiner disagrees because Laukien Fig. 14 and [0141] teach: the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [output the side information] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Please note that [0075], [0077], [0105], [0106], Fig. 3, Fig. 11 and Fig. 14 teach the feedforward connection [3], which is the same feedback signal [same side information] available at different layers of encoders and decoders, because the processing stages form autoencoder-like "layers," in which feedforward signals are fed up the hierarchy in a chain of encoder modules, feedback and predictive signals are

and between layers allow for learning and prediction assistance at both the local and network levels. This is how the same side information is available at the decoder. Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input. The reconstruction error $E_{recon}$ forms the input to generate new stimuli S, which are then used to generate activations A, and those are inhibited to generate the new spike pattern. Decoding of encoded video is a reconstruction, which is generated from the low-resolution video input and the feedback process of the neural network.
Thus Laukien clearly teaches "outputs the guided information and the side information for a decoder to reconstruct the source data." In addition, this limitation is also visually depicted in the secondary reference, Koker [0169] FIG. 12.
In the above Koker [0169] FIG. 12, x1 -> a is source data, b -> a through 1205 is side information, a -> c -> y is guided information, and b -> y is side information as well. y is the decoder because Koker [0188] teaches: the media processor 1502 can enable low latency decode of multiple high-resolution (e.g., 4K, 8K) video streams. The decoded video streams can be written to a buffer in the on-chip-memory 1505 ... the vision processor 1504 can accelerate convolution operations for a CNN that is used to perform image recognition on the high-resolution video data [video stream]. Koker [0169] teaches that the illustrated RNN 1200 can be described as having an input layer 1202 that receives an input vector [source data], hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 [side information] to enable a 'memory' of previous states, and an output layer 1206 to output a result [guided information]. In Koker FIG. 12 above, the connection between x1 and a is source data because 1202 is the input vector of the system. So source data 1202 is received directly at hidden layer 1204. Also, the neural network of Koker's Fig. 12 is related to a video stream. Koker FIG. 4B graphics processing involves encoding/decoding, because Koker [0084] teaches: the graphics processing engines 431, 432, N may comprise different types of graphics processing engines within a GPU such as graphics execution units, media processing engines (e.g., video encoders/decoders), samplers, and blit engines. Decoding is reconstruction of video data.
Applicant argued on page 7, regarding claim 1, that the Office provides no description of how the block diagrams of Fig. 11 and Fig. 14 can be made to function together, or why the skilled artisan would be motivated to combine them.


[Image: media_image1.png]


Examiner disagrees because Fig. 11 and Fig. 14 are block diagrams of the same system, as they have the same inputs (7, 9) and output (8) shown above. No further explanation is needed when both diagrams represent the same system.

Applicant argued on pages 8-9, regarding claims 1 and 19, that the Office does not state what elements of Koker allegedly correspond to the "side information" and to "the source data". Accordingly, Applicant is unable to determine how the Office believes that Koker describes the feature of "receiv[ing] side information correlated with the source data at the first hidden layer of the encoder."

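Examiner disagrees because Koker [0169] FIG. 12 teaches: the RNN 1200 operates based on time-steps. The state of the RNN at a given time step is influenced based on the previous time step via the feedback mechanism 1205 [side information]. For a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. An initial input ($x_1$) at a first time step can be processed by the hidden layer 1204. A second input ($x_2$) can be processed by the hidden layer 1204 using state information [side information] that is determined during the processing of the initial input ($x_1$) [source data].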

Applicant argued on page 8, regarding claim 1, that the Office fails to cite any input into any layer of a neural network of Laukien that is also input into a decoder, and that Koker does not support the stated rejection.



Applicant argued on page 8, regarding claim 1, that the Office fails to state a prima facie rejection.

Examiner disagrees because it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien by further incorporating Koker in video/camera technology. One would be motivated to do so in order to incorporate receiving side information correlated with the source data at the first hidden layer of the encoder. This functionality will improve coding efficiency with predictable results.

Applicant argued on pages 8-9, regarding claim 6, that the Office cites paragraph [0186], again without stating what is considered to be the side information. There is no

Examiner disagrees because Laukien [0090] teaches: feedback [side information] from higher layers may similarly be combined, either raw or with some preprocessing, with the feedforward [pass through] stimulus as drawn from the weight [side information, because feedback given as weight] matrix. Raw feedback is pass-through side information. Please note Laukien [0093]: the delay encoder layer first computes the stimulus of the hidden cells [hidden layer] using a matrix multiplication, or simply multiplying each input by its corresponding weight. This step is similar to the operation of standard feed-forward [pass through] neural networks and autoencoders. [0092] each connection contains two values: A weight [side information], and an eligibility trace. These connections feed into the array of hidden layer “cells,” which can be of a different dimension. The same algorithm is applicable to the decoder. Laukien [0186]: Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback [side information] from higher layer decoder predictions. The feedback input $Z_{fb}$ is either the higher layer decoder output or the current hidden state in the case of the top layer. The decoder produces two predictions, one based on the feedback and the other, lateral prediction, on the corresponding encoder hidden state. These predictions are combined to produce the decoder's output. As such, the above-mentioned paragraphs of Laukien clearly teach claim 6.



Examiner disagrees because Laukien Fig. 14 and [0141] teach: the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal to combine with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal.

Applicant argued on page 9, regarding claim 18, that the Office cites FIG. 14 and implies that the feedback connection 4 is the side information. However, according to Applicant, the feedback connection 4 is not information from an encoder; it is an output of a decoder.

Examiner disagrees because Laukien [0099] teaches: We assume that a decoder (which, in typical embodiments, is a simple linear combination of the hidden states of the encoder) provides predictions from each encoder by using the standard perceptron learning rule with a time delay. In some embodiments, this decoder solely provides predictions from the encoder hidden state, and can feed back in to the encoder (as when using the previously described STDP-based learning rule for the encoder). So

Applicant argued on page 9, regarding claim 19, that no permissible combination of Laukien and Koker is proposed that both "receives side information correlated with the source data at the first hidden layer of the encoder; ... and receives [] guided information and the side information at a first hidden layer of a first decoder for reconstruction of [] source data."

Examiner disagrees because the examiner has addressed the argument related to the "receives side information correlated with the source data at the first hidden layer of the encoder” limitation on pages 4-5 of this document together with claim 1. Laukien Fig. 14 teaches a decoder reconstructing from learning feedback, which is side information, and from a prediction, which is guided information, because [0141] teaches: the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information].

The combined teaching of Laukien and Koker meets all claimed limitations of every claim with predictable results. The claims are not specific or inventive over the prior art.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5-7, 14 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Laukien (U.S. Pub. No. 20190294980 A1), in view of Koker (U.S. Pub. No. 20180307984 A1).

Regarding claim 1:

1. Laukien teaches a hybrid apparatus for coding a video stream, comprising:
a first encoder comprising a neural network having at least one hidden layer, (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size $m_{vis}^{i} \times n_{vis}^{i}$, and hidden (output-facing) layers have size $m_{hid}^{i} \times n_{hid}^{i}$)
wherein the neural network: receives source data from the video stream (Laukien [0170] In addition, the hierarchy may be configured to produce any desired form of output simply by specifying the size of the output of the bottom decoder. For example, in an embodiment performing video super-resolution, the input [source data] may be the low-resolution video stream, and the output may be the desired high-resolution video) at a first hidden layer of the at least one hidden layer; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size $m_{vis}^{i} \times n_{vis}^{i}$, and hidden (output-facing) layers have size $m_{hid}^{i} \times n_{hid}^{i}$)
and generates guided information using the source data and the side information; (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Laukien [0137] FIG. 11 
and outputs the guided information and the side information (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]) for a decoder to reconstruct the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input. The reconstruction error $E_{recon}$ forms the input to generate new stimuli S, which are then used to generate activations A, and those are inhibited to generate the new spike pattern.
Please see also Laukien Fig. 14, [0029], [0169-0171], [0141], [0089-0092], [0137], [0099], [0105-0106], [0073-0077], [0110], [0158], [0225-0232], Fig. 3-5, Fig. 7, Fig. 11, Fig. 13 and the Examiner's Answer presented on 1/27/21 for details.)
Laukien does not explicitly teach "receives side information correlated with the source data at the first hidden layer."

However, Koker teaches receives side information correlated with the source data at the first hidden layer: (Koker [0169] FIG. 12 the RNN 1200 operates based on time-steps. The state of the RNN at a given time step is influenced based on the previous time step via the feedback mechanism 1205 [side information]. For a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. An initial input ($x_1$) at a first time step can be processed by the hidden layer 1204. A second input ($x_2$) can be processed by the hidden layer 1204 using state information [side information] that is determined during the processing of the initial input ($x_1$) [source data]. See also Koker [0188])
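
For illustration only, the time-step recurrence quoted from Koker [0169] can be sketched as below. This is a minimal sketch and not code from Koker: the function names `rnn_step` and `run_rnn`, the tanh activation, and the weight matrices `w_in` and `w_rec` are assumptions introduced here. It shows only that the hidden state at one time step is computed from the current input together with the state carried over from the previous step.

```python
import math


def rnn_step(x_t, h_prev, w_in, w_rec):
    """Compute one hidden state from the current input and the previous state.

    w_in and w_rec are hypothetical weight matrices (lists of rows);
    tanh is an assumed activation, chosen only for illustration.
    """
    return [
        math.tanh(
            sum(w * x for w, x in zip(w_in[j], x_t))        # contribution of the current input
            + sum(w * h for w, h in zip(w_rec[j], h_prev))  # contribution of the previous state
        )
        for j in range(len(w_rec))
    ]


def run_rnn(inputs, w_in, w_rec):
    """Process a sequence, carrying the hidden state from step to step."""
    h = [0.0] * len(w_rec)  # state before the first input
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_in, w_rec)
        states.append(h)
    return states
```

With a nonzero recurrent weight, the state after the second input still depends on the first input; with a zero recurrent weight it depends on the second input alone.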

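It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien by further incorporating Koker in video/camera technology. One would be motivated to do so in order to incorporate receiving side information correlated with the source data at the first hidden layer of the encoder. This functionality will improve coding efficiency with predictable results.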

Regarding claim 2:

2. Laukien teaches the hybrid apparatus of claim 1, further comprising: a second encoder generating, using the source data, (Laukien [0140] FIG. 14 with the prediction errors of the decoder [2] via the decoder-encoder connection [6]) the side information for input to the first encoder. (Laukien [0140] FIG. 14 also using its feedback connection [4])

Regarding claim 3:

3. Laukien teaches the hybrid apparatus of claim 2, wherein: the second encoder includes a second decoder, and (Laukien FIG. 14) the side information comprises decoded source data from the second decoder. (Laukien [0140] FIG. 14 also using its feedback connection [4])

Regarding claim 5:

5. Laukien teaches the hybrid apparatus of claim 1, wherein: the first encoder includes a first decoder, and (Laukien FIG. 14) at least the first hidden layer of the multiple hidden layers forming the first encoder, and (Laukien FIG. 14) at least a second hidden layer of the multiple hidden layers forming the first decoder, (Laukien FIG. 14) and the first decoder receiving the guided information (Laukien [0140] FIG. 14 with the prediction errors of the decoder [2] via the decoder-encoder connection [6]) and the side information (Laukien [0140] FIG. 14 also using its feedback connection [4]) for reconstruction of the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input. The reconstruction error $E_{recon}$ forms the input to generate new stimuli S, which are then used to generate activations A, and those are inhibited to generate the new spike pattern. Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. As such, Fig. 14 shows the loop item 5 -> item 47 -> item 48 -> item 2 -> item 6; this loop also provides feedback, which can also be considered side information. Laukien [0141] this feedback signal is used to augment the representation produced by the

information out of encoder. So what happens in Laukien Fig. 14 is that, while a low-resolution video stream input is converted by different layers of the neural network into a high-resolution video, the encoders/decoders take feedback from the same and previous layers of encoder and decoder to improve the conversion system.)

Laukien does not explicitly teach that the neural network comprises multiple hidden layers.

However, Koker teaches that the neural network comprises multiple hidden layers. (Koker [0159] [0169] FIG. 12 teach multiple hidden layers)

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien by further incorporating Koker in video/camera technology. One would be motivated to do so in order to incorporate a neural network comprising multiple hidden layers. This functionality will improve efficiency. It would be obvious to have a plurality of hidden layers in Laukien using the algorithm described in Koker, and the combined teaching meets the claim limitation with predictable results.

Regarding claim 6:

6. Laukien teaches the hybrid apparatus of claim 5, wherein: each hidden layer of the first encoder is structured to pass through (Laukien [0093] the delay encoder layer first computes the stimulus of the hidden cells [hidden layer] using a matrix multiplication, or simply multiplying each input by its corresponding weight. This step is similar to the operation of standard feed-forward [pass through] neural networks and autoencoders) the side information (Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw or with some preprocessing, with the feedforward stimulus as drawn from the weight [side information, because feedback given as weight] matrix. [0092] each connection contains two values: A weight [side information], and an eligibility trace. These connections feed into the array of hidden layer “cells,” which can be of a different dimension. The same algorithm is applicable to the decoder) such that a first layer of the first decoder receives the side information. (Laukien [0186] Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback [side information] from higher layer decoder predictions. The feedback input $Z_{fb}$ is either the higher layer decoder output or the current hidden state in the case of the top layer. The decoder produces two predictions, one based on the feedback and the other, lateral prediction, on the corresponding encoder hidden state. These predictions are combined to produce the decoder's output)

Also Laukien [0071]: an encoder for video might take frames of pixel color values and transform them into a compressed form for storage and transmission, and a decoder might later take such a compressed file and decode it into a form suitable for display.

Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw or with some preprocessing, with the feedforward [pass through] stimulus as drawn from the weight [side information, because feedback given as weight] matrix. Raw feedback is pass-through side information. Please note Laukien [0093]: the delay encoder layer first computes the stimulus of the hidden cells [hidden layer] using a matrix multiplication, or simply multiplying each input by its corresponding weight. This step is similar to the operation of standard feed-forward [pass through] neural networks and autoencoders. [0092] each connection contains two values: A weight [side information], and an eligibility trace. These connections feed into the array of hidden layer "cells," which can be of a different dimension. The same algorithm is applicable to the decoder. Laukien [0186] Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback [side information] from higher layer decoder predictions. The feedback input $Z_{fb}$ is either the higher layer decoder output or the current hidden state in the case of the top layer. The decoder produces two predictions, one based on the feedback and the other, lateral prediction, on the corresponding encoder hidden state. These predictions are combined to produce the decoder's output.

Alternatively, in the above Koker [0169] FIG. 12, x1 -> a is source data, b -> a through 1205 is side information, a -> c -> y is guided information, and b -> y is side information as well. y is the decoder because Koker [0188] teaches: the media processor 1502 can enable low latency decode of multiple high-resolution (e.g., 4K, 8K) video streams. The decoded video streams can be written to a buffer in the on-chip-memory 1505 ... the vision processor 1504 can accelerate convolution operations for a CNN that is used to perform image recognition on the high-resolution video data [video stream]. Koker [0169] teaches that the illustrated RNN 1200 can be described as having an input layer 1202 that receives an input vector [source data], hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 [side information] to enable a 'memory' of previous states, and an output layer 1206 to output a result [guided information]. In Koker FIG. 12 above, the connection between x1 and a is source data because 1202 is the input vector of the system. So source data 1202 is received directly at hidden layer 1204.

Regarding claim 7:

7. Laukien teaches the hybrid apparatus of claim 1, wherein the first encoder includes a first decoder, the hybrid apparatus further comprising: a deterministic transform that transforms the side information (Laukien [0031] the decoder of each i-th processing stage, receiving, by an input of the decoder, the sequence of encoded values generated by the respective encoder, and generating, by the decoder, a sequence of predictions of each next input value that will be received by the input of the respective encoder, the decoder of each i-th processing stage except the first processing stage providing the respective predictions as feedback to the decoder of the (i-1)-th processing stage. [0140] FIG. 14 also using its feedback connection [4]) before providing the side information to the first encoder and the first decoder. (Laukien [0071] Conversely, a "decoder" is a piece of componentry which transforms a frame of data in the language or form of an encoding back into a form similar to that expected as input to an encoder. For example, an encoder for video might take frames of pixel color values and transform them into a compressed form for storage and transmission, and a decoder might later take such a compressed file and decode it into a form suitable for display)

Regarding claim 14:

14. Laukien teaches a method for coding a video stream, comprising: providing source data from the video stream (Laukien [0170] the hierarchy may be configured to produce any desired form of output simply by specifying the size of the output of the bottom decoder. For example, in an embodiment performing video super-resolution, the input [source data] may be the low-resolution video stream, and the output may be the desired high-resolution video) to a first encoder including a neural network; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size $m_{vis}^{i} \times n_{vis}^{i}$, and hidden (output-facing) layers have size $m_{hid}^{i} \times n_{hid}^{i}$)
generating, using the source data, side information; (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal  [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Laukien [0137] FIG. 11 shows an arrangement of a typical embodiment of the Routed Predictive Hierarchy Network. In FIG. 11, one sees a series of layers, each 
inputting the side information to the neural network for encoding the source data; and transmitting the encoded source data and the side information from the first encoder to a decoder or to storage, Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal to combine with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal. The feedforward connection [3] of Laukien Fig. 14 is side information. Laukien [0071] an encoder for video might take frames of pixel color values and transform them into a compressed form for storage and transmission, and a decoder might later take such a compressed file and decode it into a form suitable for display. Laukien [0071] teaches that encoders have storage. Laukien Fig. 14 shows side information, which is the feedforward connection [3], going from encoder1 to encoder2 or
successively longer timescales (alternatively at a lower frequency), so they more efficiently store [side information storage] context information which lower processing stages need not learn in full. Depending on the chosen mode of operation, the feedforward signal [side information] of the encoders to higher processing stages may be either simply a higher-level representation, or a prediction error signal, and in turn the feedback signal will either be corrective of that representation, or will compensate for the prediction errors, respectively. As such, Laukien Fig. 14 teaches that side information from encoder1 goes to the storage of encoder2 without modification. Laukien Fig. 14 shows encoder output 5 going from encoder1 to decoder1. Please note that Laukien Fig. 16 is a different representation of the same system, as both Fig. 14 and Fig. 16 have the same inputs (7, 9) and output (8). Encoders/decoders 1, 2, ... N of Fig. 14 are the same as encoders/decoders 1, 2, ... N of Fig. 16. Fig. 16 shows encoder output 5 as encoded source data going from encoder1 to decoder1.

Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw [without modification] or with some preprocessing, with the feedforward stimulus as drawn from the weight [side information, because feedback given as weight] matrix. Laukien does not explicitly teach wherein the side information is transmitted without modification by the neural network.

However, Koker teaches wherein the side information is transmitted without modification by the neural network. (Koker [0162] FIG. 11A-B neurons in a fully connected layer have full connections to all activations [without modification] in the previous layer, as previously described for a feedforward network. The output from the fully connected layers 1108 can be used to generate an output result from the network. The activations within the fully connected layers 1108 can be computed using matrix multiplication instead of convolution)

It would have been an obvious choice to provide the same feedback throughout the neural network using the feedforward connection [3], because Koker [0162] FIG. 11A-B teaches: neurons in a fully connected layer have full connections to all activations [without modification] in the previous layer, as previously described for a feedforward network. The output from the fully connected layers 1108 can be used to generate an output result from the network. The activations within the fully connected layers 1108 can be computed using matrix multiplication instead of convolution. The combined teaching meets the claimed limitation with predictable results.
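
For illustration only, the statement quoted from Koker [0162] that activations of a fully connected layer can be computed using matrix multiplication can be sketched as below. This is a minimal sketch and not code from Koker: the function name `fully_connected`, the bias term, and the example weights are assumptions introduced here. Each output neuron takes a weighted sum over every activation of the previous layer.

```python
def fully_connected(activations, weights, biases):
    """Compute a fully connected layer as a matrix-vector product plus bias.

    weights is a hypothetical matrix with one row per output neuron;
    every row has one weight for each activation of the previous layer.
    """
    return [
        sum(w * a for w, a in zip(row, activations)) + b
        for row, b in zip(weights, biases)
    ]


# Example with two previous-layer activations and two output neurons.
previous_layer = [1.0, 2.0]
outputs = fully_connected(previous_layer, [[1.0, 0.0], [0.5, 0.5]], [0.0, 1.0])
```

No activation function is applied in this sketch; a practical layer would usually apply one to each output.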

Regarding claim 18:

18. Claim 18 is rejected for the same reason as claim 6. The feedforward connection [3] of Laukien Fig. 14 is side information. Also, Laukien Fig. 11 and Fig. 14 show encoders and hidden layers.

The plurality of hidden layers is taught by Koker [0159]: the exemplary neural networks described above can be used to perform deep learning. Deep learning is machine learning using deep neural networks. The deep neural networks used in deep learning are artificial neural networks composed of multiple hidden layers, as opposed to shallow neural networks that include only a single hidden layer. Deeper neural networks are generally more computationally intensive to train. However, the additional hidden layers of the network enable multistep pattern recognition that results in reduced output error relative to shallow machine learning techniques. See also Koker [0169] FIG. 12: the illustrated RNN 1200 can be described as having an input layer 1202 that receives an input vector, hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 to enable a 'memory' of previous states, and an output layer 1206 to output a result.

Regarding claim 19:

19. Laukien teaches a hybrid apparatus for coding a video stream, comprising: a first encoder and a first decoder comprising a neural network having a plurality of hidden layers, (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m_vis^i × n_vis^i, and hidden (output-facing) layers have size m_hid^i × n_hid^i) wherein the neural network: (Laukien [0137] FIG. 11 shows an arrangement of a typical embodiment of the Routed Predictive 
receives source data from the video stream (Laukien [0170] the hierarchy may be configured to produce any desired form of output simply by specifying the size of the output of the bottom decoder. For example, in an embodiment performing video super-resolution, the input [source data] may be the low-resolution video stream, and the output may be the desired high-resolution video) at a first hidden layer of the encoder; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m_vis^i × n_vis^i, and hidden (output-facing) layers have size m_hid^i × n_hid^i)
generates guided information using the source data and the side information; and (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the 
receives the guided information and the side information (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]) at a first hidden layer of the first decoder for reconstruction of the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its 

Laukien does not explicitly teach a neural network having a plurality of hidden layers; receives side information correlated with the source data at the first hidden layer of the encoder;

However, Koker teaches a neural network having a plurality of hidden layers (Koker [0159] the exemplary neural networks described above can be used to perform deep learning. Deep learning is machine learning using deep neural networks. The deep neural networks used in deep learning are artificial neural networks composed of multiple hidden layers, as opposed to shallow neural networks that include only a single hidden layer. Deeper neural networks are generally more computationally intensive to train. However, the additional hidden layers of the network enable multistep pattern recognition that results in reduced output error relative to shallow machine learning techniques. See also Koker [0169] FIG. 12: the illustrated RNN 1200 can be described as having an input layer 1202 that receives an input vector, hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 to enable a 'memory' of previous states, and an output layer 1206 to output a result)
receives side information correlated with the source data at the first hidden layer (Koker [0169] FIG. 12 the RNN 1200 operates based on time-steps. The state of the RNN at a given time step is influenced based on the previous time step via the feedback mechanism 1205 [side information]. For a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. An initial input (x_1) at a first time step can be processed by the hidden layer 1204. A second input (x_2) can be processed by the hidden layer 1204 using state information [side information] that is determined during the processing of the initial input (x_1) [source data]) of the encoder; (Koker Fig. 11A [RGB components] and Fig. 14 show that feedback/side information is also applicable to encoder/decoder training because, per Koker [0188], during operation, the media processor 1502 and vision processor 1504 can work in concert to accelerate computer vision operations. The media processor 1502 can enable low latency decode of multiple high-resolution (e.g., 4K, 8K) video streams. The decoded video streams can be written to a buffer in the on-chip-memory 1505. The vision processor 1504 can then parse the decoded video and perform preliminary processing operations on the frames of the decoded video in preparation of processing the frames using a trained image recognition model. FIG. 4B [0084] the graphics processing engines 431, 432, N may comprise different types of graphics processing engines within a GPU such as graphics execution units, media processing engines (e.g., video encoders/decoders), samplers, and blit engines.)
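
For illustration only, the time-step behavior quoted from Koker [0169], in which the hidden state at a given step is defined by the previous state and the current input, can be sketched as follows. This is a minimal scalar sketch under stated assumptions: the names `rnn_step` and `run_rnn`, the tanh activation, and the weights `w_x` and `w_h` are assumptions for the example and do not appear in Koker.

```python
import math

# Illustrative sketch of a recurrent update: the new hidden state depends on the
# current input and on the previous hidden state (the "feedback" path).
# rnn_step, run_rnn, w_x, w_h and the tanh activation are assumptions for this example.

def rnn_step(x_t, h_prev, w_x, w_h):
    """One time step: combine the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev)

def run_rnn(inputs, w_x=0.5, w_h=0.8, h0=0.0):
    """Process a sequence and return the hidden state after each time step."""
    states = []
    h = h0
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h)
        states.append(h)
    return states

# Two identical inputs x_1 and x_2 yield different states, because the second
# step also uses the state determined while processing the first input.
states = run_rnn([1.0, 1.0])
print(states)
```

Setting `w_h` to zero removes the feedback path, in which case identical inputs produce identical states.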

Claims 4, 8-13, 15-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Laukien (U.S. Pub. No. 20190294980 A1), in view of Koker (U.S. Pub. No. 20180307984 A1), further in view of Nishi (U.S. Pub. No. 20200059669 A1).

Regarding claim 4:

4. Laukien teaches the hybrid apparatus of claim 1, wherein: the first encoder includes a first decoder that reconstructs the source data to form reconstructed source data, and (Laukien FIG. 14 [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input)

Laukien does not explicitly teach the neural network is trained to minimize a rate-distortion value between the source data and the reconstructed source data.

However, Nishi teaches the neural network is trained (Nishi [0335] the discriminator network being a neural network and constituting a generative adversarial network (GAN) with the generator network. [0336] accordingly, generated data for generating a predicted image more similar to an input image can be obtained from the generator network by being trained by the GAN through machine learning. As a result, encoding ...) to minimize a rate-distortion value between the source data and the reconstructed source data. (Nishi [0399] Note that the networks may further receive additional input data. For example, the input data may be signals for notifying candidates of a prediction mode or a quantization step size for rate-distortion (RD) optimization, for instance)

The motivation for combining Laukien and Koker as set forth in claim 1 is equally applicable to claim 4. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien, further incorporating Koker and Nishi in video/camera technology. One would be motivated to do so in order to incorporate a neural network that is trained to minimize a rate-distortion value between the source data and the reconstructed source data. This functionality will improve image quality.
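
For illustration only, a rate-distortion value is commonly expressed as the Lagrangian cost J = D + λR, where D is the distortion between the source and its reconstruction and R is the bit rate. Nishi [0399] mentions rate-distortion (RD) optimization without giving a formula, so the sketch below is the standard textbook formulation and not a method taken from Nishi or from the application. The names `mse` and `rd_cost`, the candidate reconstructions, their bit counts, and the value of `lam` are all assumptions for the example.

```python
# Illustrative sketch of a rate-distortion (RD) cost: J = D + lambda * R.
# mse, rd_cost, the candidate data and lam are hypothetical example values.

def mse(source, reconstructed):
    """Distortion D: mean squared error between source and reconstruction."""
    return sum((s - r) ** 2 for s, r in zip(source, reconstructed)) / len(source)

def rd_cost(source, reconstructed, rate_bits, lam):
    """Lagrangian cost combining distortion and rate."""
    return mse(source, reconstructed) + lam * rate_bits

source = [10.0, 12.0, 14.0, 16.0]

# Each candidate is (reconstruction, number of bits assumed to encode it).
candidates = {
    "coarse": ([10.0, 10.0, 14.0, 14.0], 8),
    "fine": ([10.0, 12.0, 14.0, 16.0], 32),
}

lam = 0.1  # assumed trade-off weight between rate and distortion
costs = {name: rd_cost(source, rec, bits, lam)
         for name, (rec, bits) in candidates.items()}
best = min(costs, key=costs.get)
print(costs, best)
```

With these assumed values the lower-rate candidate has the smaller cost even though its distortion is higher. A larger or smaller `lam` shifts that trade-off.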

Regarding claim 8:

8. Laukien teaches the hybrid apparatus of claim 1, wherein: the side information (Laukien [0140] FIG. 14 also using its feedback connection [4]) comprises a full resolution prediction signal (Laukien [0147] have the bottom decoder use a size equivalent to the full resolution video (m × n). The output decoding y of the bottom layer is then compared with the high-resolution training input x in order to generate the prediction errors for learning)

Laukien does not explicitly teach the prediction signal generated using motion prediction.

However, Nishi teaches the prediction signal generated using motion prediction. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block.)

Regarding claim 9:

9. Laukien teaches the hybrid apparatus of claim 8, wherein: Laukien does not explicitly teach the neural network is trained to select a transform for a block residual within the full resolution prediction signal to minimize a rate-distortion value.

However, Nishi teaches the neural network is trained to select a transform for a block residual (Nishi [0416] as a result, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are trained according to back propagation. Through such training, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are constructed to reduce the difference between a reconstructed image and an input video) within the full resolution prediction signal to minimize a rate-distortion value. (Nishi [0399] Note that the networks may further receive additional input data. For example, the input data may be signals for notifying candidates of a prediction mode or a quantization step size for rate-distortion (RD) optimization, for instance)

Regarding claim 10:

10. Laukien teaches the hybrid apparatus of claim 1, further comprising: a second encoder generating, using the source data, (Laukien FIG. 4) the side information for input to the first encoder, (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach wherein the second encoder comprises a block-based encoder.

However, Nishi teaches wherein the second encoder comprises a block-based encoder. (Nishi [0021] FIG. 9A is for illustrating deriving a motion vector of each sub-block based on motion vectors of neighboring blocks)

The motivation for combining Laukien and Koker as set forth in claim 1 is equally applicable to claim 10. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien, further incorporating Koker and Nishi in video/camera technology. One would be motivated to do so in order to add a block-based encoder. Block-based encoding is well known to be used with neural networks with predictable results.

Regarding claim 11:

11. Laukien teaches the hybrid apparatus of claim 1, wherein: Laukien does not explicitly teach the side information comprises a per-frame reduced resolution reconstruction of a reduced-resolution base layer.

However, Nishi teaches the side information comprises a per-frame reduced resolution reconstruction of a reduced-resolution base layer. (Nishi [0515] since there is a demand for real-time viewing of content produced by individuals, which tends to be small in data size, the decoder first receives the base layer as the highest priority and performs decoding and reproduction, although this may differ depending on bandwidth. When the content is reproduced two or more times, such as when the decoder receives the enhancement layer during decoding and reproduction of the base layer and loops the reproduction, the decoder may reproduce a high image quality video including the enhancement layer. If the stream is encoded using such scalable encoding, the video may be low quality when in an unselected state or at the start of the video, but it can offer an experience in which the image quality of the stream progressively increases in an intelligent manner)

Regarding claim 12:

12. Laukien teaches the hybrid apparatus of claim 11, wherein: the neural network generates a high-resolution layer using the per-frame reduced resolution reconstruction. (Laukien [0146] a further example application is in frame-by-frame video prediction. [0170] performing video super-resolution, the input may be the low-

Regarding claim 13:

13. Laukien teaches the hybrid apparatus of claim 12, further comprising: a second encoder generating, using the source data, (Laukien FIG. 4) the side information for input to the first encoder, (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach wherein the second encoder comprises a block-based encoder; and reference frame buffers for storing full-resolution reference frames output from the neural network for use in predicting subsequent frames.

However, Nishi teaches wherein the second encoder comprises a block-based encoder; and (Nishi [0021] FIG. 9A is for illustrating deriving a motion vector of each sub-block based on motion vectors of neighboring blocks)
reference frame buffers (Nishi [0157] Frame memory 122 is storage for storing reference pictures used in inter prediction, and is also referred to as a frame buffer) for storing full-resolution reference frames output (Nishi FIG. 36. [0500] Note that there may be a plurality of individual streams that are of the same content but different quality.) from the neural network for use in predicting subsequent frames. (Nishi [0449] FIG. 34B is a flowchart illustrating processing operation of encoder 1a that includes processing circuitry 2a and memory 3a. [0450] Similarly to Embodiment 3, processing circuitry 2a first generates, using memory 3a, a predicted image of an input image that is a current image to be encoded, based on generated data output from a generator network that is a neural network in response to a reference image being input to the generator network (step S1a). Next, processing circuitry 2a calculates a prediction error by subtracting the predicted image from the input image (step S2a). Next, processing circuitry 2a generates an encoded image by at least transforming the prediction error (step S3a))

The motivation for combining Laukien and Koker as set forth in claim 1 is equally applicable to claim 13. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien, further incorporating Koker and Nishi in video/camera technology. One would be motivated to do so in order to add a block-based encoder. Block-based encoding is well known to be used with neural networks with predictable results.

Regarding claim 15:

15. Laukien teaches the method of claim 14, wherein generating the side information (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach comprises performing motion prediction using the source data to output a prediction signal.

However, Nishi teaches comprises performing motion prediction using the source data to output a prediction signal. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block)

Regarding claim 16:

16. Laukien teaches the method of claim 15. Laukien does not explicitly teach wherein performing motion prediction using the source data to output a prediction signal comprises using the first encoder for performing the motion prediction.

However, Nishi teaches wherein performing motion prediction using the source data to output a prediction signal comprises using the first encoder for performing the motion prediction. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block. [0178] the encoder determines whether the current block belongs to a region including complicated motion. The encoder sets the obmc_flag to a value of "1" when 

Regarding claim 17:

17. Laukien teaches the method of claim 14, further comprising: transforming the side information to a same resolution as the source data; and (Laukien [0170] performing video super-resolution, the input may be the low-resolution video stream, and the output may be the desired high-resolution video. [0173] activate Encoder ( ) and its kernels pass the hierarchy's inputs up from layer to layer. The input is first combined with its historical values to generate a derived input, and then converted into a stimulus which is the size and shape of the output encoding)
generating difference information comprising a difference between the source data and the transformed side information, (Laukien [0029] one or more of the first through N-th processing stages further includes a respective predictor coupled in series between the output of the respective encoder and the input of the respective decoder, the predictor of each i-th processing stage of the one or more processing stages also coupled in series between the output of the decoder in the (i+1)-th processing stage and the decoder of the i-th processing stage and configured to provide a corrective [difference] supplementation of the output of the respective encoder using feedback)

Laukien does not explicitly teach wherein providing the source data to the neural network comprises providing the difference information to the neural network.

However, Nishi teaches wherein providing the source data to the neural network comprises providing the difference information to the neural network. (Nishi [0416] as a result, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are trained according to back propagation. Through such training, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are constructed to reduce the difference between a reconstructed image and an input video)

Regarding claim 20:

20. Laukien teaches the hybrid apparatus of claim 19, wherein the neural network further comprises an expander layer that receives the guided information from the first encoder (Laukien [0074] in machine learning and neural networks, collections or arrays of scalar values may be referred to herein as "layers" of "units". When a component (encoder or decoder) has several layers of the same dimensions, each vector of scalar values in the same position in the several layers is referred to as a "cell.")
and transmits the guided information to the first hidden layer of the first decoder. (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size 

Laukien does not explicitly teach and increases an amount of data in the guided information.

However, Nishi teaches and increases an amount of data in the guided information. (Nishi [0515] If the stream is encoded using such scalable encoding, the video may be low quality when in an unselected state or at the start of the video, but it can offer an experience in which the image quality of the stream progressively increases in an intelligent manner)

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NASIM N NIRJHAR whose telephone number is (571)272-3792.  The examiner can normally be reached on Monday - Friday, 8 am to 5 pm ET.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christopher Kelley can be reached on (571)272-7331.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NASIM N NIRJHAR/Primary Examiner, Art Unit 2482