United States Patent and Trademark Office    
        
            
                                
            
        
    

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 16/516,784
Filing Date: 19 Jul 2019
Appellant(s): Mukherjee et al.



__________________
Michelle L. Knight
Attorney for Appellant
Registration No. 47711
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 12/31/20.
(1) Grounds of Rejection to be Reviewed on Appeal

Every ground of rejection set forth in the Office action dated 7/2/2020 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”

The following ground(s) of rejection are applicable to the appealed claims.


Claims 1-3, 5-7, 14 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Laukien (U.S. Pub. No. 20190294980 A1), in view of Koker (U.S. Pub. No. 20180307984 A1).

Regarding claim 1:

1. Laukien teaches a hybrid apparatus for coding a video stream, comprising:
a first encoder comprising a neural network having at least one hidden layer, (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m.sub.vis.sup.i.times.n.sub.vis.sup.i, and hidden (output-facing) layers have size m.sub.hid.sup.i.times.n.sub.hid.sup.i) 
wherein the neural network: receives source data from the video stream (Laukien [0170] In addition, the hierarchy may be configured to produce any desired form of at a first hidden layer of the at least one hidden layer; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m.sub.vis.sup.i.times.n.sub.vis.sup.i, and hidden (output-facing) layers have size m.sub.hid.sup.i.times.n.sub.hid.sup.i)
and generates guided information using the source data and the side information; (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Laukien [0137] FIG. 11 shows an arrangement of a typical embodiment of the Routed Predictive Hierarchy Network. In FIG. 11, one sees a series of layers, each composed of an encoder [1] and a neural net layer [41] with preferable connections, encoder modulation connection [42] between them that enables the neural net layer [41] to form a subnetwork chosen by the data coming from the encoder [1], which modulates its inputs via the feedback connection [4], and the encoder [1] to represent the upcoming data from the feedforward connection [3]. The hierarchy is fed a sensor and action input [7] and an 
and wherein the first encoder outputs the guided information and the side information (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal  [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]) for a decoder to reconstruct the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input. The reconstruction error E.sub.recon forms the input to generate new stimuli S, which are then used to generate activations A, and those are inhibited to generate the new spike pattern. Laukien [0186] Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback [side information] from higher layer decoder predictions. The decoder produces two predictions, one based on the feedback and the 

Laukien does not explicitly teach receives side information correlated with the source data at the first hidden layer.

However, Koker teaches receives side information correlated with the source data at the first hidden layer: (Koker [0169] FIG. 12 the RNN 1200 operates based on time-steps. The state of the RNN at a given time step is influenced based on the previous time step via the feedback mechanism 1205 [side information]. For a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. An initial input (x.sub.1) at a first time step can be processed by the hidden layer 1204. A second input (x.sub.2) can be processed by the hidden layer 1204 using state information [side information] that is determined during the processing of the initial input (x.sub.1) [source data])
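For illustration only, the time-step behavior quoted from Koker [0169] can be sketched as follows. This is a minimal sketch and is not taken from Koker: the function name `rnn_step`, the weight values, and the tanh activation are assumptions chosen to show that the hidden state for the second input depends on state carried over from the first input.

```python
import math

def rnn_step(x_t, h_prev, w_in, w_rec, bias):
    """One recurrent step: the new hidden state depends on the current
    input x_t and on the previous hidden state h_prev (the feedback path)."""
    h_new = []
    for j in range(len(bias)):
        total = bias[j]
        total += sum(w_in[j][i] * x_t[i] for i in range(len(x_t)))
        total += sum(w_rec[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))  # tanh is an assumed activation
    return h_new

# Hypothetical weights: 2 inputs, 2 hidden units.
W_IN = [[0.5, -0.2], [0.1, 0.3]]
W_REC = [[0.4, 0.0], [0.0, 0.4]]
BIAS = [0.0, 0.0]

h0 = [0.0, 0.0]
h1 = rnn_step([1.0, 0.0], h0, W_IN, W_REC, BIAS)  # initial input x_1
h2 = rnn_step([0.0, 1.0], h1, W_IN, W_REC, BIAS)  # x_2 processed with state from x_1
h2_no_state = rnn_step([0.0, 1.0], h0, W_IN, W_REC, BIAS)  # x_2 without that state
```

In this sketch `h2` differs from `h2_no_state`, which is the sense in which the state determined while processing x_1 acts as additional information when x_2 is processed.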

It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to modify Laukien by incorporating the teachings of Koker in video/camera technology. One would have been motivated to do so in order to receive side information correlated with the source data at the first hidden layer of the encoder, which would improve coding efficiency.

Regarding claim 2:

2. Laukien teaches the hybrid apparatus of claim 1, further comprising: a second encoder generating, using the source data, (Laukien [0140] FIG. 14 with the prediction errors of the decoder [2] via the decoder-encoder connection [6]) the side information for input to the first encoder. (Laukien [0140] FIG. 14 also using its feedback connection [4])

Regarding claim 3:

3. Laukien teaches the hybrid apparatus of claim 2, wherein: the second encoder includes a second decoder, and (Laukien FIG. 14) the side information comprises decoded source data from the second decoder. (Laukien [0140] FIG. 14 also using its feedback connection [4])

Regarding claim 5:

5. Laukien teaches the hybrid apparatus of claim 1, wherein: the first encoder includes a first decoder, and (Laukien FIG. 14) the neural network comprises multiple hidden layers, (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m.sub.vis.sup.i.times.n.sub.vis.sup.i, and hidden (output-facing) layers have size m.sub.hid.sup.i.times.n.sub.hid.sup.i) at least the first hidden layer of the multiple hidden layers forming the first encoder, and (Laukien FIG. 14) at least a second hidden layer of the multiple hidden layers forming the first decoder, (Laukien FIG. 14) and the first decoder receiving the guided information (Laukien [0140] FIG. 14 with the prediction errors of the decoder [2] via the decoder-encoder connection [6]) and the side information (Laukien [0140] FIG. 14 also using its feedback connection [4]) for reconstruction of the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step around the iterative solver, the previous spike pattern is used to generate a reconstruction of the input. The reconstruction error E.sub.recon forms the input to generate new stimuli S, which are then used to generate activations A, and those are inhibited to generate the new spike pattern)

Regarding claim 6:

6. Laukien teach the hybrid apparatus of claim 5, wherein: each hidden layer of the first encoder is structured to pass through (Laukien [0093] the delay encoder layer first computes the stimulus of the hidden cells [hidden layer] using a matrix multiplication, or simply multiplying each input by its corresponding weight. This step is similar to the operation of standard feed-forward [pass through] neural networks and autoencoders) the side information (Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw or with some preprocessing, with  such that a first layer of the first decoder receives the side information. (Laukien [0186] Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback [side information] from higher layer decoder predictions. The feedback input Z.sub.fb is either the higher layer decoder output or the current hidden state in the case of the top layer. The decoder produces two predictions, one based on the feedback and the other, lateral prediction, on the corresponding encoder hidden state. These predictions are combined to produce the decoder's output)

Regarding claim 7:

7. Laukien teach the hybrid apparatus of claim 1, wherein the first encoder includes a first decoder, the hybrid apparatus further comprising: a deterministic transform that transforms the side information (Laukien [0031] the decoder of each i-th processing stage, receiving, by an input of the decoder, the sequence of encoded values generated by the respective encoder, and generating, by the decoder, a sequence of predictions of each next input value that will be received by the input of the respective encoder, the decoder of each ith processing stage except the first processing stage providing the respective predictions as feedback to the decoder of the (i-1)-th before providing the side information to the first encoder and the first decoder. (Laukien [0071] Conversely, a "decoder" is a piece of componentry which transforms a frame of data in the language or form of an encoding back into a form similar to that expected as input to an encoder. For example, an encoder for video might take frames of pixel color values and transform them into a compressed form for storage and transmission, and a decoder might later take such a compressed file and decode it into a form suitable for display)

Regarding claim 14:

14. Laukien teaches a method for coding a video stream, comprising: providing source data from the video stream (Laukien [0170] the hierarchy may be configured to produce any desired form of output simply by specifying the size of the output of the bottom decoder. For example, in an embodiment performing video super-resolution, the input [source data] may be the low-resolution video stream, and the output may be the desired high-resolution video) to a first encoder including a neural network; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m.sub.vis.sup.i.times.n.sub.vis.sup.i, and hidden (output-facing) layers have size m.sub.hid.sup.i.times.n.sub.hid.sup.i)
generating, using the source data, side information; (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of 
inputting the side information to the neural network for encoding the source data; and transmitting the encoded source data and the side information from the first encoder to a decoder or to storage, Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal to combine with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side 

Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw [without modification] or with some preprocessing, with the feedforward stimulus as drawn from the weight [side information, because feedback given as weight] matrix. Laukien does not explicitly teach wherein the side information is transmitted without modification by the neural network.

However, Koker teaches wherein the side information is transmitted without modification by the neural network. (Koker [0162] FIG. 11A-B neurons in a fully connected layer have full connections to all activations [without modification] in the previous layer, as previously described for a feedforward network. The output from the fully connected layers 1108 can be used to generate an output result from the network. The activations within the fully connected layers 1108 can be computed using matrix multiplication instead of convolution)

Regarding claim 18:

18. Laukien teach the method of claim 14, wherein the first encoder includes a first decoder, (Laukien FIG. 14) the neural network comprises a plurality of hidden layers, and (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size the first encoder passes the side information (Laukien Fig. 14 [0141] feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal. Feedback signal of Fig 14 is side information and feedback [side information] predictor [47] transmits side information from encoder to decoder) through at least one hidden layer to only a first hidden layer of the first decoder. (Laukien [0186] Decoding information passes down the hierarchy, as each layer generates predictions by decoding its encoder's output, combined with feedback from higher layer decoder predictions. The feedback input Z.sub.fb is either the higher layer decoder output or the current hidden state in the case of the top layer. The decoder produces two predictions, one based on the feedback and the other, lateral prediction, on the corresponding encoder hidden state. These predictions are combined to produce the decoder's output)

Regarding claim 19:

19. Laukien teach a hybrid apparatus for coding a video stream, comprising: a first encoder and a first decoder comprising a neural network having a plurality of hidden layers, (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have  wherein the neural network: (Laukien [0137] FIG. 11 shows an arrangement of a typical embodiment of the Routed Predictive Hierarchy Network. In FIG. 11, one sees a series of layers, each composed of an encoder [1] and a neural net layer [41] with preferable connections, encoder modulation connection [42] between them that enables the neural net layer [41] to form a subnetwork chosen by the data coming from the encoder [1], which modulates its inputs via the feedback connection [4], and the encoder [1] to represent the upcoming data from the feedforward connection [3])
receives source data from the video stream (Laukien [0170] the hierarchy may be configured to produce any desired form of output simply by specifying the size of the output of the bottom decoder. For example, in an embodiment performing video super-resolution, the input [source data] may be the low-resolution video stream, and the output may be the desired high-resolution video) at a first hidden layer of the encoder; (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size m.sub.vis.sup.i.times.n.sub.vis.sup.i, and hidden (output-facing) layers have size m.sub.hid.sup.i.times.n.sub.hid.sup.i)
generates guided information using the source data and the side information; and (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the 
receives the guided information and the side information (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]) at a first hidden layer of the first decoder for reconstruction of the source data. (Laukien [0231] the result of the iterative solving is a sparse code which 

Laukien does not explicitly teach receives side information correlated with the source data at the first hidden layer of the encoder.

However Koker teach receives side information correlated with the source data at the first hidden layer (Koker [0169] FIG. 12 the RNN 1200 operates based on time-steps. The state of the RNN at a given time step is influenced based on the previous time step via the feedback mechanism 1205 [side information]. For a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. An initial input (x.sub.1) at a first time step can be processed by the hidden layer 1204. A second input (x.sub.2) can be processed by the hidden layer 1204 using state information [side information] that is determined during the processing of the initial input (x.sub.1) [source data]) of the encoder; (Koker Fig. 11A [RGB components] and Fig. 14 shows feedback/side information is also applicable for encoder/decoder training because Koker [0188] During operation, the media processor 1502 and vision 

Claims 4, 8-13, 15-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Laukien (U.S. Pub. No. 20190294980 A1), in view of Koker (U.S. Pub. No. 20180307984 A1), further in view of Nishi (U.S. Pub. No. 20200059669 A1).

Regarding claim 4:

4. Laukien teach the hybrid apparatus of claim 1, wherein: the first encoder includes a first decoder that reconstructs the source data to form reconstructed source data, and (Laukien FIG. 14 [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its 

Laukien does not explicitly teach the neural network is trained to minimize a rate-distortion value between the source data and the reconstructed source data.

However, Nishi teaches the neural network is trained (Nishi [0335] the discriminator network being a neural network and constituting a generative adversarial network (GAN) with the generator network. [0336] accordingly, generated data for generating a predicted image more similar to an input image can be obtained from the generator network by being trained by the GAN through machine learning. As a result, encoding efficiency can be further improved) to minimize a rate-distortion value between the source data and the reconstructed source data. (Nishi [0399] Note that the networks may further receive additional input data. For example, the input data may be signals for notifying candidates of a prediction mode or a quantization step size for rate-distortion (RD) optimization, for instance)

The motivation for combining Laukien and Koker as set forth in claim 1 is equally applicable to claim 4. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien, further incorporating Koker and Nishi in video/camera technology. One would be motivated to do so, to incorporate neural network is trained to minimize a rate-distortion value 

Regarding claim 8:

8. Laukien teaches the hybrid apparatus of claim 1, wherein: the side information (Laukien [0140] FIG. 14 also using its feedback connection [4]) comprises a full resolution prediction signal (Laukien [0147] have the bottom decoder use a size equivalent to the full resolution video (m.times.n). The output decoding y of the bottom layer is then compared with the high-resolution training input x in order to generate the prediction errors for learning)

Laukien does not explicitly teach prediction signal generated using motion prediction.

However, Nishi teaches prediction signal generated using motion prediction. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block.)
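For illustration only, the motion compensation quoted from Nishi [0171] can be sketched as follows. This is a minimal sketch and is not Nishi's implementation: the function name `motion_compensate`, the integer-pixel motion vector, and the sample frame are assumptions, and no bounds checking or sub-pixel interpolation is attempted.

```python
def motion_compensate(reference, top, left, size, mv):
    """Return the size x size prediction block for the current block at
    (top, left), copied from the reference frame at the position the
    motion vector mv = (dy, dx) points to. Assumes the displaced block
    lies entirely inside the reference frame."""
    dy, dx = mv
    return [
        reference[top + dy + r][left + dx:left + dx + size]
        for r in range(size)
    ]

# Hypothetical 4x4 reference frame with distinct sample values.
reference_frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

# Current block at the top-left corner, motion vector one row down, two columns right.
pred = motion_compensate(reference_frame, top=0, left=0, size=2, mv=(1, 2))
```

Here `pred` plays the role of the prediction image (Pred) for the current block, obtained from the motion vector (MV) assigned to that block.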

Regarding claim 9:

9. Laukien teaches the hybrid apparatus of claim 8, wherein: Laukien does not explicitly teach the neural network is trained to select a transform for a block residual within the full resolution prediction signal to minimize a rate-distortion value.

However, Nishi teaches the neural network is trained to select a transform for a block residual (Nishi [0416] as a result, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are trained according to back propagation. Through such training, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are constructed to reduce the difference between a reconstructed image and an input video) within the full resolution prediction signal to minimize a rate-distortion value. (Nishi [0399] Note that the networks may further receive additional input data. For example, the input data may be signals for notifying candidates of a prediction mode or a quantization step size for rate-distortion (RD) optimization, for instance)

Regarding claim 10:

10. Laukien teaches the hybrid apparatus of claim 1, further comprising: a second encoder generating, using the source data, (Laukien FIG. 4) the side information for input to the first encoder, (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach wherein the second encoder comprises a block-based encoder.

However, Nishi teaches wherein the second encoder comprises a block-based encoder. (Nishi [0021] FIG. 9A is for illustrating deriving a motion vector of each sub-block based on motion vectors of neighboring blocks)

Regarding claim 11:

11. Laukien teaches the hybrid apparatus of claim 1, wherein: Laukien does not explicitly teach the side information comprises a per-frame reduced resolution reconstruction of a reduced-resolution base layer.

However, Nishi teaches the side information comprises a per-frame reduced resolution reconstruction of a reduced-resolution base layer. (Nishi [0515] since there is a demand for real-time viewing of content produced by individuals, which tends to be small in data size, the decoder first receives the base layer as the highest priority and performs decoding and reproduction, although this may differ depending on bandwidth. When the content is reproduced two or more times, such as when the decoder receives the enhancement layer during decoding and reproduction of the base layer and loops the reproduction, the decoder may reproduce a high image quality video including the enhancement layer. If the stream is encoded using such scalable encoding, the video may be low quality when in an unselected state or at the start of the video, but it can offer an experience in which the image quality of the stream progressively increases in an intelligent manner)

Regarding claim 12:

12. Laukien teaches the hybrid apparatus of claim 11, wherein: the neural network generates a high-resolution layer using the per-frame reduced resolution reconstruction. (Laukien [0146] a further example application is in frame-by-frame video prediction. [0170] performing video super-resolution, the input may be the low-resolution video stream, and the output may be the desired high-resolution video. [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization.)

Regarding claim 13:

13. Laukien teaches the hybrid apparatus of claim 12, further comprising: a second encoder generating, using the source data, (Laukien FIG. 4) the side information for input to the first encoder, (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach wherein the second encoder comprises a block-based encoder; and reference frame buffers for storing full-resolution reference frames output from the neural network for use in predicting subsequent frames.

However, Nishi teaches wherein the second encoder comprises a block-based encoder; and (Nishi [0021] FIG. 9A is for illustrating deriving a motion vector of each sub-block based on motion vectors of neighboring blocks)
reference frame buffers (Nishi [0157] Frame memory 122 is storage for storing reference pictures used in inter prediction, and is also referred to as a frame buffer) for storing full-resolution reference frames output (Nishi FIG. 36. [0500] Note that there may be a plurality of individual streams that are of the same content but different quality. In other words, by determining which layer to decode up to based on internal factors, such as the processing ability on the decoder side, and external factors, such as communication bandwidth, the decoder side can freely switch between low resolution content and high resolution content while decoding) from the neural network for use in predicting subsequent frames. (Nishi [0449] FIG. 34B is a flowchart illustrating processing operation of encoder 1a that includes processing circuitry 2a and memory 3a. [0450] Similarly to Embodiment 3, processing circuitry 2a first generates, using memory 3a, a predicted image of an input image that is a current image to be encoded, based on generated data output from a generator network that is a neural network in response to a reference image being input to the generator network (step S1a). Next, processing circuitry 2a calculates a prediction error by subtracting the predicted image from the input image (step S2a). Next, processing circuitry 2a generates an encoded image by at least transforming the prediction error (step S3a))
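For illustration only, steps S1a through S3a quoted from Nishi [0450] can be sketched as follows. This is a minimal sketch under stated assumptions, not Nishi's implementation: `predict_from_reference` is a stand-in for the generator network that simply copies the reference image, and `transform` is a stand-in that applies pairwise sums and differences along each row (rows are assumed to have even length).

```python
def predict_from_reference(reference):
    """Step S1a stand-in (assumption): the generator network is replaced
    by a copy of the reference image used as the predicted image."""
    return [row[:] for row in reference]

def prediction_error(input_image, predicted):
    """Step S2a: subtract the predicted image from the input image."""
    return [
        [a - b for a, b in zip(row_in, row_pred)]
        for row_in, row_pred in zip(input_image, predicted)
    ]

def transform(error):
    """Step S3a stand-in (assumption): pairwise sums followed by pairwise
    differences along each row. Rows must have even length."""
    out = []
    for row in error:
        sums = [row[i] + row[i + 1] for i in range(0, len(row), 2)]
        diffs = [row[i] - row[i + 1] for i in range(0, len(row), 2)]
        out.append(sums + diffs)
    return out

# Hypothetical 2x4 reference image and current input image.
reference = [[10, 10, 12, 12], [10, 10, 12, 12]]
current = [[11, 10, 12, 13], [10, 10, 12, 12]]

predicted = predict_from_reference(reference)  # step S1a
error = prediction_error(current, predicted)   # step S2a
encoded = transform(error)                     # step S3a
```

The sketch only mirrors the order of operations in the quoted passage: prediction from a reference image, subtraction to obtain the prediction error, and a transform of that error.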

Regarding claim 15:

15. Laukien teaches the method of claim 14, wherein generating the side information (Laukien [0140] FIG. 14 also using its feedback connection [4])

Laukien does not explicitly teach comprises performing motion prediction using the source data to output a prediction signal.

However, Nishi teaches comprises performing motion prediction using the source data to output a prediction signal. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block)

Regarding claim 16:

16. Laukien teaches the method of claim 15. Laukien does not explicitly teach wherein performing motion prediction using the source data to output a prediction signal comprises using the first encoder for performing the motion prediction.

However Nishi teach wherein performing motion prediction using the source data to output a prediction signal comprises using the first encoder for performing the motion prediction. (Nishi [0171] First, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the current block. [0178] the encoder determines whether the current block belongs to a region including complicated motion. The encoder sets the obmc_flag to a value of "1" when 

Regarding claim 17:

17. Laukien teaches the method of claim 14, further comprising: transforming the side information to a same resolution as the source data; and (Laukien [0170] performing video super-resolution, the input may be the low-resolution video stream, and the output may be the desired high-resolution video. [0173] activate Encoder ( ) and its kernels pass the hierarchy's inputs up from layer to layer. The input is first combined with its historical values to generate a derived input, and then converted into a stimulus which is the size and shape of the output encoding)
generating difference information comprising a difference between the source data and the transformed side information, (Laukien [0029] one or more of the first through N-th processing stages further includes a respective predictor coupled in series between the output of the respective encoder and the input of the respective decoder, the predictor of each i-th processing stage of the one or more processing stages also coupled in series between the output of the decoder in the (i+1)-th processing stage and the decoder of the i-th processing stage and configured to provide a corrective [difference] supplementation of the output of the respective encoder using feedback)

Laukien does not explicitly teach wherein providing the source data to the neural network comprises providing the difference information to the neural network.

However, Nishi teaches wherein providing the source data to the neural network comprises providing the difference information to the neural network. (Nishi [0416] as a result, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are trained according to back propagation. Through such training, encoder part 161 of the auto encoder and decoder part 162 of the auto encoder are constructed to reduce the difference between a reconstructed image and an input video)

Regarding claim 20:

20. Laukien teaches the hybrid apparatus of claim 19, wherein the neural network further comprises an expander layer that receives the guided information from the first encoder (Laukien [0074] in machine learning and neural networks, collections or arrays of scalar values may be referred to herein as "layers" of "units". When a component (encoder or decoder) has several layers of the same dimensions, each vector of scalar values in the same position in the several layers is referred to as a "cell.")
and transmits the guided information to the first hidden layer of the first decoder. (Laukien [0052] Fig. 3-4 [0155] Each sparse predictor contains a collection of visible and hidden layers indexed by i. Visible (input-facing) layers have size 

Laukien does not explicitly teach the limitation: and increases an amount of data in the guided information.

However, Nishi teaches: and increases an amount of data in the guided information. (Nishi [0515] If the stream is encoded using such scalable encoding, the video may be low quality when in an unselected state or at the start of the video, but it can offer an experience in which the image quality of the stream progressively increases in an intelligent manner)

(2) Response to Argument

In essence the Appellants argue the following points, and each point is addressed individually by the examiner.

Appellant argued in page 5:

	Appellant argued on page 5 that the prior art does not teach claim 1.

Office respectfully disagrees for the following reason:

Examiner disagrees, because the interpretation of claim 1 is depicted in the following diagram:


[Figure: media_image2.png, with textbox label "Neural Network"]


	In the above diagram, the video encoder input (e.g., video content which is to be encoded) is source data. After encoding is done, the encoder output is guided information. Neural network feedback is guided information. It is well known in the art that a feedforward connection passes through a block without being modified.

Summary of rejection:

Laukien Fig. 14 and [0169-0171] teach low-resolution video stream as input is getting converted by neural network to a high-resolution video using encoder, decoder and part of that process low-resolution video stream comes as source data as input to a first hidden layer. Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information is generated when feedback is combined with source input video data] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment [augment involves providing original source and feedback keeping in original form without modification and it uses concatenation process described in Laukien [0090]] the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. As such in Fig 14 shows item 5-> item 47-> item 48-> item 2-> item 6 this loop also provides feedback which can also be considered as side information. In Laukien Fig. 14 encoder is providing feedback to decoder and decoder is providing feedback to encoder in loop. Also lower layer encoder also provide feedback to upper layer encoder. 

Please note that Laukien [0077] [0105] [0106] Fig. 3, Fig. 11 and Fig. 14 teach a feedforward connection [3], which is the same feedback signal [same side information at different layers] available at different layers of encoders and decoders:

[Figure: media_image3.png]


	
As such, Laukien teaches a block-level representation of every component claimed by appellant. Koker’s [0169] FIG. 12 disclosure describes a circuit-level implementation of video encoding/decoding involving a neural network, as well as every element claimed, as discussed below. Koker [0162] FIG. 11A-B neurons in a fully connected layer have full connections to all activations [without modification] in the previous layer, as previously described for a feedforward network. The output from the fully connected layers 1108 can be used to generate an output result from the network. The activations within the fully connected layers 1108 can be computed using matrix multiplication instead of convolution. Fully connected layers will achieve feedforward functionality. As such, Koker’s fully connected layers also transfer side information without any modification. It would be obvious to combine the references, and the combined teaching would yield predictable results.
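
For illustration only, the matrix-multiplication computation described in the Koker [0162] passage quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Koker's implementation: the function name, weights, biases and input values are hypothetical and appear in neither reference.

```python
# Illustrative sketch only. All names and numeric values are hypothetical
# and are not taken from Laukien or Koker.

def fully_connected(activations, weights, biases):
    """Compute one fully connected layer.

    Each output unit is connected to every activation of the previous
    layer, so the layer is a matrix-vector product plus a bias term.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        # Weighted sum over ALL activations of the previous layer.
        outputs.append(sum(w * a for w, a in zip(row, activations)) + bias)
    return outputs


previous_layer = [1.0, 2.0, 3.0]      # activations of the previous layer
weights = [[1.0, 0.0, 0.0],           # one row of weights per output unit
           [0.5, 0.5, 0.5]]
biases = [0.0, 1.0]

layer_output = fully_connected(previous_layer, weights, biases)
```

In this sketch each row of `weights` spans every entry of `previous_layer`, which is the "full connections to all activations" property named in the quoted passage.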
 
Details of rejection:

Rejection of claim 1 is explained below with following claim mapping which includes “Sub Heading” of Argument 1, Argument 2 … etc. to correlate later in the document for addressing separate appellant arguments.

1. Laukien teach a hybrid apparatus for coding a video stream, comprising: (Laukien [0070-0071] teach video encoding and decoding apparatus)

Argument 1:

a first encoder (Laukien Fig. 11 encoder 2) comprising a neural network (Laukien [0073] several terms appear in the following which are used conventionally in machine learning and neural networks, for example “hidden” and “visible” layers and units, “feedforward”, “feedback” and “recurrent”, “activation”, “weight” etc. These terms will be recognised by those skilled in the art to have their usual meanings. [0074] as found conventionally in machine learning and neural networks, collections or arrays of scalar values may be referred to herein as “layers” of “units”. When a component (encoder or decoder) has several layers of the same dimensions, each vector of scalar values in the same position in the several layers is referred to as a “cell.”) having at least one hidden layer, (Laukien [0092] an innovation of the invention, the “Delay Encoder”, is part of the componentry in the apparatus in some embodiments. Each delay encoder comprises an autoencoder-like two-layer architecture, containing a “visible” 

So Laukien clearly teach a first encoder comprising a neural network having at least one hidden layer. However “neural network having at least one hidden layer” visually depicted in the secondary reference Koker [0169] FIG. 12 as well, Koker [0169] the illustrated RNN 1200 can be described has having an input layer 1202 that receives an input vector, hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 to enable a ‘memory’ of previous states, and an output layer 1206 to output a result. Koker shows “a first encoder” in Koker [0084] the graphics processing engines 431, 432, N may comprise different types of graphics processing engines within a GPU such as graphics execution units, media processing engines (e.g., video encoders/decoders)
wherein the neural network: (Laukien [0074] as found conventionally in machine learning and neural networks, collections or arrays of scalar values may be referred to herein as “layers” of “units”). 
receives source data from the video stream (Laukien [0169-0171] teach low-resolution video stream as input is getting converted by neural network to a high-resolution video and part of that process low-resolution video stream comes as source data as input to a first hidden layer, because Laukien [0171] passes input into the hierarchy, and run the up-pass for each layer ...  each layer, the encoder component processes the input. As such source data [input] is available at hidden layer. Laukien Fig. 4 also shows Receive Input (t) as source data input to hidden layers. Encoder1, at a first hidden layer of the at least one hidden layer; (Laukien [0092])

So Laukien clearly teach receives source data from the video stream at a first hidden layer of the at least one hidden layer; 

However “receives source data from the video stream at a first hidden layer of the at least one hidden layer” is also visually depicted in the secondary reference Koker [0169] FIG. 12 as well.


[Figure: media_image4.png]



Koker [0188] the vision processor 1504 can accelerate convolution operations for a CNN that is used to perform image recognition on the high-resolution video data [video stream].

 Laukien [0029] one or more of the first through N-th processing stages further includes a respective predictor coupled in series between the output of the respective encoder and the input of the respective decoder, the predictor of each i-th processing stage of the one or more processing stages also coupled in series between the output of the decoder in the (i+1)-th processing stage and the decoder of the i-th processing stage and configured to provide a corrective supplementation of the output of the respective encoder using feedback. Feedback [side information] is generated from and provided to encoder and decoder of every layer. Laukien Fig. 11 shows N layer/stage of neural network. “receives side information correlated with the source data at the first hidden layer;” is obvious from Laukien [0029] disclosure, side information is feedback signal and feedback are always correlated with source data. It is obvious to have feedback [side information] source data received at the first hidden layer to be correlated. 

However to show explicit evidence Koker has been cited to show receives side information correlated with the source data at the first hidden layer.

receives side information correlated with the source data (Koker [0169] FIG. 12 the RNN 1200 operates based on timesteps. The state of the RNN at a given time step is influenced [correlated] based on the previous time step via the feedback [correlated] mechanism 1205 [side information]) at the first hidden layer; (Koker [0169] FIG. 12 for a given time step, the state of the hidden layers 1204 is defined by the previous state and the input at the current time step. It’s well known in the art the feedback is correlated with the source data)
generates guided information using the source data and the side information; and (

[Figure: media_image5.png]

[Figure: media_image6.png]


Laukien Fig. 3, Fig. 4, Fig. 11, Fig. 14, Fig. 16 and [0169-0171] teach that a low-resolution video stream input is converted by a neural network to a high-resolution video using encoders and decoders available at different layers of the neural network, and as part of that process the low-resolution video stream comes in as source data input to a first hidden layer, shown in Fig. 3 as item 54 “Receive Input (t)”. Laukien Fig. 3, Fig. 4, Fig. 11, Fig. 14 and Fig. 16 represent the same system, because all of these figures have the same inputs and output, for example 7 and 9 as inputs and 8 as output, as supported by Laukien [0126] the ladder is fed a sensor and action input [7] and an optional top-down input [9], and produces a sensor prediction and chosen action output [8]. However, different levels of internal detail are shown in the different figures. So, while the low-resolution video stream input is being converted by different layers of the neural network to a high-resolution video, the encoders take feedback from previous layers of encoders and from the decoder of the same layer to improve the conversion system. So the high-resolution video output from an individual encoder layer is guided information generated from the source low-resolution video stream and the feedback. This is supported in Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Also Laukien [0137] FIG. 11, one sees a series of layers, each composed of an encoder [1] and a neural net layer [41] with preferable connections, encoder modulation connection [42] between them that enables the neural net layer [41] to form a subnetwork chosen by the data coming from the encoder [1], which modulates [signal after modulation becomes guided information] its inputs [source data] via the feedback connection [4], and the encoder [1] to represent the upcoming data from the feedforward connection [3] [side information]. The hierarchy is fed a sensor and action input [7] [source data] and an optional top-down input [9] [side information], and produces a sensor prediction and chosen action output [8] [guided information])

So Laukien clearly teach generates guided information using the source data and the side information. 

However “generates guided information using the source data and the side information” is also visually depicted in the secondary reference Koker [0169] FIG. 12 as well.

[Figure: media_image7.png, with textbox labels (a), (b) and (c)]


Koker [0169] teach the illustrated RNN 1200 can be described has having an input layer 1202 that receives an input vector [source data], hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 [side information] to enable a ‘memory’ of previous states, and an output layer 1206 to output a result [guided information]. In Koker FIG. 12 above, connection between x1 and a is source data because 1202 is input vector of the system. So source data 1202 is directly getting received at hidden layer 1204. Also Koker’s Fig. 12 neural network is related to video stream, because Koker [0188] the vision processor 1504 can accelerate convolution operations for a CNN that is used to perform image recognition on the high-resolution video data [video stream].
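
For illustration only, the recurrent structure described in the Koker [0169] passage quoted above (an input, a hidden state that depends on the previous hidden state through feedback, and an output) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Koker's implementation: the scalar weights `w_in`, `w_rec` and `w_out`, the tanh activation, and the function names are all hypothetical.

```python
import math

# Illustrative sketch only. The scalar weights, the tanh activation and the
# function names are hypothetical and are not taken from Koker or Laukien.


def rnn_step(x_t, h_prev, w_in, w_rec, w_out):
    """Run one time step of a single-unit recurrent network.

    The hidden state depends on the current input x_t and, through the
    feedback weight w_rec, on the hidden state of the previous time step.
    """
    h_t = math.tanh(w_in * x_t + w_rec * h_prev)
    y_t = w_out * h_t
    return h_t, y_t


def run_rnn(inputs, w_in=0.5, w_rec=0.8, w_out=1.0):
    """Feed a sequence through the network, carrying the hidden state."""
    h = 0.0  # initial hidden state: no memory yet
    outputs = []
    for x in inputs:
        h, y = rnn_step(x, h, w_in, w_rec, w_out)
        outputs.append(y)
    return outputs


with_feedback = run_rnn([1.0, 0.0])
without_feedback = run_rnn([1.0, 0.0], w_rec=0.0)
```

With a nonzero `w_rec`, the second output is nonzero even though the second input is zero, because the earlier input persists through the fed-back hidden state. With `w_rec=0.0` that memory is absent.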

Argument 2:

outputs the guided information and the side information (Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [output the side information] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. Please note [0075] [0077] [0105] [0106] Fig. 3, Fig. 11 and Fig. 14 teach  feedforward connection [3] which is a same feedback signal [same side information] available at different layers of encodes and decoders, because The processing stages form autoencoder-like “layers,” in which feedforward signals are fed up the hierarchy in a chain of encoder modules, feedback and predictive signals are passed down the hierarchy via a chain of decoders, and recurrent connections across and between layers allow for learning and prediction assistance at both the local and network levels. This is how the same side information is available at the decoder) for a decoder to reconstruct the source data. (Laukien [0231] the result of the iterative solving is a sparse code which represents a number of steps towards the minimization of reconstruction errors of the encoder. [0232] Learning in the BISTA encoder is performed using a form of reconstruction error minimization. [0225] The BISTA encoder activate ( ) and its kernels pass the hierarchy's inputs up from layer to layer. Each step 

So Laukien clearly teach outputs the guided information and the side information for a decoder to reconstruct the source data. 

However “outputs the guided information and the side information for a decoder to reconstruct the source data.” is also visually depicted in the secondary reference Koker [0169] FIG. 12 as well.

[Figure: media_image4.png]




Argument 3:

Koker teaches receiving side information correlated with the source data at the first hidden layer. Koker Fig. 12 shows outputting the guided information and the side information to element y. Koker teaches each and every limitation of claim 1, except that it does not show that element y is a decoder. It would be obvious for y to be a decoder because Koker [0188] shows neural network interaction with a decoder. So claim 1 is obvious over Laukien and Koker individually as well as combined.

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien, further incorporating Koker in video/camera technology. One would be motivated to do so in order to incorporate receiving side information correlated with the source data at the first hidden layer of the encoder. This functionality would improve efficiency with predictable results.

Appellant argued in page 5:



	Office respectfully disagrees for the following reason:

Examiner disagrees, because Examiner explained the claim mapping of claim 1 on pages 28 through 44 of this examiner answer document.

Appellant argued in page 7:

	Appellant argued in page 7 that there is no identification of what the Examiner believes corresponds to the source data, the neural network of the first encoder, or the first hidden layer of the neural network of the first encoder in claim 1.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because Examiner has addressed this argument in page 32 through to page 36 of the examiner answer document under the heading Argument 1:

Appellant argued in page 7:


neural network of an encoder in Laukien has "a first hidden layer" that both receives source data (in the Examiner's example, the low-resolution video stream) and the side information.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because Laukien Fig. 11 shows hidden layers having encoders. Laukien Fig. 4 is a different representation of the same system, as both Fig. 11 and Fig. 4 have the same inputs (7, 9) and output (8). Laukien Fig. 4 shows hidden layer 1 receiving source data as “Receive Input (t)”. Encoders 1, 2, ... N of Fig. 4 are the encoders 1, 2, ... N of the N hidden layers represented in Fig. 11. Receiving feedback as side information has been explained on pages 28-31 of this examiner answer document under the heading “Summary of rejection:”

Appellant argued in page 7-8:

	Appellant argued on pages 7-8 that the prior art does not teach that the neural network of the first encoder 1 in FIG. 14 of Laukien could both "receive source data from the video stream" and "receive side information" at "a first hidden layer of the at least one hidden layer", such that the neural network "generates guided information using the source data and the side information; and outputs the guided information and the side information for a decoder to reconstruct the source data."

	Office respectfully disagrees for the following reason:

Examiner disagrees, because this is essentially claim 1, and Examiner explained the claim mapping of claim 1 on pages 28 through 44 of this examiner answer document.


Appellant argued in page 8:

	Appellant argued on page 8 that the data supplied by the "feedback connection [4]" is a prediction of an encoder input (and is not an encoder input), and, therefore, cannot be the "side information" of claim 1 at least because the "feedback connection [4]" (as a prediction) is not "side information."

	Office respectfully disagrees for the following reason:

Examiner disagrees, because claim 1 does not require the side information to be “an encoder input”. However, it is taught, because Laukien [0099] this decoder solely provides predictions from the encoder hidden state, and can feed back in to the encoder (as when using the previously described STDP-based learning rule for the encoder). So encoder output can come back as encoder input, and connection 4 of Laukien Fig. 14 carries signals from the encoder as well. Laukien [0029] one or more of the first through N-th processing stages further includes a respective predictor coupled in 
FIG. 11, one sees a series of layers, each composed of an encoder [1] and a neural net layer [41] with preferable connections, encoder modulation connection [42] between them that enables the neural net layer [41] to form a subnetwork chosen by the data coming from the encoder [1], which modulates its inputs via the feedback connection [4], and the encoder [1] to represent the upcoming data from the feedforward connection [3]. Laukien [0029] and Laukien [0137], [0169-0171] and FIG. 11 teach low-resolution video stream as input is getting converted by neural network to a high-resolution video using encoder and decoders available at different layers of neural network and each encoder and decoder of each layer provide feedback to each other. This feedback is side information. FIG. 11 encoder modulation connection [42] which is connector between feedback connection [4] and encoder is an input to the encoder because feedback connection [4] is generated from FF net layers.

Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information is generated when feedback is 

Please note Laukien [0075] [0077] [0105] [0106] Fig. 3, Fig. 11 and Fig. 14 teach  feedforward connection [3] which is a same feedback signal [same side information] available at different layers of encodes and decoders, because The processing stages form autoencoder-like “layers,” in which feedforward signals are fed up the hierarchy in a chain of encoder modules, feedback and predictive signals are passed down the hierarchy via a chain of decoders, and recurrent connections across and between layers allow for learning and prediction assistance at both the local and network levels. This is how the same side information [feedback] is available at the storage/decoders without modification. Encoders and decoders of Laukien’s Fig. 14 neural networks are providing feedback [side information] to each other. Each encoder provides feedback to and from a decoder of same layer as well as other layers because Laukien [0089] [0090] teach weight matrix are provided as feedforward stimulus. Please see Laukien Fig. 14 feedforward connection [3] travels from one layer to second layer 

Appellant argued in page 9:

	Appellant argued on page 9 that the Examiner fails to explain how the person skilled in the art could conclude that the source data, which "may be the low-resolution video stream" according to the Examiner, can both be received "at a first hidden layer of the neural network" and be received at the time division feedback predictor [47] so the time division feedback predictor [47] can "generate[] guided information using the source data and the side information" as recited in claim 1.

	Office respectfully disagrees for the following reason:

Examiner disagrees, because Examiner explained the claim mapping of claim 1 on pages 28 through 44 of this examiner answer document.

Appellant argued in page 10:

	Appellant argued in page 10 that no guided information is identified in Koker, so it is left to the Appellant to guess as to how the Examiner is combining the completely


	Office respectfully disagrees for the following reason:

Examiner disagrees, because examiner explained guided information as well as other arguments above on pages 40 through 43 of this examiner answer document under the heading Argument 2:

Appellant argued in page 11:

	Appellant argued in page 11 that Appellant further notes that the Examiner fails to state a proper motivation for the combination of Laukien and Koker.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because examiner explained motivation above in page 43 through to page 44 of this examiner answer document under the heading Argument 3:

Appellant argued in page 11:



	Office respectfully disagrees for the following reason:

Examiner disagrees, because Laukien and Koker both teach encoding using a neural network, and the combined teaching would make the encoder system efficient with predictable results. A better encoder involves efficient coding, i.e., coding efficiency.

Appellant argued in page 11:

	Appellant argued in page 11 the motivation for combination is clearly based on hindsight, as opposed to the references and the reasoning of a person skilled in the art.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because in response to appellant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning.  But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the appellant's disclosure, such a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).

Appellant argued in page 11:

	Appellant argued on page 11 that the Examiner further asserts that the "combined teaching of Laukien and Koker meets all claimed limitations of every claim with predictable results." Appellant submits that there is no support for this position. Given the complexity of the technology here, the Examiner cannot make this assertion without a showing.

	Office respectfully disagrees for the following reason:

Examiner disagrees, because the complexity of the technology is not reflected in the claim language. The interpretation of claim 1 is depicted in the diagram shown on page 28 of this examiner answer document, which is extremely broad and well known in the art, and it would be obvious to combine with predictable results. Just as an example, 5 resistors, 5 capacitors and 5 inductors of an electronic circuit can have hundreds of combinations of connections. Not all combinations are patentable; rather, they are obvious connection choices. Similarly, Laukien and Koker teach every component claimed, and the connectivity of the different elements is an obvious choice.

Koker teaches receiving side information correlated with the source data at the first hidden layer. Koker Fig. 12 shows outputting the guided information and the side information to element y. Koker teaches each and every limitation of claim 1, except that it does not show that element y is a decoder. It would be obvious for y to be a decoder because Koker [0188] shows neural network interaction with a decoder. So claim 1 is obvious over Laukien and Koker individually as well as combined.

Appellant argument in page 12-14:

	Appellant's arguments presented on pages 12-14 are not related to the claim language. 

	Office respectfully disagrees for the following reason:

Examiner notes that the argument above has not been claimed and appellant arguments are not commensurate with claim language. Examiner has addressed rejection of claim 1 in page 28-44 of this examiner answer document. Examiner has 

Appellant argued in page 14:

	Appellant argued on page 14, with respect to claim 14, that the prior art does not teach "inputting the side information to the neural network for encoding the source data; and transmitting the encoded source data and the side information from the first encoder to a decoder or to storage wherein the side information is transmitted without modification by the neural network."

	Office respectfully disagrees for the following reason:

Argument 4:

Examiner disagrees, because page 28-31 of this examiner answer document under the heading of “Summary of rejection” explains the feedforward connection [3] of Laukien Fig. 14 as side information. Laukien [0071] an encoder for video might take frames of pixel color values and transform them into a compressed form for storage and transmission, and a decoder might later take such a compressed file and decode it into a form suitable for display. Laukien [0071] teach encoders have storage. Laukien Fig. 14 shows side information which is feedforward connection [3] going to encoder1 to encoder2 or encoder2 to encoder3 without modification as signal name has not been 

Appellant argued in page 15:

	Appellant argued in page 15 related to claim 14 that the Examiner fails to explain how side information input "to the neural network for encoding the source data" can also be transmitted "from the first encoder ... without modification by the neural network".	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because this has been addressed under Argument 4: on pages 54-55 of this examiner answer document. Laukien Fig. 14 shows side information, which is feedforward connection [3], going from encoder1 to encoder2 or from encoder2 to encoder3 without modification, as the signal name has not been changed. Alternatively, Koker Fig. 12 also teaches this, as follows:


[Figure: media_image4.png]



Appellant argued in page 15:

	Appellant argued in page 15 that Examiner states no motivation for combining this alleged feature in Koker with Laukien to reject the combination of features in claim 14.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because examiner explained motivation above in page 43 through to page 44 of this examiner answer document under the heading Argument 3:

Appellant argued in page 16:

	Appellant argued on page 16, with respect to claim 19, regarding "a first encoder and a first decoder comprising a neural network having a plurality of hidden layers, wherein the neural network ... receives the guided information and the side information at a first hidden layer of the first decoder for reconstruction of the source data."

	Office respectfully disagrees for the following reason:

Examiner disagrees, because claim 19 is rejected for the same reason as claim 1. Claim 19 differs from claim 1 only with additional limitation of plurality of hidden layers which is taught by Koker [0159] the exemplary neural networks described above can be used to perform deep learning. Deep learning is machine learning using deep neural networks. The deep neural networks used in deep learning are artificial neural networks composed of multiple hidden layers, as opposed to shallow neural networks that include only a single hidden layer. Deeper neural networks are generally more computationally intensive to train. However, the additional hidden layers of the network enable multistep pattern recognition that results in reduced output error relative to shallow machine learning techniques. Koker [0169] FIG. 12 as well, Koker [0169] the illustrated 

Appellant argued in page 17:

	Appellant argued in page 17 related to claim 5 that "the first decoder receive the guided information and the side information for reconstruction of the source data".	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because claim 5 depends on claim 1 and the rejection of claim 1 described above is applicable to claim 5. Laukien Fig. 14 [0141] the time division feedback [side information] predictor [47] performs the transformation of the upper layer feedback output into the appropriate signal [source data] to combine [guided information] with that from the same-layer encoder. In representation mode, this feedback signal is used to augment the representation produced by the encoder. In predictive coding mode, this feedback signal is used to correct the encoder's prediction signal. In either mode, the time division feedback [side information] predictor [47] feeds [transmitting] the decoder [2] with the feedback-corrected prediction [48] signal [guided information]. So in Laukien Fig. 14, while the low-resolution video stream input is being converted by different layers of the neural network to a high-resolution video, the encoders/decoders take feedback from the same and previous layers of encoders and decoders to improve the conversion system. Koker [0159] [0169] FIG. 12 teach multiple hidden layers. It would be obvious to have a plurality of hidden layers in Laukien using the algorithm described in Koker, and the combined teaching meets the claim limitation with predictable results.

Appellant argued in page 17:

	Appellant argued in page 17, with respect to claim 6, that the prior art does not teach that each hidden layer of the first encoder is structured to pass through the side information such that a first layer of the first decoder receives the side information.	

	Office respectfully disagrees for the following reason:

Argument 5:



Laukien [0090] feedback [side information] from higher layers may similarly be combined, either raw or with some preprocessing, with the feedforward [pass through] stimulus as drawn from the weight [side information, because feedback given as weight] 

Alternatively 




[Figure: Koker FIG. 12 (media_image4.png)]


In Koker [0169] FIG. 12 above, x1 -> a is the source data, b -> a through 1205 is side information, a -> c -> y is guided information, and b -> y is side information as well. y is the decoder because Koker [0188] teaches that the media processor 1502 can enable low latency decode of multiple high-resolution (e.g., 4K, 8K) video streams. The decoded video streams can be written to a buffer in the on-chip-memory 1505 ... the vision processor 1504 can accelerate convolution operations for a CNN that is used to perform image recognition on the high-resolution video data [video stream]. Koker [0169] teaches that the illustrated RNN 1200 can be described has having an input layer 1202 that receives an input vector [source data], hidden layers 1204 to implement a recurrent function, a feedback mechanism 1205 [side information] to enable a ‘memory’ of previous states, and an output layer 1206 to output a result [guided information]. In Koker FIG. 12 above, the connection between x1 and a is source data because 1202 is the input vector of the 

Appellant argued in page 18:

	Appellant argued in page 18 that the prior art does not teach the limitation wherein the first encoder includes a first decoder, the neural network comprises a plurality of hidden layers, and the first encoder passes the side information through at least one hidden layer to only a first hidden layer of the first decoder.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because claim 18 is rejected for the same reasons as claim 6; see the examiner's response provided at pages 60-64 of this Examiner's Answer under the heading "Argument 5." Also, Laukien Fig. 11 and Fig. 14 show encoders and hidden layers. 

Claim 18 differs from claim 6 only in the additional limitation of a plurality of hidden layers, which is taught by Koker [0159]: the exemplary neural networks described above can be used to perform deep learning. Deep learning is machine learning using deep 


Appellant argued in page 18:

	Appellant argued in page 18 that the position that the side information is passed "to only a first hidden layer of the first decoder" contradicts the position that appears to be taken by the Examiner elsewhere (e.g., see the rejection of claim 5) that the side information is at every layer of the feedback process.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because Laukien Fig. 11 and Fig. 14 show a system of 1, 2, 3, ... N layers of decoders. It is obvious in a scenario of a simple system having only 

Appellant argued in page 19:

	Appellant argued in page 19, with respect to claim 10 and claim 13, that the Examiner states no motivation to add a block-based encoder to the combination of FIG. 14. Moreover, Appellant argued that the Examiner fails to demonstrate that the modified structure would not violate the principles of operation of the hierarchy of Laukien, which uses an encoder 2 formed of layers of a neural network.	

	Office respectfully disagrees for the following reason:

Examiner disagrees, because the examiner explained the motivation for combining Laukien and Koker above at pages 39 through 40 of this Examiner's Answer under the heading "Argument 3." The motivation for combining Laukien and Koker as set forth for claim 1 is equally applicable to claim 10 and claim 13. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Laukien by further incorporating Koker and Nishi in video/camera technology. One would have been motivated to do so in order to add a block-based encoder. Block-based encoding is well known to be used with neural networks with predictable results. 



/NASIM N NIRJHAR/           Primary Examiner, Art Unit 2482                                                                                                                                                                                             
Conferees:

	/CHRISTOPHER S KELLEY/             Supervisory Patent Examiner, Art Unit 2482                                                                                                                                                                                           


	/MATTHEW K KWAN/	Primary Examiner, Art Unit 2482                                                                                                                                                                                                        


Requirement to pay appeal forwarding fee.  
In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.