DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Currently, claims 1, 2 and 4-20 are pending in the application. Claim 3 is cancelled. 
Response to Arguments / Amendments
Applicant’s arguments have been fully considered but are rendered moot in view of the new ground of rejection necessitated by amendments initiated by the applicant.

Claim Interpretations - 35 USC § 112 ¶ (f)
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked.  

As explained in MPEP 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph: 
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
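For illustration only, the three-prong test quoted above can be read as a conjunction of three findings. The sketch below is a simplification: the function name `interpreted_under_112f` and its boolean parameters are hypothetical, and each input stands for a legal determination that the code itself does not make.

```python
def interpreted_under_112f(uses_means_or_placeholder: bool,
                           modified_by_functional_language: bool,
                           recites_sufficient_structure: bool) -> bool:
    """Return True only when all three prongs are met.

    Each argument is an assumed, already-made finding about one claim
    limitation; this function merely combines them.
    """
    return (uses_means_or_placeholder
            and modified_by_functional_language
            and not recites_sufficient_structure)


# A generic placeholder coupled to a function, with no structure recited.
meets_test = interpreted_under_112f(True, True, False)
```

A limitation that recites sufficient structure fails the third prong, so the function returns False for it regardless of the other two findings.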
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.  
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.  
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. 
 	If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-13, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Puntambekar et al. (US 20170078574, hereinafter Puntambekar) in view of Sekar et al. (US 20190289359, hereinafter Sekar).

Regarding Claim 1, Puntambekar discloses a method for video processing at a device (FIG.2), comprising: 
receiving a bitstream comprising a set of video frames ([0070], FIG. 2, receive an input video including an I-Frame for some number of (e.g., 60) frames of the input video that is to be encoded from a source format to any other format; [0184], FIG. 21, receive an input video 2108 for encoding);
batching the set of video frames into a first subset of video frames and a second subset of video frames based at least in part on a change in a reference scene associated with the set of video frames ([0070], [0071], FIG. 2, splitter module 102 splits the received input video into a plurality of segments, each including a specified number of frames, such as a video segment-1, a video segment-2, and a video segment-3, and the number of frames in each segment can be based on, for example, scene changes in the input video; [0185], FIG. 21, divide the input video into different segments 2114, learning an optimal place to segment the video and place output key frames to maximize quality by detecting scene changes);
selecting a mode of operation for a neural processing unit of the device based at least in part on the batching ([0186] In some embodiments, a fingerprint generator 2112 may measure various video characteristics of each segment. The fingerprint generator 2112 may thus identify a “fingerprint” of a video, e.g., the qualities of the video that make it more or less amenable to certain configuration of encoding parameters. The fingerprint generator may feed these characteristics to a neural network settings generator 2116);
generating, based at least in part on the batching, the first subset of video frames using a video processing unit of the device ([0190] After the video segments have been encoded, a segment selector 2120 analyzes each segment and picks the smallest sized segment 2112 that meets a pre-determined quality requirement (e.g., a threshold quality));
generating the second subset of video frames using the neural processing unit of the device in parallel with generating the first subset of video frames using the video processing unit based at least in part on the batching and the selected mode of operation ([0190] After the video segments have been encoded, a segment selector 2120 analyzes each segment and picks the smallest sized segment 2112 that meets a pre-determined quality requirement (e.g., a threshold quality)); and
generating a set of video packets comprising the first subset of video frames, the second subset of video frames, or both ([0191] The video segments are joined together into a single output video 2124).
Puntambekar does not explicitly disclose generating the first and the second frame subsets in parallel. 
Sekar, from the same field of endeavor, teaches generating the first and the second frame subsets in parallel ([0024], a machine learning enabled method and system for generating content data about a video, including segmenting the video into K scene segments using a neural network trained to segment video based on scene variations, and human attribute recognition and face representation in parallel on each of the N groups of video frames using pre-trained machine learning algorithms; [0078], FIG. 12, machine learning based content data generation system 118 includes multiple services that are controlled by a service controller 1100, which provides the video data to two services in parallel, namely a frame splitting service 1104 and a video scene segmentation service 1106).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Sekar ([0078]) regarding parallel generation of the subsets of video frames into the video processing system of Puntambekar in order to provide an improved user experience and an improved or more efficient content source and communications network, as well as more efficient use of system resources (Sekar, [0063]).
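For illustration only, the combination relied on above (splitting at a scene change, then handling the two resulting subsets concurrently) can be sketched as follows. This is a minimal sketch and not the implementation of either reference or of the application: the names `batch_by_scene_change`, `process_subset`, and `generate_in_parallel`, the integer scene labels, and the use of thread-pool workers as stand-ins for the video processing unit and the neural processing unit are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def batch_by_scene_change(scene_ids):
    """Split frame indices into two subsets at the first change in scene label."""
    for i in range(1, len(scene_ids)):
        if scene_ids[i] != scene_ids[i - 1]:
            return list(range(i)), list(range(i, len(scene_ids)))
    # No scene change found: every frame stays in the first subset.
    return list(range(len(scene_ids))), []


def process_subset(unit_name, frame_indices):
    """Hypothetical placeholder for work done by one processing unit."""
    return [(unit_name, idx) for idx in frame_indices]


def generate_in_parallel(scene_ids):
    """Submit both subsets at once and join the results in frame order."""
    first, second = batch_by_scene_change(scene_ids)
    with ThreadPoolExecutor(max_workers=2) as pool:
        vpu_future = pool.submit(process_subset, "vpu", first)
        npu_future = pool.submit(process_subset, "npu", second)
        return vpu_future.result() + npu_future.result()


packets = generate_in_parallel([0, 0, 0, 1, 1])
```

Both tasks are submitted before either result is awaited, so the two subsets may be processed during overlapping time periods; the results are then concatenated so the output preserves the original frame order.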
Regarding Claim 2, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further comprising: identifying a long-term reference frame of the set of video frames, wherein batching the set of video frames into the first subset of video frames and the second subset of video frames is based at least in part on identifying the long-term reference frame ([0195] The deriving operation includes analyzing a ratio of bit utilization by different frame types in the videos, wherein the frame types include an intra-encoded frame type and an inter-encoded frame type).

Regarding Claim 4, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further wherein generating the second subset of video frames using the neural processing unit comprises: generating a first frame of the second subset of video frames based at least in part on a long-term reference frame of the set of video frames ([0195] The deriving operation includes analyzing a ratio of bit utilization by different frame types in the videos, wherein the frame types include an intra-encoded frame type and an inter-encoded frame type).

Regarding Claim 5, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further wherein generating the first subset of video packets and the second subset of video packets comprises: synchronizing the first subset of video frames, the second subset of video frames, or both, based at least in part on temporal information associated with the first subset of video frames and the second subset of video frames ([0195] The deriving operation includes analyzing a ratio of bit utilization by different frame types in the videos, wherein the frame types include an intra-encoded frame type and an inter-encoded frame type).

Regarding Claim 6, Puntambekar in view of Sekar discloses the method of claim 5.
Puntambekar discloses further comprising: outputting the first subset of video packets, the second subset of video packets, or both based at least in part on the synchronization ([0113], combining the plurality of segments to form a single output video. For example, the segment-1 from the encoder-1, the segment-2 from the encoder-2, and the segment-3 from the encoder-3 are combined to form the single output).

Regarding Claim 7, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further wherein generating the first subset of video packets and the second subset of video packets comprises: generating the first subset of video frames over a first time duration; and generating the second subset of video frames over a second time duration, the first time duration at least partially overlapping in time with the second time duration ([0169], In the received video, key frames may occur nominally a first time duration apart, wherein a key frame is encoded without depending on another frame in the compressed video bitstream. The time duration may be specified in seconds, frame numbers, and so on).  
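For illustration only, the limitation that the first time duration at least partially overlaps the second can be expressed as an interval test. The function name `durations_overlap` and the representation of each duration as a `(start, end)` pair are assumptions made for this sketch and do not come from the claims or the cited reference.

```python
def durations_overlap(first, second):
    """Return True when two (start, end) durations share some span of time.

    Durations that merely touch at an endpoint are treated as not
    overlapping; that boundary choice is an assumption of this sketch.
    """
    first_start, first_end = first
    second_start, second_end = second
    return first_start < second_end and second_start < first_end


# Second duration begins before the first one ends.
overlapping = durations_overlap((0.0, 2.0), (1.0, 3.0))
```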
  
Regarding Claim 8, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further wherein selecting the mode of operation for the neural processing unit comprises: identifying the change in the reference scene; and selecting a training mode for the neural processing unit based at least in part on the identified change in the reference scene ([0071] The splitter module 102 determines a number of output GOPs that can fit into each segment length. Here, the splitter module 102 determines 2 output GOPs for each segment of the input video. In an embodiment, the number of frames in each segment can be based on, for example, scene changes in the input video).
Sekar also discloses that segmentation is based on a frame-by-frame comparison of the input video, with the video being split into many continuous scene segments 301(1) to 301(K) on the time axis according to scene changes ([0078], FIG. 12).

Regarding Claim 9, Puntambekar in view of Sekar discloses the method of claim 8.
Puntambekar discloses further comprising: decoding the first subset of video frames by a video processing unit of the device; and training a learning model associated with the neural processing unit during the training mode based at least in part on at least one decoded video frame of the decoded first subset of video frames  ([0182], the training phase may be implemented using a neural network that uses a learning algorithm in which a cost criterion such as rate-distortion or visual rating score is used for training the encoding parameter learning.).

Regarding Claim 10, Puntambekar in view of Sekar discloses the method of claim 8.
Sekar discloses further comprising: selecting a generation mode for the neural processing unit based at least in part on a frame count satisfying a threshold, the frame count comprising a number of frames following the identified change in the reference scene ([0078], scene specific video segment 301(j) serves as the basic unit where an independent face tracking task is performed; video scene segmentation service 1106 is implemented using a pre-trained machine learning based system that is generated using a machine learning algorithm and sample data; [0082], FIG. 13).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Sekar ([0078]) regarding parallel generation of the subsets of video frames into the video processing system of Puntambekar in order to provide an improved user experience and an improved or more efficient content source and communications network, as well as more efficient use of system resources (Sekar, [0063]).
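For illustration only, the claim 10 limitation of selecting a generation mode once a frame count following a scene change satisfies a threshold can be sketched as below. The `Mode` enum, the function name `select_mode`, the reading of "satisfying" as "greater than or equal to", and the example threshold of 30 frames are all assumptions; neither the claim as quoted nor the cited references specify them.

```python
from enum import Enum


class Mode(Enum):
    TRAINING = "training"
    GENERATION = "generation"


def select_mode(frames_since_scene_change, threshold):
    """Pick the mode from the number of frames seen since the scene change.

    Assumes the threshold is satisfied when the count reaches or exceeds it.
    """
    if frames_since_scene_change >= threshold:
        return Mode.GENERATION
    return Mode.TRAINING


# Hypothetical threshold of 30 frames after the identified scene change.
mode_early = select_mode(5, 30)
mode_late = select_mode(30, 30)
```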
Regarding Claim 11, Puntambekar in view of Sekar discloses the method of claim 10.
Puntambekar discloses further comprising: decoding the first subset of video frames using a video processing unit of the device; and generating, using the neural processing unit of the device, at least one video frame of the second subset of video frames during the generation mode based at least in part on at least one decoded video frame of the decoded first subset of video frames ([0231], encoder and decoder indicate one or more parameters controlling the effectiveness of the on-line learning to the parameters or weights of the neural network based on scene cut detection).

Regarding Claim 12, Puntambekar in view of Sekar discloses the method of claim 10.
Puntambekar discloses further comprising: determining to switch the mode of operation for the neural processing unit from the training mode to the generation mode based at least in part on header information associated with one or more frames of the set of video packets ([0242], a distinct set of parameters and/or weights is maintained for each type of prediction where NN-based prediction is applied. For example, a different NN-based predictor may be used for intra prediction, temporal inter prediction, inter-view sample prediction, and inter-layer sample prediction for quality/spatial scalability). 

Regarding Claim 13, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further comprising: training a learning model associated with the neural processing unit based at least in part on the first subset of video frames processed by a video processing unit of the device ([0182] the training phase may be implemented using a neural network that uses a learning algorithm in which a cost criterion such as rate-distortion or visual rating score is used for training the encoding parameter learning).

Regarding Claim 16, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further wherein: batching the set of video frames is performed by a decoder of the device ([0166] In FIG. 18, a dependency map may identify, for each frame in a segment, other frames on which decoding of that frame depends, e.g., the reference frame(s) used in encoding that frame. Based on the dependency map, the intelligent segmenter may include in each segment frames of two types: frames that are to be encoded by worker nodes, and additional frames that are included in the segment because the frames to be encoded depend on the additional frames).

Regarding Claim 17, Puntambekar in view of Sekar discloses the method of claim 1.
Puntambekar discloses further comprising: identifying a long-term reference frame of the set of video frames based at least in part on an accuracy threshold, the long-term reference frame comprising an intra-frame of the set of video frames ([0141], [0179], frame distance between key frames, how many predictive or bi-directional frames intervening key frames to use, threshold used for detecting scene changes by comparing two successive video frames, whether or not to perform intra-frame motion prediction, whether or not to use different quantization matrices, which of the multiple coding options to use for coding bits (e.g., variable length encoding or arithmetic encoding), whether or not to use fading detection, the motion search window to be used for each video frame).

Regarding Claim 18, Puntambekar in view of Sekar discloses the method of claim 17.
Puntambekar discloses further wherein the long-term reference frame is identified by an encoder of the device ([0195] The deriving operation includes analyzing a ratio of bit utilization by different frame types in the videos, wherein the frame types include an intra-encoded frame type and an inter-encoded frame type).

Regarding Claim 19, apparatus claim 19 corresponds to the method of claim 1 and is rejected for the same reasons of obviousness as set forth above, which are incorporated herein.
Puntambekar further discloses a processor and memory coupled with the processor (FIG. 11).
Regarding Claim 20, apparatus claim 20 corresponds to the method of claim 1 and is rejected for the same reasons of obviousness as set forth above, which are incorporated herein.

Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Puntambekar et al. (US 20170078574, hereinafter Puntambekar) in view of Sekar et al. (US 20190289359, hereinafter Sekar) and Ripple et al. (US 20200272903, hereinafter Ripple).

Regarding Claim 14, Puntambekar in view of Sekar discloses the method of claim 1 but does not explicitly disclose further comprising: estimating motion vector information associated with the set of video frames; and identifying the change in the reference scene based at least in part on the motion vector information and a learning model associated with the neural processing unit.
Ripple, from the same field of endeavor, teaches estimating motion vector information associated with the set of video frames ([0046] frame extractor model 220 includes a set of reference frame generator models R_1, R_2, ..., R_n, a set of motion flow generator models MF_1, MF_2, ..., MF_n, a set of optical flow generator models OF_1, OF_2, ..., OF_n, a weight map generator model WG, and a residual frame generator model RG that perform the one or more operations of the frame extractor model 220); and identifying the change in the reference scene based at least in part on the motion vector information and a learning model associated with the neural processing unit ([0046] generating different types of intermediate frames at each step in the encoding process that can be combined or transformed to generate the set of reconstructed frames. In one embodiment, the set of reference frame generator models, the set of motion flow generator models, the set of optical flow generator models, the weight map generator model, and the residual generator model are each configured as a convolutional neural network (CNN)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the motion vector estimation teachings of Ripple ([0046]) into the video processing system of Puntambekar and Sekar in order to adjust the relative amount between the two types of information depending on the content of the target frame while the relative amount of information spent on motion vectors and the residual frame remains relatively constant (Ripple, [0063]).
Regarding Claim 15, Puntambekar in view of Sekar and Ripple discloses the method of claim 14.
Ripple further discloses determining a difference between first motion vector information associated with a first video frame of the set of video frames and second motion vector information associated with a second video frame of the set of video frames, wherein identifying the change in the reference scene is based at least in part on the determined difference ([0046] frame extractor model 220 includes a set of reference frame generator models R_1, R_2, ..., R_n, a set of motion flow generator models MF_1, MF_2, ..., MF_n, a set of optical flow generator models OF_1, OF_2, ..., OF_n, a weight map generator model WG, and a residual frame generator model RG that perform the one or more operations of the frame extractor model 220).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the motion vector estimation teachings of Ripple ([0046]) into the video processing system of Puntambekar and Sekar in order to adjust the relative amount between the two types of information depending on the content of the target frame while the relative amount of information spent on motion vectors and the residual frame remains relatively constant (Ripple, [0063]).
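For illustration only, the claim 15 limitation of identifying a scene change from a difference between the motion vector information of two frames can be sketched as below. This is a simplified sketch under stated assumptions: the names `mean_motion` and `scene_changed`, the use of the mean motion vector per frame, the Euclidean distance, and the fixed threshold are all hypothetical, and the learning model recited in claim 14 is deliberately left out.

```python
import math


def mean_motion(vectors):
    """Average a non-empty list of (dx, dy) motion vectors for one frame."""
    count = len(vectors)
    mean_dx = sum(dx for dx, _ in vectors) / count
    mean_dy = sum(dy for _, dy in vectors) / count
    return mean_dx, mean_dy


def scene_changed(first_frame_vectors, second_frame_vectors, threshold):
    """Flag a scene change when the mean motion differs by more than threshold."""
    first_dx, first_dy = mean_motion(first_frame_vectors)
    second_dx, second_dy = mean_motion(second_frame_vectors)
    difference = math.hypot(second_dx - first_dx, second_dy - first_dy)
    return difference > threshold


# Identical motion in both frames: no change flagged.
steady = scene_changed([(1, 0), (1, 0)], [(1, 0), (1, 0)], 0.5)
# Mean motion jumps from (0, 0) to (3, 4), a difference of 5.
abrupt = scene_changed([(0, 0)], [(3, 4)], 4.0)
```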


Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Samuel D Fereja whose telephone number is (469)295-9243. The examiner can normally be reached 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DAVID CZEKAJ, can be reached at (571) 272-7327. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SAMUEL D FEREJA/
Examiner, Art Unit 2487