DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim does not fall within at least one of the four categories of patent eligible subject matter because it is directed towards transitory propagating signals. The broadest reasonable interpretation of a claim drawn to a computer readable medium (also called machine readable medium and other such variations) typically covers forms of non-transitory tangible media and also transitory propagating signals in view of the ordinary and customary meaning of computer readable media, particularly when the specification recites on page 17, lines 6-8, “the program instructions can be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal”.  See MPEP 2111.01.  When the broadest reasonable interpretation of a claim covers a signal, the claim must be rejected under 35 U.S.C. § 101 as covering non-statutory subject matter.  See Interim Examination Instructions for Evaluating Subject Matter Eligibility Under 35 U.S.C. § 101, Aug. 24, 2009; p. 2 and see Official Gazette Notice: Subject Matter Eligibility of Computer Readable Media (26Jan2010) 1351 OG 212 23FEB2010. A claim drawn to such a computer 

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-4 and 7-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shelhamer et al: “Clockwork Convnets for Video Semantic Segmentation”, In: Hua, G., Jégou, H. (eds) Computer Vision – ECCV 2016 Workshops. ECCV 2016. Lecture Notes in Computer Science, vol. 9915. Springer, Cham. https://doi.org/10.1007/978-3-319-49409-8_69.

As to claim 1, Shelhamer discloses a method comprising: 
receiving a video sequence comprising a respective video frame at each of a plurality of time steps (FIG. 1, video frames and time); and 
processing the video sequence using a video processing neural network to generate a video processing output for the video sequence (FIG. 1 and FIG. 3), 
wherein the video processing neural network includes a sequence of network components (FIG. 1 and FIG. 3), 
(FIG. 1 and FIG. 3; Section 1: we group the layers of the network into stages), 
wherein each component is active for a respective subset of the plurality of time steps (Section 1: we group the layers of the network into stages, and set separate update rates for these levels of representation; see FIG. 3), and 
wherein each layer block is configured to, at each time step at which the layer block is active, receive an input generated at a previous time step and to process the input to generate a block output (Section 4.2: Pipelining. To reduce latency for real-time recognition we pipeline the computation of sequential frames analogously to instruction pipelining in processors. We instantiate a three-stage pipeline, in which stage 1 reflects frame i, stage 2 frame i − 1, and stage 3 frame i − 2. The total time to process the frame is the time of the longest stage, stage 1 in our pipeline, plus the time for interpolating and fusing outputs. Section 5.1: Pipelined execution schedules reduce latency by producing an output each time the first stage is computed. Later stages are persisted from previous frames and their outputs are fused with the output of the first stage computed on the current frame … so that a later stage is independent of the current frame but not past frames).
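
For illustration only, the three-stage pipelined schedule quoted above from Section 4.2 of Shelhamer can be sketched as follows. This is a minimal sketch and not code from the reference: the function name `run_pipeline` and the use of integer frame indices in place of feature maps are assumptions made for this example. Each stage after the first consumes what the preceding stage held at the previous time step, so that stage 1 reflects frame i, stage 2 frame i − 1, and stage 3 frame i − 2.

```python
def run_pipeline(frames, num_stages=3):
    """Simulate a pipelined schedule (hypothetical helper, for illustration).

    Returns, for each time step, a tuple giving the frame index that each
    stage reflects at that step (None while the pipeline is still filling).
    """
    # held[k] is what stage k+1 held at the end of the previous time step.
    held = [None] * num_stages
    schedule = []
    for frame in frames:
        new_held = list(held)
        # Stage 1 always operates on the current frame.
        new_held[0] = frame
        # Every later stage takes its input from the previous time step only.
        for k in range(1, num_stages):
            new_held[k] = held[k - 1]
        held = new_held
        schedule.append(tuple(held))
    return schedule


if __name__ == "__main__":
    for step, stages in enumerate(run_pipeline([0, 1, 2, 3])):
        print(step, stages)
```

Once the pipeline is full, no stage depends on an output computed by another stage at the same time step, which is why the stages can be evaluated in parallel.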

As to claim 2, Shelhamer further discloses wherein each layer block is active for a same number or fewer time steps than any layer block before the layer block in the sequence of components (Section 4.2: the deeper layers can be executed at a lower rate to save computation while other stages update … The exponential clockwork schedule is the natural choice of halving the rate at each stage for more efficiency).
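
For illustration only, the exponential clockwork schedule referenced above (halving the update rate at each stage) can be sketched as follows. This is a sketch under stated assumptions, not the reference's implementation: the function names are hypothetical, and it is assumed that the first stage updates at every time step and that stage k updates every 2^k steps. Under that assumption, each stage is active for the same number or fewer time steps than any stage before it.

```python
def active_stages(t, num_stages=3):
    """Return indices of stages that update at time step t (hypothetical helper).

    Assumes stage k fires every 2**k steps, so stage 0 fires at every step.
    """
    return [k for k in range(num_stages) if t % (2 ** k) == 0]


def activation_counts(num_steps, num_stages=3):
    """Count how many of the first num_steps time steps each stage is active."""
    counts = [0] * num_stages
    for t in range(num_steps):
        for k in active_stages(t, num_stages):
            counts[k] += 1
    return counts


if __name__ == "__main__":
    # Over 8 time steps the per-stage counts are non-increasing with depth.
    print(activation_counts(8))
```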

As to claim 3, Shelhamer further discloses wherein one or more of the layer blocks comprise an initial layer and one or more additional layers (FIG. 1 and FIG. 3), 
wherein the initial layer in each layer block receives an input generated at the previous time step by a component that precedes the layer block in the sequence of components (FIG. 1 and FIG. 3), and
wherein each additional layer in each layer block receives an output generated by one or more layers at a lower depth level within the same layer block at the time step (FIG. 1 and FIG. 3).

As to claim 4, Shelhamer further discloses wherein one or more of the layer blocks in the sequence of components also receive as input a feedback output generated at a previous time step by one or more components after the block in the sequence of components (Section 4: The clockwork recurrent network of [4], designed for long-term dependency modeling of time series, is an instance of our more general scheme for clockwork computation; see FIG. 4, for clockwork RN, dotted line from Module 3 (input 1) to Module 1 (input 2)).

As to claim 7, Shelhamer further discloses wherein the video processing output is a per-sequence output that includes a single prediction for the video segment (FIG. 1 and FIG. 3; Section 2: Convnets have been applied to video to learn spatiotemporal representations for classification).

As to claim 8, Shelhamer further discloses wherein the video processing output is a per-frame output that includes a respective prediction for each of multiple frames in the video segment (FIG. 1).

As to claim 9, Shelhamer further discloses wherein the video processing neural network further comprises one or more layers after the final layer block in the sequence that are configured to receive the block outputs generated by one or more of the layer blocks and to process the block outputs to generate the video processing output (FIG. 3, see layers after stage 3).

As to claim 10, Shelhamer further discloses wherein the processing comprises, at each time step: performing the processing of two or more of the layer blocks in parallel (Section 5.1; Pipelining: update every stage in parallel).

As to claim 11, Shelhamer further discloses wherein at each time step at which the layer block is active, the layer block does not receive as input any outputs generated by any other layer blocks at the time step (see paragraphs about “pipelining” in Section 4.2 and Section 5.1).

As to claim 12, Shelhamer further discloses wherein at each time step, each layer block that is active operates on data derived from a different video frame than each other active layer block (Section 4.2: We instantiate a three-stage pipeline, in which stage 1 reflects frame i, stage 2 frame i − 1, and stage 3 frame i − 2).

As to claim 13, system claim 13 corresponds to method claim 1, recites the same features as those recited in claim 1, and is therefore rejected for the same reasons as those used above in rejecting claim 1.

As to claim 14, computer storage media claim 14 corresponds to method claim 1, recites the same features as those recited in claim 1, and is therefore rejected for the same reasons as those used above in rejecting claim 1.

As to claims 15-20, system claims 15-20 correspond to method claims 2-4 and 10-12, recite the same features as those recited in claims 2-4 and 10-12, respectively, and are therefore rejected for the same reasons as those used above in rejecting claims 2-4 and 10-12.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Shelhamer et al: “Clockwork Convnets for Video Semantic Segmentation”, In: Hua, G., Jégou, H. (eds) Computer Vision – ECCV 2016 Workshops. ECCV 2016. Lecture Notes in Computer Science, vol. 9915. Springer, in view of Ji et al: "3D Convolutional Neural Networks for Human Action Recognition," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, Jan. 2013, doi: 10.1109/TPAMI.2012.59.

As to claim 5, Shelhamer fails to explicitly disclose wherein one or more of the layer blocks include three-dimensional convolutional layers with kernels that have a time dimension of two or more.
However, this type of layer is well-known in the art as suggested by Shelhamer (see 3D convolution in Section 2: “Related Work”) and further evidenced by Ji which teaches wherein one or more of the layer blocks include three-dimensional convolutional layers with kernels that have a time dimension of two or more (see Section 2: 3D CONVOLUTIONAL NEURAL NETWORKS; FIGS. 1-2).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shelhamer using Ji’s teachings to include wherein one or more of the layer blocks include three-dimensional convolutional layers with kernels that have a time dimension of two or more in order to extract features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames using a 3D CNN model which outperforms compared methods, achieves competitive performance, and demonstrates superior performance in real-world environments (Ji; Abstract and Conclusions).
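
For illustration only, a three-dimensional convolution whose kernel has a time dimension of two can be sketched as follows. This is a generic, naive sketch and is not taken from Ji or Shelhamer: the function name `conv3d_valid`, the nested-list data layout (time, height, width), and the absence of padding, stride, bias, and channels are all assumptions made for this example. Each output value combines pixels from two adjacent frames, which is how such a layer captures information across the temporal dimension.

```python
def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width).

    video and kernel are nested lists indexed as [t][y][x]. Hypothetical
    helper for illustration; real layers add channels, bias, and stride.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0
                # Sum over the temporal extent (dt) and spatial extent (dy, dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


if __name__ == "__main__":
    # Three 2x2 frames of ones, and a 2x2x2 kernel of ones (time dimension 2).
    video = [[[1, 1], [1, 1]] for _ in range(3)]
    kernel = [[[1, 1], [1, 1]] for _ in range(2)]
    print(conv3d_valid(video, kernel))
```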

As to claim 6, the combination of Shelhamer and Ji further discloses wherein each block that includes three-dimensional convolutional layers with kernels that have a time dimension of two or more also receives an input generated at another previous time step (see FIG. 3 of Shelhamer; see Section 2: 3D CONVOLUTIONAL NEURAL NETWORKS and FIGS. 1-4 of Ji).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BOUBACAR ABDOU TCHOUSSOU whose telephone number is (571)272-7625. The examiner can normally be reached M-F 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chris Kelley, can be reached at (571)272-7331. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BOUBACAR ABDOU TCHOUSSOU/Examiner, Art Unit 2482