DETAILED ACTION

Response to Amendment
Applicant’s response to the last Office Action, filed on 05/20/2021, has been entered and made of record.
Applicant’s amendments necessitated the new ground of rejection set forth herein; accordingly this action is made final.
The rejection under 35 U.S.C. 101 is withdrawn in view of the amendments.
The specification objections and claim objections are withdrawn in view of the amendments.

Response to Arguments
Applicant's arguments filed on 05/20/2021 have been fully considered but they are not persuasive. 
In response to the recently amended independent claims, Examiner has added the Zhou reference to the rejections. Zhou teaches a system for video stream analysis which detects a bounding box for an object based on object movement, see ¶ 0063-0064. The bounding box insight information is displayed, see ¶ 0120. ¶ 0067 teaches object counting.
It would have been obvious to one of ordinary skill in the art to have combined Ananthanarayanan’s video stream analysis with Zhou’s video stream analysis. Ananthanarayanan teaches video stream object detection and an output display but does not mention explicitly what is being displayed. Zhou teaches video stream object detection, displaying video stream analysis insights and teaches object counting. The 
Examiner has also added the Millin reference to the rejections. See detailed analysis below.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-6 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention. Claims 2-6 recite the limitation, “The video processing apparatus, as defined by claim 1.” Claim 1 has been amended to recite “A cloud computing-based video processing system.” There is insufficient antecedent basis for this limitation in the claim.


Claim Interpretation
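The following is a quotation of 35 U.S.C. 112(f):
(f)  ELEMENT IN CLAIM FOR A COMBINATION.—An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.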

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ananthanarayanan (US PGPub 2019/0205649) in view of Millin (US PGPub 2019/0327179) and Zhou (US PGPub 2019/0130191).
Regarding claim 1, Ananthanarayanan discloses a cloud-computing-based video processing system comprising: (Ananthanarayanan teaches a video stream analysis technique which uses convolutional neural networks (CNNs) to process video.)
a registration component, the registration component registering configuration information associated with video information received by the cloud computing-based video processing system, the configuration information comprising at least one of access information, communication information, metadata, area of interest, analysis information, processing information; (¶ 0056, 0043 and Fig. 3 teach extracting object areas of interest from whole video frames in order to process the objects. This is a process of registering configuration information that comprises areas of interest.)
scaling computing resources based on the video information received by the cloud computing-based video processing system, the computing resources comprising (Ananthanarayanan ¶ 0023-0025 discusses scaling up computational resources to process the video stream, for example at query time.)

a configuration component, the configuration component configuring a plurality of neural networks in at least one of a parallel configuration, sequential configuration, or a mixed parallel and sequential configuration that provides a configured plurality of neural networks; (¶ 0024, ¶ 0055 and Fig. 3 teach configuring multiple CNN neural networks, in particular the compressed CNN and GT-CNN for video processing. These are set up in a sequential configuration.)
a processing component, the processing component processing the filtered video frame using the configured plurality of neural networks that provides insight information, the insight information comprising movements of the objects in the video information; (Fig. 3 and ¶ 0055 show the processing pipeline for the neural networks. ¶ 0111 and 0049 teach that the insight information comprises object detection based on object movement detection.)
a display, the display providing output to a user. (¶ 0157 teaches an output display.)
a storage component, the storage component storing the configuration information and insight information in persistent cloud-based storage. (¶ 0155 teaches that storage includes “cloud-based storage accessible via a network, such as the Internet.” ¶ 0008 teaches storing configuration and insight information.)
In the field of cloud computing Millin teaches a scaling component, the scaling component requesting and scaling computing resources for the cloud computing-based 
It would have been obvious to one of ordinary skill in the art to have combined the above combination’s video stream processing with Millin’s system for data processing with a cloud-based server. Ananthanarayanan teaches using cloud computing for processing video stream analysis systems. It is directed to balancing tradeoffs in scaling computing resources during different parts of the video stream analysis cycle. Millin teaches the well-known and widely-used technique of scaling up and down cloud server resources in response to a request. The combination constitutes the repeatable and predictable result of simply applying Millin’s teaching here. This cannot be considered a non-obvious improvement in view of the relevant prior art here. Using known engineering design, no “fundamental” operating principle of the teachings is changed; they continue to perform the same functions as originally taught prior to being combined.
In the field of video stream analysis, Zhou teaches insight information comprising object counts and displaying the insight information to the user. (Zhou teaches a system for video stream analysis which detects a bounding box for an object based on object movement, see ¶ 0063-0064. The bounding box insight information is displayed, see ¶ 0120. ¶ 0067 teaches object counting.)
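It would have been obvious to one of ordinary skill in the art to have combined Ananthanarayanan’s video stream analysis with Zhou’s video stream analysis. Ananthanarayanan teaches video stream object detection and an output display but does not mention explicitly what is being displayed. Zhou teaches video stream object detection, displaying video stream analysis insights and teaches object counting.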
Regarding claim 2, the above combination discloses the video processing apparatus, as defined by Claim 1, further comprising representing the area of interest as a polygon. (Hollander ¶ 0050 teaches representing the area of interest as a bounding box.)
Regarding claim 3, the above combination discloses the video processing apparatus, as defined by Claim 1, wherein the video information comprises at least one of a live video stream, pre-recorded video stream, standalone individual video frame, or an image. (See Ananthanarayanan, ¶ 0041)
Regarding claim 4, the above combination discloses the video processing apparatus, as defined by Claim 1, wherein the insight information is based on an object detected in the filtered video frame and attributes associated with the object. (Ananthanarayanan Fig. 3 and ¶ 0061 teach using the neural networks to detect objects of a certain class X and their associated frame.)
Regarding claim 5, the above combination discloses the video processing apparatus, as defined by Claim 1, wherein the plurality of neural networks comprises a 
Regarding claim 6, the above combination discloses the video processing apparatus, as defined by Claim 1, wherein the video processing apparatus trains the plurality of neural networks to process an image comprising at least one of a predefined dimension, or a dynamic dimension. (Hollander ¶ 0044 and Ananthanarayanan ¶ 0154)
Claims 7-9, 11, 13, and 14 are the method claims corresponding to apparatus claims 1-6. The apparatus necessarily requires method steps. Remaining limitations are rejected similarly. See detailed analysis above.
Regarding claim 10, the above combination discloses the method, as defined by Claim 7, further comprising: configuring video content metadata that provides configured video content used in processing the filtered video frame, configuring the area of interest, and storing the configured video content metadata and the configured area of interest in the persistent cloud-based storage. (Ananthanarayanan ¶ 0056 and Fig. 3 teach extracting object areas of interest from whole video frames in order to process the objects. Configured metadata includes the top-k index data and frame data 328, among other data. ¶ 0008 teaches storing configuration metadata and areas of interest data.)
Regarding claim 12, the above combination discloses the method, as defined by Claim 7, further comprising providing the insight information in response to receiving a request for video frame processing. (Ananthanarayanan ¶ 0055 and Fig. 3 teach that the insight information is generated in response to a query 320 for video analysis.)
Regarding claim 15, the above combination discloses the method, as defined by Claim 7, further comprising training the plurality of neural networks to process at least one of a black-white image, color image. (Ananthanarayanan teaches processing streams from traffic cameras, surveillance cameras, and news channels. The prior art does not expressly disclose that the video images are one of black-white images or color images, but Examiner notes that the concept of using either black and white or color images for traffic, surveillance or news would have been obvious to incorporate with predictable results and without undue experimentation. This is not considered a non-obvious improvement over the prior art. Official Notice is applied here.)
Regarding claim 16, the above combination discloses the method, as defined by Claim 7, further comprising: 
scaling up a computational resource associated with the plurality of neural networks in response to receiving configuration information comprising video information to be processed, the scaling up comprising requesting a cloud provider API to provide additional computational resources; and (Ananthanarayanan ¶ 0023-0025 discusses scaling up computational resources to process the video stream, for example at query time. Millin, ¶ 0119, “on-demand computing resources can afford flexibility to managed network 300, including the ability to quickly scale cloud services up or down through the click of a button, an API call, or an enterprise rule.”)
scaling down computational resources in response to receiving a stop command, the scaling down comprising requesting the cloud provider API to release existing computational resources used for processing the filtered video frame. (As above, Millin, ¶ 0119, “on-demand computing resources can afford flexibility to managed network 300, 
Regarding claim 17, the above combination discloses the method, as defined by Claim 7, further comprising: configuring a processing pipeline comprising the insight information that provides an aggregation, executing the configured processing pipeline, and storing the aggregation in the persistent cloud-based storage. (Ananthanarayanan Fig. 3 and ¶ 0061 teach using the neural networks to detect objects of a certain class X and their associated frame and return the aggregated collection of frames. ¶ 0008 teaches storage.)
Regarding claim 19, the above combination discloses the method, as defined by Claim 17, further comprising providing an API access to the aggregation in response to a request to initiate calculation and retrieval of the aggregation. (Ananthanarayanan ¶ 0055 and Fig. 3 teach that the aggregation insight information is generated in response to a query 320 for video analysis, a request to initiate calculation and retrieval. The prior art does not expressly disclose that the results are provided to an API, but Examiner notes that the concept of providing the results to a programming interface for interaction between applications would have been obvious to incorporate with predictable results and without undue experimentation. This is not considered a non-obvious improvement over the prior art. Official Notice is applied here.)
Claim 20 is the computer readable medium claim corresponding to the apparatus of claim 1. Ananthanarayanan ¶ 0006 teaches a computer readable medium. Remaining limitations are rejected similarly. See detailed analysis above.  

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Ananthanarayanan (US PGPub 2019/0205649) in view of Millin (US PGPub 2019/0327179), Zhou (US PGPub 2019/0130191), and WebGUI (“Features”).
Regarding claim 18, the above combination discloses the method, as defined by Claim 17, further comprising providing access to the aggregation in response to a request to initiate calculation and retrieval of the aggregation. (Ananthanarayanan ¶ 0055 and Fig. 3 teach that the aggregation insight information is generated in response to a query 320 for video analysis, a request to initiate calculation and retrieval.)
In the field of content management systems WebGUI teaches a content management system (¶ 1, WebGUI is a content management system and web application framework, which allows for easy management of content such as photo galleries.)
It would have been obvious to one of ordinary skill in the art to have combined the above combination’s video stream processing with WebGUI’s content management system. Ananthanarayanan and Hollander both teach displaying results of their video stream analysis systems. WebGUI is software for managing display of content. The combination constitutes the repeatable and predictable result of simply displaying image content with content display software. This cannot be considered a non-obvious improvement in view of the relevant prior art here. Using known engineering design, no “fundamental” operating principle of the teachings is changed; they continue to perform the same functions as originally taught prior to being combined.

Conclusion
Applicant’s amendments necessitated the new ground of rejection set forth herein; therefore THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Raphael Schwartz whose telephone number is (571)270-3822.  The examiner can normally be reached on Monday to Friday 9am-5pm CT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/RAPHAEL SCHWARTZ/           Examiner, Art Unit 2661

/VINCENT RUDOLPH/           Supervisory Patent Examiner, Art Unit 2661