DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-12 and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Laska et al. (US 20160005281 A1), hereinafter Laska, in view of Scanlon et al. (US 20160086038 A1) hereinafter Scanlon.
Regarding claim 1, 
Laska teaches obtaining, via at least a Wi-Fi network (wireless network [0066][0071], Fig. 1) and at a cloud-based computing system configured to process motion events (video server system 508 provides data processing for monitoring and facilitating review of motion events in video streams captured by video cameras 118 [0092]) for each of a plurality of smart-home environments that are remote from the cloud-based computing system (cloud-computing system to provide a variety of useful smart home functions [0064]; Fig. 2), a real-time video and audio stream (the audio and video recording functions of the camera 118 [0140]) comprising images of a field of view of a smart-home environment by a video camera that monitors the smart-home environment, the smart home environment corresponding to a distinct set of one or more video cameras, registered users (In some examples, some or all of the occupants “e.g., individuals who live in the home” may register their device 166 with the smart home environment 100. Such registration may be made at a central server to authenticate the occupant and/or the device as being associated with the home and to give permission to the occupant to use the device to control the smart devices in the home [0070]), and client devices (Fig. 5); 
while obtaining the video and audio stream, creating a first video sub-stream comprising a first plurality of images for the first identified region of interest of the smart-home environment (In some implementations, the zoomed-in portion of the video feed corresponds to a software-based zoom performed locally by the client device 504 on the respective portion of the video feed corresponding to the pinch-in gesture in Fig. 9R [0182]; Fig. 9S); 
concurrently providing the first video sub-stream for display at a client device, of a registered user, that remotely monitors the smart-home environment (Fig. 9R [0182]; Fig. 9S); displaying region smaller in size and overlaid “picture-in-picture” over another region (In some implementations, after performing the software zoom, a perspective window is displayed in the video monitoring UI which shows the zoomed region's location relative to the first video feed “e.g., picture-in-picture window”.  FIG. 9S, for example, shows the client device 504 displaying a perspective box 969 in the first region 903, which indicates the zoomed-in portion 970 relative to the full field of view of the respective camera [0354]). 
Laska did not explicitly teach based on an event recognition identifying from the video stream a second region of interest in the field of view of the video camera, the second region of interest being a smaller portion of the field of view of the video camera than the first region of interest; Page 2 of 9Application No.: 17/216,345Docket No. 20030201USCON01 
creating a second video sub-stream comprising a second plurality of images for the second identified region of interest; and 
providing the first video sub-stream and the second video sub-stream for display, the first video sub-stream being displayed smaller in size than the second video sub-stream and the first video sub-stream being displayed in an overlay over the second video sub-stream. 
Scanlon teaches based on an event recognition identifying from the video stream a second region of interest in the field of view of the video camera, the second region of interest “103, Fig. 1” being a smaller portion of the field of view of the video camera than the first region of interest “102, Fig. 1” ([0036]-[0037]; Fig. 1); 
creating a second video sub-stream comprising a second plurality of images for the second identified region of interest (In order to properly classify the target, sub-view 103 may be extracted into another secondary video output 105 in FIG. 1C [0036]); and 
concurrently providing the first video sub-stream and the second video sub-stream for display (Fig. 1B and Fig. 1C).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).
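For illustration only, and not drawn from either reference, the arrangement discussed for claim 1 (one sub-stream cropped per region of interest, with the first sub-stream shown smaller and overlaid on the second) might be sketched as follows. The names Rect, crop, sub_stream, downscale and overlay are hypothetical, frames are modeled as plain nested lists of pixel values, and the integer-stride downscale is an assumed stand-in for whatever scaling a real client would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical region of interest: top-left corner plus width and height."""
    x: int
    y: int
    w: int
    h: int

    def area(self) -> int:
        return self.w * self.h


def crop(frame, roi):
    """Return the part of `frame` (a list of pixel rows) covered by `roi`."""
    return [row[roi.x:roi.x + roi.w] for row in frame[roi.y:roi.y + roi.h]]


def sub_stream(frames, roi):
    """Lazily yield one cropped image per frame of the source stream."""
    for frame in frames:
        yield crop(frame, roi)


def downscale(image, step):
    """Shrink an image by keeping every `step`-th row and column (assumed scaling)."""
    return [row[::step] for row in image[::step]]


def overlay(base, inset, left, top):
    """Return a copy of `base` with `inset` drawn over it at (left, top)."""
    out = [list(row) for row in base]
    for dy, row in enumerate(inset):
        for dx, pixel in enumerate(row):
            out[top + dy][left + dx] = pixel
    return out


# One 6x4 test frame whose pixel value encodes its row and column.
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
first_roi = Rect(0, 0, 6, 4)    # larger region of interest
second_roi = Rect(2, 1, 4, 3)   # smaller region flagged by event recognition

second_image = next(sub_stream([frame], second_roi))
first_inset = downscale(next(sub_stream([frame], first_roi)), 2)
composed = overlay(second_image, first_inset, 0, 0)
```

In this sketch the second region covers less of the field of view than the first, yet its sub-stream forms the main picture while the downscaled first sub-stream is the inset.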

Regarding claim 2, 
Laska and Scanlon teach all the features of claim 1, as outlined above.
Laska further teaches displaying one video stream at a default scale level of the client device; and displaying another video stream at a predefined size and overlaid at a predefined location “picture-in-picture” (In some implementations, after performing the software zoom, a perspective window is displayed in the video monitoring UI which shows the zoomed region's location relative to the first video feed “e.g., picture-in-picture window”.  FIG. 9S, for example, shows the client device 504 displaying a perspective box 969 in the first region 903, which indicates the zoomed-in portion 970 relative to the full field of view of the respective camera [0354]).

Regarding claim 4, 
Laska and Scanlon teach all the features of claim 1, as outlined above.
Laska further teaches wherein one video stream appears zoomed-in as compared to the other video stream (In some implementations, after performing the software zoom, a perspective window is displayed in the video monitoring UI which shows the zoomed region's location relative to the first video feed “e.g., picture-in-picture window”.  FIG. 9S, for example, shows the client device 504 displaying a perspective box 969 in the first region 903, which indicates the zoomed-in portion 970 relative to the full field of view of the respective camera [0354]).

Regarding claim 5, 
Laska and Scanlon teach all the features of claim 1, as outlined above.
Laska did not explicitly teach the event recognition includes visual event recognition, motion event recognition, or audio event detection.  
Scanlon teaches the event recognition includes visual event recognition, motion event recognition, or audio event detection (detect targets or events of interest [0005]).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).

Regarding claim 6, 
Laska and Scanlon teach all the features of claim 5, as outlined above.
Laska further teaches real-time video stream ([0014][0046][0156][0157][0171]).
Laska did not explicitly teach the event recognition is the visual event recognition, and wherein the visual event recognition comprises: recognizing a visual element.
Scanlon teaches the event recognition is the visual event recognition, and wherein the visual event recognition comprises: recognizing a visual element ([0005][0048]).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).

Regarding claim 7, 
Laska and Scanlon teach all the features of claim 6, as outlined above.
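Laska did not explicitly teach recognizing the visual element in the video stream comprises: recognizing a person.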
Scanlon teaches recognizing the visual element in the video stream comprises: recognizing a person (“a person” [0014]; “human” [0044][0051][0058][0059]; “people” [0048][0064]).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).

Regarding claim 8, 
Laska and Scanlon teach all the features of claim 7, as outlined above.
Laska did not explicitly teach recognizing the person comprises: recognizing one or more of a face of the person, a gait of the person, or clothing or a uniform of the person.  
Scanlon teaches recognizing the person comprises: recognizing one or more of a face of the person, a gait of the person, or clothing or a uniform of the person (For example, if identification of human targets in the scene is desired, then the vision module may perform facial recognition algorithms to determine such information [0044]).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).

Regarding claim 9, 
Laska and Scanlon
Laska did not explicitly teach wherein the second identified region of interest corresponds to the person.  
Scanlon teaches wherein the second identified region of interest corresponds to the person (“a person” [0014]; “human” [0044][0051][0058][0059]; “people” [0048][0064]; two regions of interest Fig. 1).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Scanlon to the teachings of Laska. The motivation for such an addition would be to automatically detect targets or events of interest and extract one or more secondary video streams from the primary video stream that provide enhanced image data detailing the targets or events of interest detected (Scanlon [0005]).

Regarding claim 10, 
Laska and Scanlon teach all the features of claim 1, as outlined above.
Laska teaches panning, tilting, or zooming to follow the region of interest (changing the pan and tilt angles of the camera 118 [0140]; In some implementations, the server changes the stored video settings (e.g., tilt, pan, and zoom settings) for the respective camera according to the zoom command.  In response to receiving the zoom command [0207]; The electronic device sends “1620” a command to the camera to perform a hardware zoom function on the respective portion according to the current zoom magnification [0358]).  

Claims 11-12 and 14-20 “system” are rejected under the same reasoning as claims 1-2 and 4-10 “method”, where Laska teaches a system and method ([0008]; Fig. 5).

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Laska and Scanlon, in view of Gouda et al. (US 20160323532 A1) hereinafter Gouda.
Regarding claim 3, 
Laska and Scanlon teach all the features of claim 1, as outlined above.
Laska did not explicitly teach detecting that the display of the video sub-stream obstructs motion in the display of the video stream; and based on the detecting, shifting the location of the displayed video sub-stream.
Gouda teaches detecting that the display of the video sub-stream obstructs motion in the display of the video stream; and based on the detecting, shifting the location of the displayed video sub-stream (a step of observing a staying situation of a moving object appearing on the monitored moving image and acquiring stay information indicating the staying situation of the moving object; a step of generating the sub-image; controlling an arrangement position of the sub-image on the monitored moving image based on the stay information; and a step of generating the monitoring moving image in which the sub-image is composed on the monitored moving image based on the arrangement position of the sub-image determined by the step and outputting the monitoring moving image on the display device [0012]; Fig. 3A and 3B and [0093]-[0095] Fig. 8B).
It would have been obvious to one having ordinary skill in the art before the effective filing date to add the teachings of Gouda to the teachings of Laska and Scanlon. The motivation for such an addition would be to arrange the sub-image at the position different from the designated area on the monitored moving image so as to overcome hindrance of the monitoring operation (Gouda [0003]).
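For illustration only, and not as a description of Gouda's actual method, the idea of shifting an overlaid sub-stream once it is found to cover motion might be sketched as follows. The names intersects and place_overlay are hypothetical, motion is assumed to arrive as bounding boxes of the form (x, y, width, height), and the choice of the four display corners as candidate positions is an assumption made for the sketch.

```python
def intersects(a, b):
    """True when rectangles a and b, each (x, y, w, h), overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_overlay(current, size, view, motion_boxes):
    """Return a top-left position for the overlay.

    The current position is kept while the overlay covers no motion.
    Otherwise the first display corner that is free of motion is chosen,
    and the current position is kept if every corner is obstructed.
    """
    w, h = size
    view_w, view_h = view

    def obstructed(pos):
        return any(intersects((pos[0], pos[1], w, h), box) for box in motion_boxes)

    if not obstructed(current):
        return current
    corners = [
        (0, 0),                    # top-left
        (view_w - w, 0),           # top-right
        (0, view_h - h),           # bottom-left
        (view_w - w, view_h - h),  # bottom-right
    ]
    for corner in corners:
        if not obstructed(corner):
            return corner
    return current


# A 20x20 overlay in the top-left corner of a 100x80 display is covered by
# motion near the origin, so it is shifted to the top-right corner.
shifted = place_overlay((0, 0), (20, 20), (100, 80), [(5, 5, 10, 10)])
```

A real system would likely weigh where objects linger over time rather than test single-frame boxes; the corner search here only shows the detect-then-shift step.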

Claim 13 “system” is rejected under the same reasoning as claim 3 “method”, where Laska teaches a system and method ([0008]; Fig. 5).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AYMAN A ABAZA whose telephone number is (571)270-0422. The examiner can normally be reached Mon-Fri 8-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hadi Armouche, can be reached at (571)270-3618. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AYMAN A ABAZA/
Primary Examiner, Art Unit 2419