Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 10803667 B1 to Madden in view of US 2021/0117658 A1 to Taheri et al., hereinafter “Taheri”, and US 2019/0235126 A1 to Petruk et al., hereinafter “Petruk”.
Claim 1. A computer-implemented method comprising: identifying a video to display a result of an evaluation of video analysis; Madden [col. 2, lines 11-14] teaches displaying information that corresponds to configurations, rules, and results of video analytics of a monitoring system on an augmented reality device.
Madden [col. 9, lines 46-54] teaches the system 200 may be capable of image processing and video analytics of images/videos captured by the camera 130 and the augmented reality device 50 based on configurations and rules of video analytics as described in regard to FIG. 1. The analyzed data or recorded image/video may be transmitted to the augmented reality device 50 to be displayed overlaid on what the augmented reality device 50 presently views. 
identifying a particular time in the video when a video analysis determination does not match a ground truth determination for the video; Madden [col. 5, lines 64-col. 6, line 8] teaches as depicted in FIG. 1, when the augmented reality device 50 views the front yard area of the property 101, the augmented reality device 50 displays visualizations 62, 64, and 66 that represent an event in which a groundhog is inside the front yard area in the monitoring zone 52 taking a path at 8:35 AM. The visualization 62 is an image of the groundhog or a shape that indicates the event and a location (e.g., an end point) of the event. The visualization 64 shows textural information that includes identification of the detected object and the time stamp when the object was detected.
displaying an image from the particular time in the video; Madden [col. 5, lines 64-col. 6, line 8] teaches as depicted in FIG. 1, when the augmented reality device 50 views the front yard area of the property 101, the augmented reality device 50 displays visualizations 62, 64, and 66 that represent an event in which a groundhog is inside the front yard area in the monitoring zone 52 taking a path at 8:35 AM. The visualization 62 is an image of the groundhog or a shape that indicates the event and a location (e.g., an end point) of the event. The visualization 64 shows textural information that includes identification of the detected object and the time stamp when the object was detected. 
Madden [col. 16, lines 37-43] teaches the process 300 may include determining that the area corresponds to an event or a configuration of a monitoring system (330). For instance, the application server 160 may determine the corresponding events or configurations based on attributes such as coordinates or location information, an area, and a time associated with the events or the configurations.
While Madden teaches comparing video analytics to ground truth, Taheri, in the field of object detection in video, teaches displaying an indication that the video analysis determination does not match the ground truth determination for the video. Madden [col. 14, line 65-col. 15, line 15] teaches the application server 160 may store the reference images for the portions of the property 101. The application server 160 then may compare the image obtained from the augmented reality device 50 with the reference images. The comparison may produce a plurality of results that indicate whether the image obtained from the augmented reality device 50 matches the reference images.
Taheri [0058] teaches the process 350 includes determining that detection of the object in the particular frame was a false positive detection (358). In some examples, determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detection of the object was a false positive detection.
Taheri [0048] teaches based on detections that the model generates in a forward pass, positive and negative error is calculated 316 to penalize the model for false positive 311 detections, and to boost true positive 313 detections. The error is compared 318 to a threshold (TH) to determine if parameters are to be updated 314. In an adaptive false alarm penalty scheme, when the model generates a false positive 311 detection bounding box in a forward pass, motion energy corresponding to the falsely detected bounding box is calculated 310. The motion energy can be defined as the average of pixel values of pixels of the detected bounding box on the frame difference image 122. Examiner interprets positive or negative error to be the indication.
Taheri [0052] teaches the process 350 includes generating a representation of a difference between two frames of a video (352). The two frames include sequential image frames of the video. The video can be captured by a camera, e.g., the doorbell camera 102. In some examples, the two frames can include a current frame and a previous frame of the video. In some examples, the two frames can include a current frame and a subsequent frame of the video.
Taheri [0053] teaches the representation of the difference between the two frames of the video can include a single-channel grayscale image. In some examples, the system can compare pixel values of the two frames to produce a color difference image. The system can then convert the color difference image to a grayscale image, e.g., frame difference image 122. In some examples, the system can convert the color difference image to grayscale image based on luma values or luminance values of the pixels of the difference image.
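For illustration only, and not as part of the cited disclosure, the frame difference representation described in Taheri [0052]-[0053] can be sketched as follows. The function names, the nested-list frame layout, and the Rec. 601 luma weights are assumptions introduced here; the reference states only that the conversion to grayscale may be based on luma or luminance values.

```python
# Illustrative sketch only; names and data layout are hypothetical.
# A frame is a list of rows, each row a list of (R, G, B) tuples in 0-255.

def color_difference(frame_a, frame_b):
    """Per-pixel absolute RGB difference between two equally sized frames."""
    return [
        [tuple(abs(a - b) for a, b in zip(pixel_a, pixel_b))
         for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def to_grayscale(color_diff):
    """Collapse a color difference image to one channel.

    Assumption: Rec. 601 luma weights are used for the conversion.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in color_diff
    ]


def frame_difference_image(previous_frame, current_frame):
    """Single-channel grayscale image of the difference between two frames."""
    return to_grayscale(color_difference(previous_frame, current_frame))


# Example: the first pixel is unchanged, the second pixel darkens by 100.
previous_frame = [[(10, 10, 10), (200, 200, 200)]]
current_frame = [[(10, 10, 10), (100, 100, 100)]]
difference = frame_difference_image(previous_frame, current_frame)
```

In this sketch an unchanged pixel maps to 0 and a changed pixel maps to a positive value, which is the property a downstream motion energy calculation would rely on.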
Taheri [0054] teaches the process 350 includes providing, to an object detector, a particular frame of the two frames and the representation of the difference between two frames of the video (354). The particular frame can include pixel values, e.g., RGB values, of each pixel in the particular frame. [0056]
Taheri [0059] teaches determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detected object was not a human. For example, training data may include sets of images including the two frames. The training data may also include, for each image, ground truth specifying whether or not an object is present in the image. The ground truth may also specify whether or not the detected object is a human. The ground truth may also specify a location, size, and shape of the detected object. The doorbell camera 102 may determine, based on the ground truth, that the detected object was not a human, e.g., that the detected object was a tree, pole, vehicle, animal, or other non-human object. Based on determining that the detected object was not a human, the doorbell camera 102 can classify the detection as a false positive detection.
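For illustration only, the claimed step of identifying a particular time at which a video analysis determination does not match a ground truth determination can be sketched as a per-timestamp comparison. The dictionary layout (timestamp in seconds mapped to a boolean "human detected" label) and the function name are hypothetical and do not come from Madden, Taheri, or Petruk.

```python
# Illustrative sketch only; data layout and names are hypothetical.

def find_mismatch_times(analysis, ground_truth):
    """Return sorted timestamps where the analysis label differs from ground truth.

    Timestamps that have no analysis result are skipped.
    """
    return sorted(
        t for t, truth in ground_truth.items()
        if t in analysis and analysis[t] != truth
    )


def mismatch_kind(analysis_label, truth_label):
    """Name the kind of mismatch for a pair of labels that differ."""
    return "false positive" if analysis_label and not truth_label else "false negative"


# Example: the analysis reports a human at t = 1.0 s that ground truth does not.
analysis = {0.0: False, 1.0: True, 2.0: True}
ground_truth = {0.0: False, 1.0: False, 2.0: True}
mismatches = find_mismatch_times(analysis, ground_truth)
kinds = [mismatch_kind(analysis[t], ground_truth[t]) for t in mismatches]
```

Each returned timestamp identifies a frame that could be displayed together with an indication of the mismatch.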
Petruk [0009], in the field of surveillance (object detection), teaches a difference between the second image and the first image; requesting, by the processor, a display to present the second image with the difference being marked.
Thus, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the teachings of Madden and Taheri to improve detection accuracy (Taheri [0009]), and further with the teachings of Petruk to correctly track objects (Petruk [0006]).
Claim 2. Taheri further teaches wherein displaying the image from the particular time in the video comprises generating a graphical user interface for presentation on a display of a computing device. Taheri [0092] teaches the central alarm station server 470 is connected to multiple terminals 472 and 474. The terminals 472 and 474 may be used by operators to process alerting events. For example, the central alarm station server 470 may route alerting data to the terminals 472 and 474 to enable an operator to process the alerting data. The terminals 472 and 474 may include general-purpose computers (e.g., desktop personal computers, workstations, or laptop computers) that are configured to receive alerting data from a server in the central alarm station server 470 and render a display of information based on the alerting data. For instance, the controller 412 may control the network module 414 to transmit, to the central alarm station server 470, alerting data indicating that a sensor 420 detected motion from a motion sensor via the sensors 420. The central alarm station server 470 may receive the alerting data and route the alerting data to the terminal 472 for processing by an operator associated with the terminal 472. The terminal 472 may render a display to the operator that includes information associated with the alerting event (e.g., the lock sensor data, the motion sensor data, the contact sensor data, etc.) and the operator may handle the alerting event based on the displayed information.
Taheri [0094] teaches the one or more authorized user devices 440 and 450 are devices that host and display user interfaces. 
Claim 3. Madden, Taheri and Petruk further teach wherein the video analysis comprises first video analysis of the video, the method comprising displaying a result of an evaluation of second video analysis that is different from the first video analysis by: obtaining a second video analysis determination on the video at the particular time in the video; Madden [col. 5, lines 64-col. 6, line 8] teaches as depicted in FIG. 1, when the augmented reality device 50 views the front yard area of the property 101, the augmented reality device 50 displays visualizations 62, 64, and 66 that represent an event in which a groundhog is inside the front yard area in the monitoring zone 52 taking a path at 8:35 AM. The visualization 62 is an image of the groundhog or a shape that indicates the event and a location (e.g., an end point) of the event. The visualization 64 shows textural information that includes identification of the detected object and the time stamp when the object was detected.
Madden [col. 16, lines 37-43] teaches the process 300 may include determining that the area corresponds to an event or a configuration of a monitoring system (330). For instance, the application server 160 may determine the corresponding events or configurations based on attributes such as coordinates or location information, an area, and a time associated with the events or the configurations.
Madden [col. 14, line 65-col. 15, line 15] teaches the application server 160 may store the reference images for the portions of the property 101. The application server 160 then may compare the image obtained from the augmented reality device 50 with the reference images. The comparison may produce a plurality of results that indicate whether the image obtained from the augmented reality device 50 matches the reference images.
and displaying an indication of whether the second video analysis determination on the video matches the ground truth determination for the video. Madden [col. 14, line 65-col. 15, line 15] teaches the application server 160 may store the reference images for the portions of the property 101. The application server 160 then may compare the image obtained from the augmented reality device 50 with the reference images. The comparison may produce a plurality of results that indicate whether the image obtained from the augmented reality device 50 matches the reference images.
Taheri [0048] teaches based on detections that the model generates in a forward pass, positive and negative error is calculated 316 to penalize the model for false positive 311 detections, and to boost true positive 313 detections. The error is compared 318 to a threshold (TH) to determine if parameters are to be updated 314. In an adaptive false alarm penalty scheme, when the model generates a false positive 311 detection bounding box in a forward pass, motion energy corresponding to the falsely detected bounding box is calculated 310. The motion energy can be defined as the average of pixel values of pixels of the detected bounding box on the frame difference image 122. Examiner interprets positive or negative error to be the indication.
Petruk [0009] teaches a difference between the second image and the first image; requesting, by the processor, a display to present the second image with the difference being marked.
Claim 4. Madden further teaches wherein displaying the indication that the video analysis determination does not match the ground truth determination comprises displaying a depiction of a virtual line crossing. Madden [col. 2, lines 48-58] teaches the rules of video analytics may include, for example, polygons defining monitoring areas and virtual lines that are prohibited to cross in the monitoring areas. In some cases, the configurations may include the defined rules. The results of video analytics may be detected events by analyzing captured images and videos based on the configurations and rules of video analytics. For example, the results may be recorded as a video showing a moving object detected in the monitoring area defined by the rules. The results may include descriptive information such as time stamps of the events and locations of the events.
Madden [col. 4, lines 48-58] teaches the user may define a virtual tripwire 56 as a line across a portion of the front yard area of the property 101. The user may set a rule associated with the virtual tripwire 56 to alert the user and to make a noise if the monitoring system detects a human crossing the virtual tripwire 56 after 9 P.M, for instance. The virtual tripwires 56, 58 may be visualized on the augmented reality device 50 in various colors, patterns, and shades that may represent different rules from each other.
Madden [col. 18, lines 33-49] teaches the application server 160 may provide the augmented reality device 50 with screen information that includes a visualization of the configuration of the monitoring system. For instance, the visualization of the configuration represents at least one of monitoring zones 52, 54 of the property 101, virtual tripwires 56, 58 that indicate a detection boundary of the monitoring system, or a component (e.g., a camera 60) of the monitoring system installed at the property 101. Based on the screen information, the augmented reality device 50 may display the information that represents the event or the configuration in which the displayed information is overlaid on a present view of the augmented reality device 50 (410). For example, the augmented reality device 50 may display, on a display portion of the augmented reality device 50, one or more visualizations of the configuration of the monitoring system of the property 101. 
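For illustration only, a virtual line crossing of the kind Madden describes for the virtual tripwires 56, 58 can be sketched as a segment intersection test between the tripwire and an object's movement between two frames. The coordinate representation and function names are hypothetical; Madden does not specify how a crossing is computed.

```python
# Illustrative sketch only; geometry and names are hypothetical.
# Points are (x, y) tuples in image coordinates.

def _orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): -1, 0, or 1."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def crosses_tripwire(prev_pos, curr_pos, wire_start, wire_end):
    """True if the move from prev_pos to curr_pos strictly crosses the tripwire.

    Only proper intersections count; touching an endpoint or moving along
    the line is not treated as a crossing in this sketch.
    """
    side_prev = _orientation(wire_start, wire_end, prev_pos)
    side_curr = _orientation(wire_start, wire_end, curr_pos)
    side_start = _orientation(prev_pos, curr_pos, wire_start)
    side_end = _orientation(prev_pos, curr_pos, wire_end)
    return side_prev * side_curr < 0 and side_start * side_end < 0


# Example: a horizontal tripwire and an object that steps over it.
wire = ((0, 0), (10, 0))
crossed = crosses_tripwire((5, -1), (5, 1), *wire)
```

A depiction of the crossing could then be drawn from the tripwire endpoints and the two object positions that produced a positive result.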
Claim 5. Taheri further teaches wherein displaying the indication that the video analysis determination does not match the ground truth determination comprises displaying a depiction of a bounding box. Taheri [0028] teaches the indication that the human detector detected a human in the four-channel image 130 can be a bounding box around the human detected in the image. The bounding box indicates bounds of a location of the detected human within the image. The bounding box can approximate the outline of the detected human.
Claim 6. Taheri further teaches wherein the bounding box is positioned around a region of the image where the video analysis determination or the ground truth determination detected an object. Taheri [0028] teaches the indication that the human detector detected a human in the four-channel image 130 can be a bounding box around the human detected in the image. The bounding box indicates bounds of a location of the detected human within the image. The bounding box can approximate the outline of the detected human.
Taheri [0048] teaches based on detections that the model generates in a forward pass, positive and negative error is calculated 316 to penalize the model for false positive 311 detections, and to boost true positive 313 detections. The error is compared 318 to a threshold (TH) to determine if parameters are to be updated 314. In an adaptive false alarm penalty scheme, when the model generates a false positive 311 detection bounding box in a forward pass, motion energy corresponding to the falsely detected bounding box is calculated 310. The motion energy can be defined as the average of pixel values of pixels of the detected bounding box on the frame difference image 122. In some examples, the pixels of the detected bounding box include pixels that make up boundaries of the bounding box. In some examples, the pixels of the detected bounding box include only pixels inside the bounds of the bounding box.
Taheri [0049] teaches since it is desirable to differentiate humans with higher motion energy from objects with lower motion energy, a higher penalty is assigned to the false positive 311 detections with lower motion energy. For example, stationary objects such as sign posts, mailboxes, and statues may be identified as humans by the human detector. In other examples, moving objects such as animals may be identified as humans by the human detector. The penalties assigned for false alarms caused by stationary objects are larger than the penalties assigned for false alarms caused by moving objects. Therefore, penalties assigned are inversely proportional to the motion energy of the false alarm bounding boxes. The penalties are used as the backpropagation error to update 314 model parameters. 
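For illustration only, the inverse relationship between penalty and motion energy described in Taheri [0049] can be sketched as below. The scale factor and the small epsilon guarding against division by zero are assumptions; the reference states the proportionality but not a specific formula.

```python
# Illustrative sketch only; the formula's constants are hypothetical.

def false_alarm_penalty(motion_energy, scale=1.0, epsilon=1e-6):
    """Penalty that decreases as the motion energy of a false alarm box increases.

    Assumption: penalty = scale / (motion_energy + epsilon).
    """
    return scale / (motion_energy + epsilon)


# Example: a stationary object (low motion energy) versus a moving animal.
stationary_penalty = false_alarm_penalty(2.0)
moving_penalty = false_alarm_penalty(40.0)
```

Under this sketch the false alarm caused by the stationary object receives the larger penalty, consistent with the ordering the reference describes.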
Taheri [0057] teaches the process 350 includes receiving an indication that the object detector detected an object in the particular frame (356). In some examples, the indication that the object detector detected the object includes a bounding box, e.g., bounding box 142, that indicates bounds of a location of the detected object. For example, the system can generate a bounding box that outlines a shape of the detected human. In some examples, the bounding box may be a rectangular, square, or elliptical bounding box. In some examples, the bounding box can approximate the outline of the detected human.
Taheri [0058] teaches the process 350 includes determining that detection of the object in the particular frame was a false positive detection (358). In some examples, determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detection of the object was a false positive detection.
Taheri [0059] teaches determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detected object was not a human. For example, training data may include sets of images including the two frames. The training data may also include, for each image, ground truth specifying whether or not an object is present in the image. The ground truth may also specify whether or not the detected object is a human. The ground truth may also specify a location, size, and shape of the detected object. The doorbell camera 102 may determine, based on the ground truth, that the detected object was not a human, e.g., that the detected object was a tree, pole, vehicle, animal, or other non-human object. Based on determining that the detected object was not a human, the doorbell camera 102 can classify the detection as a false positive detection.
Taheri [0060] teaches the system can determine that the detection of the object was a false positive detection by comparing the detected bounding box to the ground truth bounding box. The system may determine an amount of overlap between the detected bounding box and the ground truth bounding box. If the overlap of the bounding boxes meets criteria, the system can determine that the detection was a true positive detection. If the overlap of the bounding boxes does not meet criteria, the system can determine that the detection was a false positive detection. The criteria can include a threshold amount of overlap between the detected bounding box and the ground truth bounding box. For example, the criteria can include a threshold of 70% overlap between the detected bounding box and the ground truth bounding box. If the overlap is less than 70%, the doorbell camera 102 can determine that the detection was a false positive detection. If the overlap is greater than 70%, the doorbell camera 102 can determine that the detection was a true positive detection.
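For illustration only, the overlap comparison described in Taheri [0060] can be sketched as below. Two assumptions are made: overlap is measured as intersection over union, which the reference does not specify, and an overlap of exactly 70% is treated as meeting the criteria, a case the reference leaves open. Box coordinates and function names are hypothetical.

```python
# Illustrative sketch only; the overlap metric and names are assumptions.
# A box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.

def box_area(box):
    """Area of a box; degenerate or inverted boxes have zero area."""
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def overlap_ratio(detected, ground_truth):
    """Intersection over union of two boxes, in the range 0.0 to 1.0."""
    intersection = box_area((
        max(detected[0], ground_truth[0]),
        max(detected[1], ground_truth[1]),
        min(detected[2], ground_truth[2]),
        min(detected[3], ground_truth[3]),
    ))
    union = box_area(detected) + box_area(ground_truth) - intersection
    return intersection / union if union > 0 else 0.0


def classify_detection(detected, ground_truth, threshold=0.70):
    """Label a detection by comparing its overlap with the ground truth box."""
    if overlap_ratio(detected, ground_truth) >= threshold:
        return "true positive"
    return "false positive"


# Example: one detection that closely matches ground truth, one that does not.
close_match = classify_detection((0, 0, 10, 10), (0, 0, 10, 8))
poor_match = classify_detection((0, 0, 10, 10), (5, 5, 15, 15))
```

The first example has an overlap ratio of 0.8 and the second roughly 0.14, so only the first meets the 70% threshold.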
Claim 7. Taheri further teaches wherein the bounding box is positioned around a region of the image where the video analysis determination or the ground truth determination detected motion. Taheri [0028] teaches the indication that the human detector detected a human in the four-channel image 130 can be a bounding box around the human detected in the image. The bounding box indicates bounds of a location of the detected human within the image. The bounding box can approximate the outline of the detected human.
Taheri [0060] teaches the system can determine that the detection of the object was a false positive detection by comparing the detected bounding box to the ground truth bounding box. The system may determine an amount of overlap between the detected bounding box and the ground truth bounding box. If the overlap of the bounding boxes meets criteria, the system can determine that the detection was a true positive detection. If the overlap of the bounding boxes does not meet criteria, the system can determine that the detection was a false positive detection. The criteria can include a threshold amount of overlap between the detected bounding box and the ground truth bounding box. For example, the criteria can include a threshold of 70% overlap between the detected bounding box and the ground truth bounding box. If the overlap is less than 70%, the doorbell camera 102 can determine that the detection was a false positive detection. If the overlap is greater than 70%, the doorbell camera 102 can determine that the detection was a true positive detection.
Taheri [0062] teaches the amount of motion energy where the object was detected includes average motion energy of only pixels of the bounding box. For example, the bounding box may include a boundary of pixels, e.g., a rectangular boundary, and may also include a number of pixels inside the bounding box. In some examples, the amount of motion energy where the object was detected includes average motion energy, e.g., average pixel value, of the pixels of the boundary and the pixels inside the bounding box. In some examples, the amount of motion energy where the object was detected includes average motion energy of only the pixels inside the bounding box, and does not include pixels of the boundary of the bounding box.
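For illustration only, the two motion energy variants described in Taheri [0062] (boundary pixels included, or interior pixels only) can be sketched as below. The inclusive box convention, the one-pixel boundary width, and the function name are assumptions introduced here.

```python
# Illustrative sketch only; box convention and names are hypothetical.
# diff_image is a list of rows of grayscale values; box is (x1, y1, x2, y2),
# inclusive on all sides, with a boundary assumed to be one pixel wide.

def motion_energy(diff_image, box, include_boundary=True):
    """Average pixel value of the frame difference image within the box."""
    x1, y1, x2, y2 = box
    if not include_boundary:
        # Shrink by one pixel on every side to drop the boundary pixels.
        x1, y1, x2, y2 = x1 + 1, y1 + 1, x2 - 1, y2 - 1
    values = [
        diff_image[y][x]
        for y in range(y1, y2 + 1)
        for x in range(x1, x2 + 1)
    ]
    return sum(values) / len(values) if values else 0.0


# Example: motion is concentrated in the interior of a 4x4 box.
diff_image = [
    [0, 0, 0, 0],
    [0, 40, 80, 0],
    [0, 120, 160, 0],
    [0, 0, 0, 0],
]
with_boundary = motion_energy(diff_image, (0, 0, 3, 3))
interior_only = motion_energy(diff_image, (0, 0, 3, 3), include_boundary=False)
```

The choice between the two variants changes the average noticeably when the boundary pixels carry little motion, as in this example.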
Claim 8. Taheri further teaches wherein displaying the indication that the video analysis determination does not match the ground truth determination comprises displaying a depiction of boundaries around a portion of the image that corresponds to an area of interest. Taheri [0057] teaches the process 350 includes receiving an indication that the object detector detected an object in the particular frame (356). In some examples, the indication that the object detector detected the object includes a bounding box, e.g., bounding box 142, that indicates bounds of a location of the detected object. For example, the system can generate a bounding box that outlines a shape of the detected human. In some examples, the bounding box may be a rectangular, square, or elliptical bounding box. In some examples, the bounding box can approximate the outline of the detected human.
Taheri [0058] teaches the process 350 includes determining that detection of the object in the particular frame was a false positive detection (358). In some examples, determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detection of the object was a false positive detection.
Taheri [0059] teaches determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detected object was not a human. For example, training data may include sets of images including the two frames. The training data may also include, for each image, ground truth specifying whether or not an object is present in the image. The ground truth may also specify whether or not the detected object is a human. The ground truth may also specify a location, size, and shape of the detected object. The doorbell camera 102 may determine, based on the ground truth, that the detected object was not a human, e.g., that the detected object was a tree, pole, vehicle, animal, or other non-human object. Based on determining that the detected object was not a human, the doorbell camera 102 can classify the detection as a false positive detection.
Taheri [0060] teaches the system can determine that the detection of the object was a false positive detection by comparing the detected bounding box to the ground truth bounding box. The system may determine an amount of overlap between the detected bounding box and the ground truth bounding box. If the overlap of the bounding boxes meets criteria, the system can determine that the detection was a true positive detection. If the overlap of the bounding boxes does not meet criteria, the system can determine that the detection was a false positive detection. The criteria can include a threshold amount of overlap between the detected bounding box and the ground truth bounding box. For example, the criteria can include a threshold of 70% overlap between the detected bounding box and the ground truth bounding box. If the overlap is less than 70%, the doorbell camera 102 can determine that the detection was a false positive detection. If the overlap is greater than 70%, the doorbell camera 102 can determine that the detection was a true positive detection.
Claim 9. Taheri further teaches wherein the area of interest comprises an area of a property that is monitored by a camera that captured the video. Taheri [0006] teaches a camera, e.g., a doorbell camera, can detect objects and track object movement within a field of view. For example, a doorbell camera with a field of view that includes a front yard of a property can track positions and movements of objects of interest in the front yard. Objects of interest can include, for example, humans, vehicles, and animals. The objects of interest may be moving or stationary. The doorbell camera can use video tracking to associate objects of interest in consecutive video images, or frames.
Taheri [0060] teaches the system can determine that the detection of the object was a false positive detection by comparing the detected bounding box to the ground truth bounding box. The system may determine an amount of overlap between the detected bounding box and the ground truth bounding box. If the overlap of the bounding boxes meets criteria, the system can determine that the detection was a true positive detection. If the overlap of the bounding boxes does not meet criteria, the system can determine that the detection was a false positive detection. The criteria can include a threshold amount of overlap between the detected bounding box and the ground truth bounding box. For example, the criteria can include a threshold of 70% overlap between the detected bounding box and the ground truth bounding box. If the overlap is less than 70%, the doorbell camera 102 can determine that the detection was a false positive detection. If the overlap is greater than 70%, the doorbell camera 102 can determine that the detection was a true positive detection.
Claim 10. Madden further teaches wherein displaying the indication that the video analysis determination does not match the ground truth determination comprises displaying text indicating that the video analysis determination does not match the ground truth determination. Madden [col. 2, lines 48-58] teaches the rules of video analytics may include, for example, polygons defining monitoring areas and virtual lines that are prohibited to cross in the monitoring areas. In some cases, the configurations may include the defined rules. The results of video analytics may be detected events by analyzing captured images and videos based on the configurations and rules of video analytics. For example, the results may be recorded as a video showing a moving object detected in the monitoring area defined by the rules. The results may include descriptive information such as time stamps of the events and locations of the events.
Madden [col. 18, lines 4-18] teaches the augmented reality device 50 may, based on reception of the information from the application server 160, display a visualization of an object detected during the event at the area of the property or a visualization of the configuration of the monitoring system. For instance, the visualization of the object may include at least one of an image of the object captured during the event, a video recorded during the event, one or more frames of the video (e.g., still images), a path taken by the object, or a graphical object or text that indicates occurrence of the event or an identification of the object. The visualization of the configuration may represent at least one of monitoring zones of the property, a virtual tripwire that indicates a detection boundary of the monitoring system, or a component of the monitoring system installed at the property.
Madden [col. 19, lines 24-41] teaches FIG. 5 illustrates an example of using an augmented reality device to view multiple events that occurred at a monitored property. For example, the augmented reality device 50 presently views an area of a property, and displays, on the present view of the area, various visualizations such as a monitoring zone 52, a virtual tripwire 56, images 502, 504, and 506, paths 510 and 512 between the images, and multiple text 514, 516, and 518. The images 502, 504, and 506 indicate prior events that correspond to detections of a groundhog at event locations in the monitoring zone 52. The paths 510 and 512 between the images indicate paths that the groundhog took to move from one event location to another event location detected during the prior events. The text 514, 516, and 518 indicate a type of the detected object (e.g., groundhog) and event times corresponding to the detections of the object. In some cases, the paths 510 and 512 may represent paths estimated based on the event locations and event times.
Claim 11. Madden further teaches wherein the indication that the video analysis determination does not match the ground truth determination for the video includes a user-selectable icon. Madden [col. 5, lines 6-16] teaches the augmented reality device 50 may visualize connected components such as cameras and sensors installed at an area of the property 101 when the augmented reality device 50 views the area. For instance, the visualizations may include graphical icons overlaid on a field of view of the augmented reality device 50, where the icons represent positions of the components in the field of view of the augmented reality device 50. As depicted in FIG. 1, the augmented reality device 50 visualizes a camera 60 installed at a side wall of the property 101 as an icon representing the camera, for instance. In some cases, the visualizations may be a shape such as a circle around the camera in the present view instead of showing the icon representing the camera.
Madden [col. 5, lines 17-29] teaches in examples where the cameras or sensors are installed in a hidden area or have a small form factor, the visualizations with blown-up icons can help the user to easily recognize the locations of the cameras and sensors in the augmented reality device 50. In some examples, the augmented reality device 50 may be capable of controlling the connected components. For instance, the user may be able to determine an activation status of the camera 60 based on the visualization (e.g., a green/red color of the circle around the camera icon) and turn on or off the camera 60 by selecting the visualization through the augmented reality device 50. The user may adjust pan and tilt angles of the camera 60 by dragging the camera icon or by selecting a submenu item that may be provided when the user clicks or touches the camera icon.
Madden [col. 17, lines 17-31] teaches the process 300 may include in response to determining that the area corresponds to the event or the configuration, providing information representing the event or the configuration for display on the augmented reality device 50 (340). For instance, the application server 160 may provide information to the augmented reality device 50, which then generates visualizations representing the determined event or configuration for display in the field of view of the augmented reality device 50. The augmented reality device 50 may display the generated visualizations through a display device of the augmented reality device 50. The augmented reality device 50 may display a visualization such as an icon or a pop-up window indicating the events so that the user can select to view an image/video associated with the events.
Claim 12. Madden further teaches the method comprising: in response to a user selecting the user-selectable icon, displaying video analysis results for the particular time in the video. Madden [col. 5, lines 48-63] teaches the data visualized may represent textural information associated with the event and geometrical information associated with the event. In some implementations, the visualizations may include various graphical, textual, audio, and video information to represent configurations, rules, and events. In some examples, the visualizations may include lines or curves corresponding to a path taken by a moving object which has been recorded as an event in the monitoring area. In other instances, the visualizations may include overlaid text that represent information associated with the event such as a time of the event and a kind of the moving object.
Madden [col. 6, lines 9-26] teaches the visualizations may be associated with additional data such as recorded video or still images of the event and a list of similar events. For instance, when the user selects the visualization 62 by clicking or touching the image of the groundhog, the recorded video of the event may be displayed on the augmented reality device 50 and overlaid on the area that the augmented reality device 50 presently views so that the user may think as the event presently occurs at the area. The monitoring system may provide guidance to lead the user to the location of an event when the location is not visible in the augmented reality device 50. For example, when the user selects an event from a list of events which occurred on the other side of the property 101, the augmented reality device 50 may display visualizations such as icons showing the user which direction the event location is in, which way to turn, which direction or path to take to get to the location, or how far away the event location is from the user.
Madden [col. 17, lines 17-31] teaches the process 300 may include in response to determining that the area corresponds to the event or the configuration, providing information representing the event or the configuration for display on the augmented reality device 50 (340). For instance, the application server 160 may provide information to the augmented reality device 50, which then generates visualizations representing the determined event or configuration for display in the field of view of the augmented reality device 50. The augmented reality device 50 may display the generated visualizations through a display device of the augmented reality device 50. The augmented reality device 50 may display a visualization such as an icon or a pop-up window indicating the events so that the user can select to view an image/video associated with the events.
Claim 13. Taheri further teaches wherein identifying the particular time in the video when the video analysis determination does not match the ground truth determination for the video comprises identifying the particular time in the video when one or more of a false positive motion detection event, a false negative motion detection event, a false positive object detection event, a false negative object detection event, or a false object classification event occurred. Taheri [0058] teaches the process 350 includes determining that detection of the object in the particular frame was a false positive detection (358). In some examples, determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detection of the object was a false positive detection.
Taheri [0059] teaches determining that detection of the object was a false positive detection includes determining, based on ground truth specified by training data, that the detected object was not a human. For example, training data may include sets of images including the two frames. The training data may also include, for each image, ground truth specifying whether or not an object is present in the image. The ground truth may also specify whether or not the detected object is a human. The ground truth may also specify a location, size, and shape of the detected object. The doorbell camera 102 may determine, based on the ground truth, that the detected object was not a human, e.g., that the detected object was a tree, pole, vehicle, animal, or other non-human object. Based on determining that the detected object was not a human, the doorbell camera 102 can classify the detection as a false positive detection.
Taheri [0060] teaches the system can determine that the detection of the object was a false positive detection by comparing the detected bounding box to the ground truth bounding box. The system may determine an amount of overlap between the detected bounding box and the ground truth bounding box. If the overlap of the bounding boxes meets criteria, the system can determine that the detection was a true positive detection. If the overlap of the bounding boxes does not meet criteria, the system can determine that the detection was a false positive detection. The criteria can include a threshold amount of overlap between the detected bounding box and the ground truth bounding box. For example, the criteria can include a threshold of 70% overlap between the detected bounding box and the ground truth bounding box. If the overlap is less than 70%, the doorbell camera 102 can determine that the detection was a false positive detection. If the overlap is greater than 70%, the doorbell camera 102 can determine that the detection was a true positive detection.
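For illustration only, the bounding box comparison described in Taheri [0060] can be sketched as follows. Taheri does not specify how the amount of overlap is measured; this sketch assumes intersection-over-union, and the names `Box`, `overlap_ratio`, and `classify_detection` are hypothetical. Taheri also does not state the outcome at exactly 70% overlap; the sketch treats that case as meeting the criteria.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box with (x1, y1) top-left and (x2, y2) bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def _area(b: Box) -> float:
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def overlap_ratio(detected: Box, truth: Box) -> float:
    """Intersection-over-union of two boxes (assumed overlap measure)."""
    iw = max(0.0, min(detected.x2, truth.x2) - max(detected.x1, truth.x1))
    ih = max(0.0, min(detected.y2, truth.y2) - max(detected.y1, truth.y1))
    inter = iw * ih
    union = _area(detected) + _area(truth) - inter
    return inter / union if union > 0 else 0.0


def classify_detection(detected: Box, truth: Box, threshold: float = 0.70) -> str:
    """Label a detection by comparing it with the ground truth box.

    The 0.70 default mirrors the 70% example in Taheri [0060].
    """
    if overlap_ratio(detected, truth) >= threshold:
        return "true_positive"
    return "false_positive"
```

Under these assumptions, a detected box that coincides with the ground truth box is labeled a true positive, while a box shifted by half its width (an overlap ratio of one third) is labeled a false positive.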
Claim 14. Madden further teaches wherein identifying the particular time in the video when the video analysis determination does not match the ground truth determination for the video comprises identifying the particular time in the video when a motion detection latency occurred. Madden [col. 5, lines 48-63] teaches the data visualized may represent textural information associated with the event and geometrical information associated with the event. In some implementations, the visualizations may include various graphical, textual, audio, and video information to represent configurations, rules, and events. In some examples, the visualizations may include lines or curves corresponding to a path taken by a moving object which has been recorded as an event in the monitoring area. In other instances, the visualizations may include overlaid text that represent information associated with the event such as a time of the event and a kind of the moving object.
Claim 15. Madden further teaches wherein the motion detection latency comprises motion detection at a time in the video that is greater than a threshold time duration after ground truth motion detection. Madden [col. 5, line 64-col. 6, line 8] teaches as depicted in FIG. 1, when the augmented reality device 50 views the front yard area of the property 101, the augmented reality device 50 displays visualizations 62, 64, and 66 that represent an event in which a groundhog is inside the front yard area in the monitoring zone 52 taking a path at 8:35 AM. The visualization 62 is an image of the groundhog or a shape that indicates the event and a location (e.g., an end point) of the event. The visualization 64 shows textural information that includes identification of the detected object and the time stamp when the object was detected.
Madden [col. 20, lines 48-54] teaches, referring to FIG. 1, the image shown in the augmented reality device 50 may be associated with a type attribute, for example, “animal” or “groundhog.” The application server 160 may search the database to determine one or more prior events that have the type attribute “animal” or “groundhog” and that occurred within a predetermined time window.
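For illustration only, the limitation recited in claim 15 (motion detection at a time that is greater than a threshold time duration after ground truth motion detection) can be sketched as a comparison of time stamps. The sketch is not taken from Madden or from the application; the function names, the use of seconds as the unit, and the 2.0 second default threshold are assumptions.

```python
from typing import List, Optional


def detection_latency(truth_time: float,
                      detection_times: List[float]) -> Optional[float]:
    """Delay from the ground truth time to the first detection at or after it.

    Returns None when no detection follows the ground truth time.
    """
    later = [t for t in detection_times if t >= truth_time]
    return min(later) - truth_time if later else None


def is_latency_event(truth_time: float,
                     detection_times: List[float],
                     threshold_s: float = 2.0) -> bool:
    """True when the first detection lags the ground truth by more than threshold_s."""
    latency = detection_latency(truth_time, detection_times)
    return latency is not None and latency > threshold_s
```

Under these assumptions, a ground truth motion time of 10.0 s with a first detection at 13.5 s yields a latency of 3.5 s, which exceeds the 2.0 s threshold. When no detection follows the ground truth time, the sketch reports no latency event, since that case corresponds to a missed detection rather than a late one.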
Claim 16. Madden further teaches wherein identifying the particular time in the video when the video analysis determination does not match the ground truth determination for the video comprises identifying the particular time in the video when an object detection latency occurred. Madden [col. 5, line 64-col. 6, line 8] teaches as depicted in FIG. 1, when the augmented reality device 50 views the front yard area of the property 101, the augmented reality device 50 displays visualizations 62, 64, and 66 that represent an event in which a groundhog is inside the front yard area in the monitoring zone 52 taking a path at 8:35 AM. The visualization 62 is an image of the groundhog or a shape that indicates the event and a location (e.g., an end point) of the event. The visualization 64 shows textural information that includes identification of the detected object and the time stamp when the object was detected.
Madden [col. 19, lines 24-41] teaches FIG. 5 illustrates an example of using an augmented reality device to view multiple events that occurred at a monitored property. For example, the augmented reality device 50 presently views an area of a property, and displays, on the present view of the area, various visualizations such as a monitoring zone 52, a virtual tripwire 56, images 502, 504, and 506, paths 510 and 512 between the images, and multiple text 514, 516, and 518. The images 502, 504, and 506 indicate prior events that correspond to detections of a groundhog at event locations in the monitoring zone 52. The paths 510 and 512 between the images indicate paths that the groundhog took to move from one event location to another event location detected during the prior events. The text 514, 516, and 518 indicate a type of the detected object (e.g., groundhog) and event times corresponding to the detections of the object. In some cases, the paths 510 and 512 may represent paths estimated based on the event locations and event times.
Claim 17. Taheri further teaches wherein the object detection latency comprises object detection at a time in the video that is greater than a threshold time duration after ground truth object detection. Taheri [0060] teaches a human detector may generate false alarms due to movement of objects such as shrubs, human shadows, flags, etc. The human detector can use motion difference image information to reduce the false alarms. For example, based on motion difference information, the human detector can exclude object motion that has less than a threshold of motion energy. Specifically, the human detector can evaluate the frame difference image 122 component of the four-channel image 130 to differentiate motion energy corresponding to human movement from motion energy corresponding to object movement. Additionally, the human detector can evaluate the frame difference image 122 component of the four-channel image 130 to differentiate motion energy corresponding to human movement from motion energy corresponding to stationary objects. This can reduce false alarms that may be caused by stationary objects.
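For illustration only, the motion energy filtering described in the Taheri passage above can be sketched as follows. Taheri does not define how motion energy is computed; this sketch assumes the mean absolute pixel difference inside a candidate bounding box, and the function names, the grayscale list-of-rows image format, and the default threshold of 10.0 are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]          # grayscale image as rows of pixel values
Region = Tuple[int, int, int, int]  # (x1, y1, x2, y2), end-exclusive


def frame_difference(prev: Image, curr: Image) -> Image:
    """Per-pixel absolute difference between two frames of equal size."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def motion_energy(diff: Image, box: Region) -> float:
    """Mean absolute difference inside the box (assumed energy measure)."""
    x1, y1, x2, y2 = box
    region = [row[x1:x2] for row in diff[y1:y2]]
    count = sum(len(row) for row in region)
    return sum(sum(row) for row in region) / count if count else 0.0


def passes_motion_filter(prev: Image, curr: Image, box: Region,
                         min_energy: float = 10.0) -> bool:
    """Keep a candidate detection only if its motion energy reaches min_energy."""
    return motion_energy(frame_difference(prev, curr), box) >= min_energy
```

Under these assumptions, a candidate box over a region whose pixels changed substantially between frames is kept, while a candidate box over an unchanged (stationary) region is excluded.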
Claim 18. Madden further teaches wherein identifying the video to display the result of the evaluation of video analysis comprises selecting a video clip from a database of stored video clips. Madden [col. 14, line 65-col. 15, line 15] teaches the application server 160 may store the reference images for the portions of the property 101. The application server 160 then may compare the image obtained from the augmented reality device 50 with the reference images. The comparison may produce a plurality of results that indicate whether the image obtained from the augmented reality device 50 matches the reference images.
Claim 19. It differs from claim 1 in that it is a system performing the method of claim 1. Therefore claim 19 has been analyzed and reviewed in the same way as claim 1. See the above analysis. 
Claim 20. It differs from claim 1 in that it is a non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform the method of claim 1. Therefore claim 20 has been analyzed and reviewed in the same way as claim 1. See the above analysis.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2003/0185340 A1 to Frantz.
[0136] In the most basic form of comparison, the two images may be overlaid one upon another, possibly in different color schemes, wherein the user may visually scan for differences between the images. Any difference that is found may be manually inspected further with the hand mirror device of the prior art. In a more sophisticated analysis, the software may find the feature differences and highlight them on the display, such as by changing the color of the areas of difference, flashing the areas of difference, or placing a cursor or pointer (120) near the areas of difference, as shown in FIG. 13.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571)272-1681. The examiner can normally be reached 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at (571)272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DELOMIA L GILLIARD/Primary Examiner, Art Unit 2661