Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim Objections
Claims 2, 3, 6 and 9 recite the term “sufficiently”. The term “sufficiently” is interpreted broadly. The examiner suggests amending the claim limitations to delete the term.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-6, 9 and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang.
Claim 1. A computer-implemented method comprising: obtaining images captured by a camera, the images depicting an area of a property; Zhang [0012] teaches obtaining, by the one or more computers, image data from a camera, the image data representing an image of a monitored area;

providing two or more images of the images to a machine learning model; Zhang [0012] teaches providing, by the one or more computers and to one or more machine learning models, input data that is based on the image data representing the image of the monitored area, where the one or more machine learning models have been trained to detect different properties of the monitored area;

Zhang [0022] teaches obtaining second image data from the camera representing a second image of the monitored area; processing the second image data using the one or more machine learning models to detect conditions present in the monitored area

obtaining an output of the machine learning model corresponding to the two or more images; Zhang [0012] teaches receiving, by the one or more computers, output of the one or more machine learning models, the output indicating (i) one or more status classifications for the monitored area and a respective location for each of the one or more status classifications,

determining one or more potential states of the area of the property using the output of the machine learning model, each of the one or more potential states corresponding to an image in the two or more images; Zhang [0011] teaches the system can store and provide images of the monitored area at different times corresponding to the records, such as images of the area at the time the condition was detected, at one or more subsequent times (such as after notification to address the condition has been issued),

Zhang [0012] teaches the output indicating (i) one or more status classifications for the monitored area and a respective location for each of the one or more status classifications, or (ii) whether the image data shows a state that is inconsistent with normal or expected states of the monitored area;


Zhang [0129] teaches the process includes obtaining image data (step 502), such as from one or more cameras or other sensors located to capture data about a monitored area. In general, for training, many different images are desirable, including images captured at different times and representing a variety of conditions, and including image showing desirable conditions (such as normal or expected operation or use of the monitored area) as well as images showing undesirable conditions (such as specific items that require attention or deviations or inconsistencies with respect to the typical state or use of the monitored area).

and performing an action based on the one or more potential states.  Zhang [0019] teaches the method includes generating a record for a task corresponding to the detected condition.

Zhang [0136] teaches while some implementations can train models to identify and localize specific conditions (e.g., litter on a particular table, a particular item having low stock, etc.), this level of specificity is not required. In addition or as an alternative, the training may configure a model to determine when the overall state of the monitored area is acceptable or not. For example, the model can be configured to determine whether a monitoring image is different from or deviates from a typical or desired baseline state in a manner that requires attention. The monitoring system is often used in locations and situations that involve frequent movement and changes, such as during business hours while customers coming and going and making varied and often unpredictable movements. As a result, it is generally not sufficient to merely detect movement or a change in an image compared to a prior image, as many changes in images are benign or desirable and do not reflect any need for action. As a result, the models can be trained to distinguish types of variations in images that are within the normal or expected range of conditions (e.g., image data showing different arrangements of people and food around occupied tables) from items in images that show changes that need corrective action (e.g., litter on an unattended table). Even without labeling specific conditions, models can be trained to detect the type of conditions that are inconsistent from or incompatible with a range of different acceptable conditions. This can include training the models to recognize the states and conditions that represent the range of patterns and configurations of the monitored area that are acceptable and do not need action. 
This can be done, for example, by training a neural network model based on image examples of acceptable and unacceptable states of the monitored area, without necessarily identifying the specific region or type of condition that cause one condition to need attention. Further, with training regarding the typical or baseline range of variations, changes outside the scope of these changes can be identified as needing attention, even if it is a new situation not shown in the training data. For example, although the model may not be trained to specifically identify a condition for a spilled drink on the floor, the model may nevertheless determine that an image showing a spill is different from the desired baseline state and can classify the image as needing action as a result.

It would have been obvious, before the effective filing date of the claimed invention, to one of ordinary skill in the art to modify and combine the embodiments of Zhang. One skilled in the art would have been motivated to modify the embodiments in this manner because doing so would allow different processes to achieve optimal results and would not cause significant change to the design. The embodiments address the problem that identifying and resolving unfavorable conditions can be important to create a consistent and enjoyable environment at a location (Zhang [0002]).

Claim 2. The method of claim 1, wherein determining one or more potential states of the area of the property comprises determining two or more states corresponding to the two or more images, comprising determining that the two or more states are not sufficiently similar, wherein performing the action comprises performing an action based on the two or more states not being sufficiently similar. Zhang [0011] teaches the system can store and provide images of the monitored area at different times corresponding to the records, such as images of the area at the time the condition was detected, at one or more subsequent times (such as after notification to address the condition has been issued),

Zhang [0012] teaches the output indicating (i) one or more status classifications for the monitored area and a respective location for each of the one or more status classifications, or (ii) whether the image data shows a state that is inconsistent with normal or expected states of the monitored area;

Zhang [0022] teaches obtaining second image data from the camera representing a second image of the monitored area; processing the second image data using the one or more machine learning models to detect conditions present in the monitored area, and based on processing the second image data: determining that the task has been completed based on determining that detected condition is not detected based on the second image data; or determining that the task has not been completed based on determining that the detected condition is detected based on the second image data.

Zhang [0093] teaches the machine learning models 123 can learn, from example image data, what properties or conditions are typical for a monitored area, so that the models 123 can detect when a condition occurs that differs from or is inconsistent with the typical conditions or properties of the monitored area

Zhang [0095] teaches the functionality to detect the overall image as a whole not being representative of the desired range of states for the location can be one of the ways that the system 100 can detect new conditions that were not observed during training and were not in any predefined set of classes. For example, if a table is tipped over, the training data may not have specifically shown that condition or given a status label for specifying that condition. Similarly, spills of food or drinks may have different locations, shapes, sizes, and colors that are not easy to predict or recognize. Nevertheless, for these types of conditions, the model 123 may still detect that the state of the monitored area is not in the expected state or range of variation that encompasses normal operation (and potentially expected changes, such as increased traffic, etc.), and may thus classify the image as representing a condition of the monitored area that needs attention.

Zhang [0136] teaches the model can be configured to determine whether a monitoring image is different from or deviates from a typical or desired baseline state in a manner that requires attention... As a result, the models can be trained to distinguish types of variations in images that are within the normal or expected range of conditions (e.g., image data showing different arrangements of people and food around occupied tables) from items in images that show changes that need corrective action (e.g., litter on an unattended table).

Claim 3. The method of claim 2, wherein: determining that the two or more states are not sufficiently similar comprises: for each of the two or more images, determining, from the output of the machine learning model, a confidence for one or more potential states corresponding to the respective image; for each of the two or more images, (i) selecting a state for the respective image from the one or more potential states based on the confidences corresponding to one or more potential states or (ii) determining that no state can be identified with sufficient confidence for the respective image based on the confidences corresponding to one or more potential states, wherein the two or more states are the selected states; Zhang [0030] teaches (iii) confidence scores for the identification of the objects and/or the object status classifications; applying one or more post-processing rules to the output of the one or more machine learning models to filter a list of the identified objects based on the confidence scores; evaluating the filtered list of identified objects with respect to one or more predetermined criteria to detect a condition present in the monitored area; and providing output indicating the detected condition present in the monitored area.

Zhang [0076] teaches the system 110 has been set up with the cameras 110a, 110b and a microphone 115 installed, and with the computer system 120 having trained machine learning models loaded and ready to classify and predict whether a set of predetermined conditions are present. As discussed further below, the machine learning models can be trained so that, given image data, the models can detect the locations of objects in the image data, classify the status of the objects in the image data, and provide confidence scores indicating the confidence in the detection and status classification. The system 100 can then use the output of the machine learning models to automatically create, assign, track, and otherwise manage tasks to cause any unfavorable conditions to be corrected.

Zhang [0097] teaches the models 123 may be used to determine a classification for the image of the monitored area or a specific portion of the image. This can include determining a classification for one or more properties of the monitored area or specific objects. When determining a status or classification is discussed herein, the models may do so in any of various ways. One way is indicating a specific classification decision or selection of a classification. Another way is providing a set values that indicate the respective likelihood that the classification is appropriate (e.g., a score of 0.7 for a table being dirty, a score of 0.2 for the table being occupied, and a score of 0.1 for the table being clean and vacant). The models 123 may additionally or alternatively provide regression outputs, such as a value or score along a scale or range rather than a specific classification. For example, rather than classify a shelf among discrete classifications (e.g., empty, low-stock, medium-stock, or full), the models 123 may output a score along a range indicating the stock level (e.g., 53%) or give a score indicating a likelihood or urgency of checking or correcting the stock level (e.g., a 60% confidence score that the shelf should be restocked, or a priority score of 4 on a scale of 1 to 10 indicating the importance of checking the shelf). As another example, the models 123 can provide a score for the monitoring area as a whole or for individual objects or regions within the monitored area for different properties, e.g., occupancy, cleanliness, orderliness, etc. The computer system 120 can then use the scores for these different properties and compare them to thresholds or baseline levels for the properties for the monitored area to determine if the monitored area is in a condition that requires attention or intervention.

Zhang [0098] teaches during stage (D), the computer system 120 processes the outputs of the models 123 using a post-processing module 124, which can filter or otherwise adjust and interpret the results from the models 123. For example, the module 124 may access a set of rules 125 that indicate rules and thresholds for the post-processing actions. These rules and thresholds may be different for different locations (e.g., different restaurant buildings) and for different models 123, and can even be tailored for specific cameras 110a, 110b. The post-processing module 124 actions can remove detected objects that have confidence scores less than a threshold indicated by the rule set 125. As another example, the post-processing module 124 can identify regions of detected objects and determine areas of overlap. When two detected objects are determined to overlap by a minimum amount, the module 124 can remove the object that has the lower confidence score. As a result of the post-processing, the computer system 120 obtains a filtered list of detected objects, with their locations in the image data and their status classifications. In some implementations, the filtered list of detected objects can be provided as a JSON object with object and class keys. The JSON object can include an array of objects detected, and for each detected object, location data (e.g., a bounding region such as coordinates of two corners along a diagonal of a bounding box), a center location, a size or shape, etc.), status classification data, and one or more confidence scores.
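
For illustration only, the post-processing described in Zhang [0098] (removing low-confidence detections, then removing the lower-confidence object of an overlapping pair, then emitting a JSON object) could be sketched as follows. The field names (`label`, `status`, `confidence`, `box`), the 0.5 confidence threshold, the use of intersection-over-union as the overlap measure, and the 0.5 minimum overlap are assumptions of this sketch and are not taken from Zhang.

```python
import json

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def post_process(detections, min_confidence=0.5, min_overlap=0.5):
    """Filter detections by confidence, then resolve overlaps.

    Both thresholds are hypothetical values standing in for the rule set.
    """
    # Step 1: remove detections whose confidence is below the threshold.
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    # Step 2: visit detections from highest to lowest confidence and drop
    # any detection that overlaps an already-kept (higher-confidence) one.
    kept.sort(key=lambda d: d["confidence"], reverse=True)
    result = []
    for d in kept:
        if all(iou(d["box"], r["box"]) < min_overlap for r in result):
            result.append(d)
    return result

detections = [
    {"label": "table", "status": "dirty", "confidence": 0.9, "box": (0, 0, 10, 10)},
    {"label": "table", "status": "clean", "confidence": 0.6, "box": (1, 1, 10, 10)},
    {"label": "shelf", "status": "low-stock", "confidence": 0.3, "box": (20, 20, 30, 30)},
]
filtered = post_process(detections)
# Emit the filtered list as a JSON object, as Zhang [0098] describes in general terms.
print(json.dumps({"objects": filtered}))
```

In this example the low-confidence shelf detection is removed by the threshold, and the second table detection is removed because it overlaps the higher-confidence one.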

Zhang [0099] teaches the machine learning models 123 indicate conditions that are detected or predicted to be present in the monitored area. Even so, the computer system 120 can evaluate the confidence scores and other data to verify that the detection is accurate before the computer system 120 will consider a condition needing attention to be detected. In some implementations, the processing and evaluation of model outputs can be a primary or secondary way to detect conditions in the monitored area. For example, the models 123 can provide output scores indicative of different properties of the monitored area, e.g., level of litter present, cleanliness, occupancy, speed that people progress through a waiting line, etc. The computer system 120 can then evaluate whether these scores represent an issue that needs to be addressed by a user. This can be done by comparing scores to corresponding threshold or corresponding baseline values typical for the monitored area. For example, if a cleanliness score is below a predetermined level or if the level of litter present is above a predetermined level, and potentially has been at that level for at least a threshold minimum amount of time, the computer system 120 detects an issue to be addressed.
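
The threshold-plus-duration check described in Zhang [0099] can be illustrated with a minimal sketch. The function name, the cleanliness threshold of 0.4, and the 600-second minimum duration are hypothetical values chosen for the example; Zhang states only that a score is compared to a predetermined level, potentially for at least a threshold minimum amount of time.

```python
def issue_detected(samples, min_cleanliness=0.4, min_duration_s=600):
    """Return True if the score stays below the level for long enough.

    samples: list of (timestamp_seconds, cleanliness_score), oldest first.
    Threshold and duration are assumed values, not taken from Zhang.
    """
    below_since = None  # time at which the score first dropped below the level
    for t, score in samples:
        if score < min_cleanliness:
            if below_since is None:
                below_since = t
            if t - below_since >= min_duration_s:
                return True
        else:
            # Score recovered, so the persistence timer restarts.
            below_since = None
    return False
```

A brief dip that recovers before the minimum duration elapses does not raise an issue under this sketch; only a sustained low score does.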

Zhang [0149] teaches that evaluating outcome measures associated with images can help the system determine which inconsistencies or differences from baseline characteristics need attention from a worker. The system can also use the outcome metrics to determine the urgency or priority with which conditions should be addressed. For example, some changes from the typical or usual state may be benign, such as adding new furniture. This change may be visually quite different from the typical prior images and would appear to be an inconsistency from the area's desirable baseline state.

calculating a similarity score for the two or more images based on the two or more states; Zhang [0136] teaches the model can be configured to determine whether a monitoring image is different from or deviates from a typical or desired baseline state in a manner that requires attention... As a result, the models can be trained to distinguish types of variations in images that are within the normal or expected range of conditions (e.g., image data showing different arrangements of people and food around occupied tables) from items in images that show changes that need corrective action (e.g., litter on an unattended table)… although the model may not be trained to specifically identify a condition for a spilled drink on the floor, the model may nevertheless determine that an image showing a spill is different from the desired baseline state and can classify the image as needing action as a result.

and determining that the similarity score corresponding to the two or more images fails to meet a threshold similarity score, and performing an action based on the two or more states not being sufficiently similar comprises performing an action based on the similarity score failing to meet the threshold similarity score.  Zhang [0097] teaches the models 123 may be used to determine a classification for the image of the monitored area or a specific portion of the image. This can include determining a classification for one or more properties of the monitored area or specific objects. When determining a status or classification is discussed herein, the models may do so in any of various ways. One way is indicating a specific classification decision or selection of a classification. Another way is providing a set values that indicate the respective likelihood that the classification is appropriate (e.g., a score of 0.7 for a table being dirty, a score of 0.2 for the table being occupied, and a score of 0.1 for the table being clean and vacant). The models 123 may additionally or alternatively provide regression outputs, such as a value or score along a scale or range rather than a specific classification. For example, rather than classify a shelf among discrete classifications (e.g., empty, low-stock, medium-stock, or full), the models 123 may output a score along a range indicating the stock level (e.g., 53%) or give a score indicating a likelihood or urgency of checking or correcting the stock level (e.g., a 60% confidence score that the shelf should be restocked, or a priority score of 4 on a scale of 1 to 10 indicating the importance of checking the shelf). As another example, the models 123 can provide a score for the monitoring area as a whole or for individual objects or regions within the monitored area for different properties, e.g., occupancy, cleanliness, orderliness, etc. 
The computer system 120 can then use the scores for these different properties and compare them to thresholds or baseline levels for the properties for the monitored area to determine if the monitored area is in a condition that requires attention or intervention.

Zhang [0099] teaches the machine learning models 123 indicate conditions that are detected or predicted to be present in the monitored area. Even so, the computer system 120 can evaluate the confidence scores and other data to verify that the detection is accurate before the computer system 120 will consider a condition needing attention to be detected. In some implementations, the processing and evaluation of model outputs can be a primary or secondary way to detect conditions in the monitored area. For example, the models 123 can provide output scores indicative of different properties of the monitored area, e.g., level of litter present, cleanliness, occupancy, speed that people progress through a waiting line, etc. The computer system 120 can then evaluate whether these scores represent an issue that needs to be addressed by a user. This can be done by comparing scores to corresponding threshold or corresponding baseline values typical for the monitored area. For example, if a cleanliness score is below a predetermined level or if the level of litter present is above a predetermined level, and potentially has been at that level for at least a threshold minimum amount of time, the computer system 120 detects an issue to be addressed.

Zhang [0139] teaches the computer system generates thresholds, rules, and post-processing parameters. These can include rules used to specify actions to take in response to identifying specific conditions and thresholds for the confidence level for a certain condition before action is requested. For example, the condition of litter on a table may have a corresponding threshold of 80% set, so that if the confidence reaches or exceeds this level one or more users are informed of the condition. Similarly, a rule for the litter detected condition can be set to specify which action or actions for the system to take (e.g., generate a task), which users receive notifications, a time period in which the condition should be corrected (e.g., 20 minutes) before further actions are needed, etc. The rules, threshold, and post-processing parameters can be set based on user input, such as instructions provided to the system when the system is set up at a location. In addition, the system may adjust the thresholds and other parameters over time based on the situations observed. For example, if the system uses a threshold of 60% confidence for detecting a condition and users repeatedly dismiss the condition suggesting false positives are occurring in the detection, the system may increase the confidence level requirement (e.g., to a threshold of 70%) to improve the accuracy of results. In some implementations, the thresholds, rules, and post-processing parameters are determined based on image data or other data for the location or view where the model will be used.
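
The confidence threshold and its adjustment over time, as described in Zhang [0139], could be sketched as below. The 80%, 60%, and 70% figures come from the examples in Zhang; the function names, the count of three dismissals that triggers an adjustment, the 0.10 step, and the 0.95 ceiling are assumptions of this sketch.

```python
def should_notify(confidence, threshold):
    """Notify users when the confidence reaches or exceeds the threshold."""
    return confidence >= threshold

def adjust_threshold(threshold, dismissals, max_dismissals=3, step=0.10, ceiling=0.95):
    """Raise the threshold after repeated user dismissals.

    max_dismissals, step, and ceiling are hypothetical parameters.
    """
    if dismissals >= max_dismissals:
        # Rounding avoids floating-point noise such as 0.7000000000000001.
        return min(round(threshold + step, 2), ceiling)
    return threshold
```

For example, `should_notify(0.80, 0.80)` holds for the litter condition with an 80% threshold, and three dismissals at a 60% threshold would raise it to 70% under these assumed parameters.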

Zhang [0149] teaches that evaluating outcome measures associated with images can help the system determine which inconsistencies or differences from baseline characteristics need attention from a worker. The system can also use the outcome metrics to determine the urgency or priority with which conditions should be addressed. For example, some changes from the typical or usual state may be benign, such as adding new furniture. This change may be visually quite different from the typical prior images and would appear to be an inconsistency from the area's desirable baseline state.
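
To make the claim 3 limitation concrete, one possible reading of "calculating a similarity score for the two or more images based on the two or more states" and comparing it to a threshold is sketched below. Neither the claim nor Zhang specifies a formula; the pairwise-agreement measure, the function names, and the 0.8 threshold are assumptions introduced solely for illustration.

```python
from itertools import combinations

def similarity_score(states):
    """Fraction of image pairs whose selected states match (assumed measure)."""
    pairs = list(combinations(states, 2))
    if not pairs:
        return 1.0  # a single state is trivially consistent with itself
    return sum(a == b for a, b in pairs) / len(pairs)

def action_needed(states, threshold=0.8):
    """True when the similarity score fails to meet the threshold."""
    return similarity_score(states) < threshold
```

Under this sketch, the states `["clean", "dirty"]` yield a score of 0.0, which fails the assumed threshold and would trigger an action, while `["clean", "clean"]` yields 1.0 and would not.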

Claim 4. The method of claim 3, wherein selecting the state for the respective image from the one or more potential states comprises selecting, for the respective image, a state from the one or more potential states associated with the highest confidence.  
Zhang [0096] teaches to detect deviations from the baseline or desired states of the monitored area, the system 100 can capture example image data for the monitored area captured at different times and showing different situations that show normal or expected states (e.g., desirable or acceptable states) of the monitored area. This may include, for example, images of a restaurant seating area at many different times during normal use, so that images show different combinations of people, food, and other items at different positions. Images showing conditions that are not in the normal or expected range may also be used, as examples to represent conditions that are not typical and should be classified as such.

Zhang [0097] teaches the models 123 may be used to determine a classification for the image of the monitored area or a specific portion of the image. This can include determining a classification for one or more properties of the monitored area or specific objects. When determining a status or classification is discussed herein, the models may do so in any of various ways. One way is indicating a specific classification decision or selection of a classification. Another way is providing a set values that indicate the respective likelihood that the classification is appropriate (e.g., a score of 0.7 for a table being dirty, a score of 0.2 for the table being occupied, and a score of 0.1 for the table being clean and vacant). The models 123 may additionally or alternatively provide regression outputs, such as a value or score along a scale or range rather than a specific classification. For example, rather than classify a shelf among discrete classifications (e.g., empty, low-stock, medium-stock, or full), the models 123 may output a score along a range indicating the stock level (e.g., 53%) or give a score indicating a likelihood or urgency of checking or correcting the stock level (e.g., a 60% confidence score that the shelf should be restocked, or a priority score of 4 on a scale of 1 to 10 indicating the importance of checking the shelf). As another example, the models 123 can provide a score for the monitoring area as a whole or for individual objects or regions within the monitored area for different properties, e.g., occupancy, cleanliness, orderliness, etc. The computer system 120 can then use the scores for these different properties and compare them to thresholds or baseline levels for the properties for the monitored area to determine if the monitored area is in a condition that requires attention or intervention.

Zhang [0099] teaches the machine learning models 123 indicate conditions that are detected or predicted to be present in the monitored area. Even so, the computer system 120 can evaluate the confidence scores and other data to verify that the detection is accurate before the computer system 120 will consider a condition needing attention to be detected. In some implementations, the processing and evaluation of model outputs can be a primary or secondary way to detect conditions in the monitored area. For example, the models 123 can provide output scores indicative of different properties of the monitored area, e.g., level of litter present, cleanliness, occupancy, speed that people progress through a waiting line, etc. The computer system 120 can then evaluate whether these scores represent an issue that needs to be addressed by a user. This can be done by comparing scores to corresponding threshold or corresponding baseline values typical for the monitored area. For example, if a cleanliness score is below a predetermined level or if the level of litter present is above a predetermined level, and potentially has been at that level for at least a threshold minimum amount of time, the computer system 120 detects an issue to be addressed.

Zhang [0131] teaches the image data can be labelled with the corresponding conditions represented in the image data (step 506). This labeling facilitates training of machine learning models, especially for supervised training. The image data can be labeled as representing or corresponding to any of the classes (e.g., classifications), conditions, status or states, or other items determined in step 504. This can include indicating specific regions or portions of an image that the labeled condition refers to. For example, the labeling can indicate not only that litter is present in the overall monitored area in the entirety of an image, but that the litter is present at a specific portion of the image, such as a specific table.

Zhang [0154] teaches the output can indicate classifications made or scores for different classifications. For example, the output can include a classification of a table as having litter present and/or a score indicating a likelihood that the classification is correct, such as a confidence score for the classification decision.
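
Selecting the state associated with the highest confidence, as recited in claim 4, can be illustrated using the example likelihood scores given in Zhang [0097]. The function name and the dictionary representation are assumptions of this sketch, and the input is assumed to be non-empty.

```python
def select_state(scores):
    """Return the state with the highest confidence score."""
    return max(scores, key=scores.get)

# Example likelihoods from Zhang [0097]; the key names are illustrative.
scores = {"dirty": 0.7, "occupied": 0.2, "clean_vacant": 0.1}
selected = select_state(scores)
```

Here `selected` is `"dirty"`, the classification with the 0.7 score.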

Claim 5. The method of claim 3, wherein selecting the state for the respective image from the one or more potential states comprises identifying, for the respective image, a state from the one or more potential states associated with a confidence that meets a threshold confidence.  Zhang [0096] teaches to detect deviations from the baseline or desired states of the monitored area, the system 100 can capture example image data for the monitored area captured at different times and showing different situations that show normal or expected states (e.g., desirable or acceptable states) of the monitored area. This may include, for example, images of a restaurant seating area at many different times during normal use, so that images show different combinations of people, food, and other items at different positions. Images showing conditions that are not in the normal or expected range may also be used, as examples to represent conditions that are not typical and should be classified as such.

Zhang [0097] teaches the models 123 may be used to determine a classification for the image of the monitored area or a specific portion of the image. This can include determining a classification for one or more properties of the monitored area or specific objects. When determining a status or classification is discussed herein, the models may do so in any of various ways. One way is indicating a specific classification decision or selection of a classification. Another way is providing a set values that indicate the respective likelihood that the classification is appropriate (e.g., a score of 0.7 for a table being dirty, a score of 0.2 for the table being occupied, and a score of 0.1 for the table being clean and vacant). The models 123 may additionally or alternatively provide regression outputs, such as a value or score along a scale or range rather than a specific classification. For example, rather than classify a shelf among discrete classifications (e.g., empty, low-stock, medium-stock, or full), the models 123 may output a score along a range indicating the stock level (e.g., 53%) or give a score indicating a likelihood or urgency of checking or correcting the stock level (e.g., a 60% confidence score that the shelf should be restocked, or a priority score of 4 on a scale of 1 to 10 indicating the importance of checking the shelf). As another example, the models 123 can provide a score for the monitoring area as a whole or for individual objects or regions within the monitored area for different properties, e.g., occupancy, cleanliness, orderliness, etc. The computer system 120 can then use the scores for these different properties and compare them to thresholds or baseline levels for the properties for the monitored area to determine if the monitored area is in a condition that requires attention or intervention.

Zhang [0099] teaches the machine learning models 123 indicate conditions that are detected or predicted to be present in the monitored area. Even so, the computer system 120 can evaluate the confidence scores and other data to verify that the detection is accurate before the computer system 120 will consider a condition needing attention to be detected. In some implementations, the processing and evaluation of model outputs can be a primary or secondary way to detect conditions in the monitored area. For example, the models 123 can provide output scores indicative of different properties of the monitored area, e.g., level of litter present, cleanliness, occupancy, speed that people progress through a waiting line, etc. The computer system 120 can then evaluate whether these scores represent an issue that needs to be addressed by a user. This can be done by comparing scores to corresponding threshold or corresponding baseline values typical for the monitored area. For example, if a cleanliness score is below a predetermined level or if the level of litter present is above a predetermined level, and potentially has been at that level for at least a threshold minimum amount of time, the computer system 120 detects an issue to be addressed.

Zhang [0131] teaches the image data can be labelled with the corresponding conditions represented in the image data (step 506). This labeling facilitates training of machine learning models, especially for supervised training. The image data can be labeled as representing or corresponding to any of the classes (e.g., classifications), conditions, status or states, or other items determined in step 504. This can include indicating specific regions or portions of an image that the labeled condition refers to. For example, the labeling can indicate not only that litter is present in the overall monitored area in the entirety of an image, but that the litter is present at a specific portion of the image, such as a specific table.

Zhang [0154] teaches the output can indicate classifications made or scores for different classifications. For example, the output can include a classification of a table as having litter present and/or a score indicating a likelihood that the classification is correct, such as a confidence score for the classification decision.

Claim 6. The method of claim 3, wherein determining that no state can be identified with sufficient confidence for the respective image comprises determining, for the respective image, that none of the confidences associated with the one or more potential states meet a threshold confidence.  Zhang [0096] teaches to detect deviations from the baseline or desired states of the monitored area, the system 100 can capture example image data for the monitored area captured at different times and showing different situations that show normal or expected states (e.g., desirable or acceptable states) of the monitored area. This may include, for example, images of a restaurant seating area at many different times during normal use, so that images show different combinations of people, food, and other items at different positions. Images showing conditions that are not in the normal or expected range may also be used, as examples to represent conditions that are not typical and should be classified as such.

Zhang [0131] teaches the image data can be labelled with the corresponding conditions represented in the image data (step 506). This labeling facilitates training of machine learning models, especially for supervised training. The image data can be labeled as representing or corresponding to any of the classes (e.g., classifications), conditions, status or states, or other items determined in step 504. This can include indicating specific regions or portions of an image that the labeled condition refers to. For example, the labeling can indicate not only that litter is present in the overall monitored area in the entirety of an image, but that the litter is present at a specific portion of the image, such as a specific table.

Zhang [0136] teaches even without labeling specific conditions, models can be trained to detect the type of conditions that are inconsistent from or incompatible with a range of different acceptable conditions. This can include training the models to recognize the states and conditions that represent the range of patterns and configurations of the monitored area that are acceptable and do not need action. This can be done, for example, by training a neural network model based on image examples of acceptable and unacceptable states of the monitored area, without necessarily identifying the specific region or type of condition that cause one condition to need attention. Further, with training regarding the typical or baseline range of variations, changes outside the scope of these changes can be identified as needing attention, even if it is a new situation not shown in the training data.
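
For illustration only, the threshold-confidence limitations of claims 5 and 6 can be sketched as follows. This is a minimal sketch and not the method of the application or of Zhang: the function name select_state, the threshold value of 0.6, and the tie-breaking choice of the highest-scoring state are assumptions introduced here. The example scores are those given in Zhang [0097].

```python
from typing import Dict, Optional


def select_state(confidences: Dict[str, float], threshold: float = 0.6) -> Optional[str]:
    """Return a state whose confidence meets the threshold, or None.

    The name, the default threshold, and picking the highest-scoring
    eligible state are illustrative assumptions, not taken from the record.
    """
    # Keep only the potential states whose confidence meets the threshold.
    eligible = {state: conf for state, conf in confidences.items() if conf >= threshold}
    if not eligible:
        # Claim 6 situation: none of the confidences meet the threshold.
        return None
    # Claim 5 situation: identify a state that meets the threshold.
    return max(eligible, key=eligible.get)


# Scores from Zhang [0097]: dirty 0.7, occupied 0.2, clean and vacant 0.1.
zhang_scores = {"dirty": 0.7, "occupied": 0.2, "clean_vacant": 0.1}
chosen = select_state(zhang_scores)

# A hypothetical image for which no state is identified with enough confidence.
ambiguous = select_state({"dirty": 0.4, "occupied": 0.35, "clean_vacant": 0.25})
```

Returning None rather than a low-confidence guess keeps the two outcomes of claims 5 and 6 distinct for whatever step consumes the result.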

Claim 9. The method of claim 2, wherein performing the action based on the two or more states not being sufficiently similar comprises obtaining external data, and the method comprising determining a current state of the area of the property using the external data. Zhang [0011] teaches the system can store and provide images of the monitored area at different times corresponding to the records, such as images of the area at the time the condition was detected, at one or more subsequent times (such as after notification to address the condition has been issued),

Zhang [0012] teaches the output indicating (i) one or more status classifications for the monitored area and a respective location for each of the one or more status classifications, or (ii) whether the image data shows a state that is inconsistent with normal or expected states of the monitored area;

Zhang [0022] teaches obtaining second image data from the camera representing a second image of the monitored area; processing the second image data using the one or more machine learning models to detect conditions present in the monitored area, and based on processing the second image data: determining that the task has been completed based on determining that detected condition is not detected based on the second image data; or determining that the task has not been completed based on determining that the detected condition is detected based on the second image data.

Zhang [0093] teaches the machine learning models 123 can learn, from example image data, what properties or conditions are typical for a monitored area, so that the models 123 can detect when a condition occurs that differs from or is inconsistent with the typical conditions or properties of the monitored area

Zhang [0095] teaches the functionality to detect the overall image as a whole not being representative of the desired range of states for the location can be one of the ways that the system 100 can detect new conditions that were not observed during training and were not in any predefined set of classes. For example, if a table is tipped over, the training data may not have specifically shown that condition or given a status label for specifying that condition. Similarly, spills of food or drinks may have different locations, shapes, sizes, and colors that are not easy to predict or recognize. Nevertheless, for these types of conditions, the model 123 may still detect that the state of the monitored area is not in the expected state or range of variation that encompasses normal operation (and potentially expected changes, such as increased traffic, etc.), and may thus classify the image as representing a condition of the monitored area that needs attention.

Zhang [0136] teaches the model can be configured to determine whether a monitoring image is different from or deviates from a typical or desired baseline state in a manner that requires attention... As a result, the models can be trained to distinguish types of variations in images that are within the normal or expected range of conditions (e.g., image data showing different arrangements of people and food around occupied tables) from items in images that show changes that need corrective action (e.g., litter on an unattended table).

Claim 19. It differs from claim 1 in that it is a system performing a method of claim 1. Therefore claim 19 has been reviewed and analyzed in the same way as claim 1. See the above analysis. 

Claim 20. It differs from claim 1 in that it is one or more non-transitory computer-readable media storing instructions that, when executed by one or more computers, cause the one or more computers to perform a method of claim 1. Therefore claim 20 has been reviewed and analyzed in the same way as claim 1. See the above analysis.
 
Claim(s) 7-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang in view of US 2020/0341436 A1 to Saxena et al., hereinafter, “Saxena”.
Claim 7. Zhang is silent on claim 7, however, Saxena, in the field of monitoring an environment, teaches wherein calculating the similarity score for the two or more images comprises calculating a similarity score that indicates extent of state matches between the two or more states corresponding to the two or more images. Saxena [0072] teaches at least one of the plurality of inputs indicates a history of: states, outputs, output types, settings, and/or configurations. For example, the input identifies one or more previously determined outputs identifying controllable device property settings. In some embodiments, one or more of the inputs states are associated with a measure of confidence that the corresponding state is the correct state. For example, it may be difficult to determine the exact correct state based on received sensor information and candidate states are identified along with associated confidence score values.

Thus, at the time of the invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Zhang with Saxena [0001] to allow the user to be aware of where devices are specifically located, the capability/range of each device, and how each device will specifically impact the surrounding environment as it relates to a desired result.

Claim 8. Saxena further teaches wherein calculating the similarity score for the two or more images comprises: determining a highest number of state matches for a particular state in the two or more states; and calculating a similarity score using the highest number of state matches, wherein the similarity score is indicative of a comparison between the highest number of state matches and a total number of states in the two or more states. Saxena [0072] teaches at least one of the plurality of inputs indicates a history of: states, outputs, output types, settings, and/or configurations. For example, the input identifies one or more previously determined outputs identifying controllable device property settings. In some embodiments, one or more of the inputs states are associated with a measure of confidence that the corresponding state is the correct state. For example, it may be difficult to determine the exact correct state based on received sensor information and candidate states are identified along with associated confidence score values.
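
For illustration only, the similarity score recited in claim 8 can be sketched as follows. The claim requires only that the score be "indicative of a comparison" between the highest number of state matches and the total number of states; expressing that comparison as a ratio is an assumption made here, as are the function name similarity_score and the example state labels.

```python
from collections import Counter
from typing import Sequence


def similarity_score(states: Sequence[str]) -> float:
    """Compare the highest number of state matches to the total number of states.

    The ratio form is an illustrative assumption; the claim language does
    not fix a particular formula.
    """
    if not states:
        raise ValueError("at least one state is required")
    # Highest number of matches for any particular state among the inputs.
    highest_matches = max(Counter(states).values())
    # Total number of states, one per image.
    return highest_matches / len(states)


# Hypothetical states determined for three images of the same area.
score_mixed = similarity_score(["open", "open", "closed"])
score_agree = similarity_score(["open", "open", "open"])
```

Under this assumed formula, a score of 1.0 means every image yielded the same state, and lower scores mean the per-image states disagree.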

Claim(s) 10, 14 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang in view of US 10755543 B1 to Usie. 
Claim 10. Zhang is silent on claim 10, however, Usie, in the field of monitoring an environment, teaches wherein: obtaining external data comprises: in response to the two or more states not being sufficiently similar, generating a request that queries a user to select the current state from among a list of potential states of the area of the property, or that queries the user to input the current state; transmitting the request to a user device of the user; and receiving a response to the request from the user device, the response indicating the current state of the area property, and determining the current state of the area of the property using the external data comprises determining the current state of the area of the property using the response.  Usie [col. 4, lines 31-48] teaches a bridge device generates event information corresponding to a signal based alarm of a sensor generated by an alarm panel by sensing a motion through an image of a camera and generates the event information in the same format as alarm information related to the alarm and transmits the generated alarm information to a monitoring server to support an administrator of the monitoring server to verify an image based event of the camera associated with occurrence of the alarm of a sensor in a format of the alarm panel and supports the administrator to easily verify the event related image associated with the alarm of the sensor by distinguishing the event related image from another image and allows the event related image to be transferred to a user and supports the administrator or the user to clearly verify whether an error occurs in the alarm of the sensor through the transferred event related image to support to cope only with an alarm of a normal sensor.

Usie [col. 9, lines 3-21] teaches the monitoring server 300 may generate the event notification information including the link information received by matching the event notification information with the event information from the bridge device 100 to the event notification information and transmit the event notification information to the user terminal 10 and the user terminal 10 may access the temporary storage image stored in the cloud server 400 based on the link information and receive and display the temporary storage image.

Usie [col. 9, lines 31-48] teaches the monitoring server 300 may receive response information from the user terminal 10 in response to the event notification information and when it is determined that the user determines that the event occurs based on the response information (when it is determined that the alarm of the alarm panel 200 is the normal alarm), the monitoring server 300 may transmit the event notification information to a server of a public institution of such as a predetermined police and notify event occurrence.

Usie [col. 13, lines 24-35] teaches the user terminal 10 that receives the event notification information may receive and display the temporary storage image by accessing the cloud server 400 based on the event notification information and the user terminal 10 may generate response information for requesting the dispatch of the police or disregarding of the corresponding event to the monitoring server 300 according to an input of the user that verifies the event related image depending on sensing of the sensor and the camera corresponding to the event notification information and transmit the generated response information to the monitoring server 300.

Thus, at the time of the invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Zhang with Usie [col 2, lines 27-40] to increase convenience and efficiency for a security by supporting an alarm generated by a sensor and an event generated based on an image of a camera to match each other in the same monitoring area while reducing a cost burden by supporting constructing a separate system for security monitoring based on the image of the camera or supporting a separate education for an administrator not to be performed by supporting the administrator of a monitoring center to easily verify an event related image corresponding to alarm information of an alarm panel by supporting event related event information the monitoring center

Claim 14. Usie further teaches comprising: determining a prior state for the area of the property; determining that the prior state does not match the current state; and based on the prior state not matching the current state, notifying a user device of a change in the state of the area of the property. Usie [Abstract] teaches disclosed is a bridge device supporting an alarm format, which transmits an event which occurs based on an image of a camera that photographs a monitoring area of an alarm panel in an alarm format according to an alarm format of an alarm panel transmitting alarm information in a predetermined alarm format when sensing by a sensor in the monitoring area in the alarm format

Usie [col. 4, lines 31-48] teaches a bridge device generates event information corresponding to a signal based alarm of a sensor generated by an alarm panel by sensing a motion through an image of a camera and generates the event information in the same format as alarm information related to the alarm and transmits the generated alarm information to a monitoring server to support an administrator of the monitoring server to verify an image based event of the camera associated with occurrence of the alarm of a sensor in a format of the alarm panel and supports the administrator to easily verify the event related image associated with the alarm of the sensor by distinguishing the event related image from another image and allows the event related image to be transferred to a user and supports the administrator or the user to clearly verify whether an error occurs in the alarm of the sensor through the transferred event related image to support to cope only with an alarm of a normal sensor. Examiner interprets “sensing the motion” to be the change (prior and current) in state.

Usie [col. 9, lines 3-21] teaches the monitoring server 300 may generate the event notification information including the link information received by matching the event notification information with the event information from the bridge device 100 to the event notification information and transmit the event notification information to the user terminal 10 and the user terminal 10 may access the temporary storage image stored in the cloud server 400 based on the link information and receive and display the temporary storage image.

Usie [col. 9, lines 31-48] teaches the monitoring server 300 may receive response information from the user terminal 10 in response to the event notification information and when it is determined that the user determines that the event occurs based on the response information (when it is determined that the alarm of the alarm panel 200 is the normal alarm), the monitoring server 300 may transmit the event notification information to a server of a public institution of such as a predetermined police and notify event occurrence.

Usie [col. 13, lines 24-35] teaches the user terminal 10 that receives the event notification information may receive and display the temporary storage image by accessing the cloud server 400 based on the event notification information and the user terminal 10 may generate response information for requesting the dispatch of the police or disregarding of the corresponding event to the monitoring server 300 according to an input of the user that verifies the event related image depending on sensing of the sensor and the camera corresponding to the event notification information and transmit the generated response information to the monitoring server 300.

Claim 17. Usie further teaches wherein: determining the current state of the area of the property comprises determining a device is in a first state, determining that the prior state for the area of the property comprises determining that device was previously in a second state different from the first state, Usie [col. 9, lines 3-21] teaches the monitoring server 300 may generate the event notification information including the link information received by matching the event notification information with the event information from the bridge device 100 to the event notification information and transmit the event notification information to the user terminal 10 and the user terminal 10 may access the temporary storage image stored in the cloud server 400 based on the link information and receive and display the temporary storage image.

and notifying the user device of the change in the state of the area of the property comprises transmitting a notice to the user device indicating at least one of the following: a change in state of the area of the property has occurred; the device was previously in the second state; and the device is currently in the first state. Usie [col. 9, lines 31-48] teaches the monitoring server 300 may receive response information from the user terminal 10 in response to the event notification information and when it is determined that the user determines that the event occurs based on the response information (when it is determined that the alarm of the alarm panel 200 is the normal alarm), the monitoring server 300 may transmit the event notification information to a server of a public institution of such as a predetermined police and notify event occurrence.

Usie [col. 13, lines 24-35] teaches the user terminal 10 that receives the event notification information may receive and display the temporary storage image by accessing the cloud server 400 based on the event notification information and the user terminal 10 may generate response information for requesting the dispatch of the police or disregarding of the corresponding event to the monitoring server 300 according to an input of the user that verifies the event related image depending on sensing of the sensor and the camera corresponding to the event notification information and transmit the generated response information to the monitoring server 300.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang in view of US 10755543 B1 to Usie and in further view of US 2020/0341436 A1 to Saxena et al., hereinafter, “Saxena”.
Claim 11. Zhang and Usie are silent on claim 11, however, Saxena, in the field of monitoring an environment, teaches wherein: generating the request that queries the user to select the current state from among the list of potential states of the area of the property comprises generating a request that queries the user to select from each unique state in the two or more states, or each unique state in the two or more states that is associated with a confidence that meets a threshold confidence, receiving the response to the request from the user device comprises receiving a response that includes a selection of a first unique state in the two or more states Saxena [0038] teaches in response to detecting an event, an output response (e.g., sound, user alert, light alert, wearable alert, etc.) is generated. For example, when it is detected that a stove is left on beyond a threshold amount of time and/or when no human subject is detected as present, the stove is automatically turned off and/or an output alert is generated. In another example, when a water leak is detected, an automatic output alert is sent and a water valve is automatically turned off. In another example, when it is detected that a person has fallen and further movement of the person is not detected, an alert to an emergency contact person and/or an emergency authority is automatically sent. In another example, if a presence of a person is detected in a living room during morning times, curtains are automatically opened. In another example, if it is detected that humidity is above a threshold value, a fan is automatically turned on. In another example, a humidifier is automatically switched on/off to maintain a preferred humidity. In another example, when a learned preferred time is reached, a coffee maker is automatically turned on. In another example, a dishwasher is automatically scheduled to be operated at a time when energy rates are relatively lower. 
In another example, light intensity is automatically adjusted based on a time of day (e.g., lights turned on a lower intensity when a subject wakes up in the middle of the night to use the bathroom.). In another example, music is automatically turned on when it is detected that a subject is eating dinner. In another example, when it is detected that ambient temperature and humidity are above threshold values and a subject is detected as sitting, a fan is automatically turned on.

Saxena [0051] teaches at 206, one or more automation rules are discovered based on the identified state(s). For example, once it has been observed that an identified state is correlated with a certain controllable device state/status/action, a rule that places the controllable device into the associated state/status/action when the associated state is detected is created. Correlation between a determined state and a detected controllable device state/status/action may be identified and once the correlation reaches a threshold, an automation rule is dynamically created and/or updated. In some embodiments, a correlation between a group of states and/or a range of state values with a controllable device state/status/action is identified and utilized to generate an automation rule. In some embodiments, the probability measure of each state may be utilized when determining the correlations and/or automation rules. In some embodiments, a history of determined states and associated probability values and co-occurring controllable device states/status/actions over time are stored and analyzed using machine learning (e.g., statistical and/or deep learning) to discover correlations. In the event a measure of correlation is above a threshold value, a corresponding automation rule may be created/updated. In some embodiments, automation rules are continually added/updated based on new correlations that are discovered.

Saxena [0060] teaches a single state includes a plurality of sub states. In some embodiments, each state includes an identifier of a subject, a coarse location of the subject (e.g., which room of a house/building), a specific location of the subject within the coarse location (e.g., on the bed of a bedroom), whether the subject is present within an environment, a type of the subject (e.g., human vs. pet, specific individual, etc.), a coarse activity of the subject (e.g., reading), and the specific activity of a subject (e.g., opening a book). In some embodiments, each candidate state includes a state of a controllable object. In some embodiments, an activity state of a subject is one of predefined activities that can be detected (e.g., detected based on observed/training data). 

Saxena [0063] teaches at 316, for each of the candidate states, a likelihood that the candidate state is the next state after a previously identified state is determined. For example, a probability that the candidate state is the actual state after a previously determined state of a subject is determined. In some embodiments, this likelihood is determined using machine learning. For example, statistical and/or deep learning processing has been utilized to analyze observed state transitions between different states to determine a transition model of probabilities for each potential candidate state given a previous state. In one example, a motion detector sensor has been installed in each room of a house. The relative locations of the rooms of the house may be automatically determined by using machine learning to observe the pattern of sensor triggers as subjects move from one room to another room. Once the connections between the rooms are known, given a current room location of a subject, the possible adjoining rooms are known and each likelihood that the subject will visit a next room of the possible connected rooms may be determined. For example, given the previous state that indicates a location of a subject, the next state is limited to adjoining rooms that are reachable given the determined/observed rate of movement of the subject and elapsed time between the sensor data of the states. In some embodiments, the likelihood that the candidate state is the next state is determined using the graph model.

Saxena [0065] teaches at 320, for each of the candidate states, an overall likelihood that the candidate state is the actual state is determined. For example, for each candidate state, the overall probability that the candidate state is the correct state of a subject is determined. In some embodiments, determining the overall state includes multiplying together one or more of the probabilities determined in 314, 316, and 318. For example, at least a first probability that the candidate state corresponds to a received sensor data, and a second probability that the candidate state is the next state after a previously identified state are multiplied together to obtain the overall likelihood. In some embodiments, the candidate states are sorted based on their overall likelihoods and the candidate state with the best overall likelihood is selected as the actual/correct state.
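
For illustration only, the selection described in Saxena [0065] (multiplying per-candidate probabilities, sorting, and selecting the candidate with the best overall likelihood) can be sketched as follows. This is a minimal sketch under stated assumptions: only two of the probabilities are combined, and the function name best_candidate, the room names, and the numeric values are hypothetical and do not come from Saxena.

```python
from typing import Dict, Tuple


def best_candidate(sensor_prob: Dict[str, float],
                   transition_prob: Dict[str, float]) -> Tuple[str, float]:
    """Return the candidate state with the highest overall likelihood.

    sensor_prob: probability that each candidate matches the sensor data.
    transition_prob: probability that each candidate follows the prior state.
    Both inputs are hypothetical; a candidate missing from transition_prob
    is treated as having probability 0.0 (an assumption of this sketch).
    """
    if not sensor_prob:
        raise ValueError("at least one candidate state is required")
    # Multiply the probabilities together to obtain the overall likelihood.
    overall = {
        state: p * transition_prob.get(state, 0.0)
        for state, p in sensor_prob.items()
    }
    # Sort candidates by overall likelihood, best first, and select the top one.
    ranked = sorted(overall.items(), key=lambda item: item[1], reverse=True)
    return ranked[0]


# Hypothetical candidate locations for a subject.
state, likelihood = best_candidate(
    {"kitchen": 0.5, "bedroom": 0.5},
    {"kitchen": 0.25, "bedroom": 0.75},
)
```

In this example the sensor data alone cannot separate the two candidates, and the transition probability decides the selection.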

Saxena [0069] teaches at 406, in the event a historical probability meets a threshold, an associated automation rule is created. For example, if a historical probability determined in 402 and/or a cluster probability determined in 404 is greater than a threshold value (e.g., 80%), a corresponding automation rule is stored in a rule database. In some embodiments, the automation rule identifies that if an identified state (e.g., included in the cluster of identified states) is detected, the corresponding controllable property setting is to be recreated/implemented (e.g., property of corresponding controllable device(s) modified to be the rule specified in controllable device property setting). In some embodiments, the automation rule is updated periodically. For example, the automation rule is associated with an expiration time and the rule is to be renewed or deleted upon expiration. In some embodiments, creating the associated automation rule includes setting or modifying one or more optimization evaluation values identifying a relationship between an identified state and a corresponding controllable device property. For example, a measure of user desirability of the corresponding particular controllable device property setting given the identified state is determined and saved for use when utilizing an optimization function to automatically determine and control the controllable device.

Saxena [0071] teaches at 502, a plurality of inputs that indicate states is received. For example, each input identifies an associated state of the input. In some embodiments, the received inputs correspond to automatically controlling a controllable device. For example, the states of the inputs are to be analyzed to determine a new control setting and an action is to be potentially performed to implement the new control setting on the controllable device. In some embodiments, the received inputs include one or more states identified in 204 of FIG. 2. In some embodiments, the received inputs include one or more states (e.g., vector of states) identified using the process of FIG. 3. Each received input may indicate corresponding states of one or more subjects (e.g., user location, user activity, user preference, user type, user profile, user category, user proficiency level, user knowledge level, user knowledge of system feature, etc.), devices (e.g., sensor data, controllable device setting, device configuration, etc.), and/or environments (e.g., time, date, weather, humidity, air quality, geographical location, etc.). In some embodiments, the received states include previous states (e.g., history of previous states of subjects, devices, sensors, etc.). In some embodiments, the received states include a state identifying a previous user provided indication received in response to an inquiry or indication of a user interactive qualification. In some embodiments, one or more of the inputs that indicate the states were determined by analyzing received sensor information. For example, sensor information from one or more devices of devices 102 of FIG. 1A and/or sensors 124 of FIG. 1B is received and utilized to determine the states. In some embodiments, at least one of the plurality of inputs indicates a scene mode. For example, various scene modes (e.g., breakfast scene, sleeping scene, reading scene, movie scene, away mode, etc.) define a desired group of controllable device settings for a particular activity or status of one or more users. The scene mode may be manually specified by a user and/or automatically determined.

Saxena [0072] teaches at least one of the plurality of inputs indicates a history of: states, outputs, output types, settings, and/or configurations. For example, the input identifies one or more previously determined outputs identifying controllable device property settings. In some embodiments, one or more of the inputs states are associated with a measure of confidence that the corresponding state is the correct state. For example, it may be difficult to determine the exact correct state based on received sensor information and candidate states are identified along with associated confidence score values.

Saxena [0073] teaches receiving the inputs includes selecting inputs to be utilized among a larger group of possible inputs. For example, it is determined which inputs are relevant in automatically determining an output control setting for a property of a controllable device. The selection of inputs to be utilized may have been preconfigured (e.g., determined based on a predefined configuration for a particular controllable property of a controllable device). For example, a programmer has manually configured which inputs are to be analyzed when automatically determining a particular output (e.g., control setting of a controllable device). In some embodiments, the selection of inputs to be utilized is based at least in part on an automation rule that was created in 406 of FIG. 4. In some embodiments, the selection of inputs to be utilized is based at least in part on locations of devices associated with the inputs relative to a controllable device. For example, based on a mapping of physical locations of devices/sensors within an installation environment (e.g., determined using a graph model), states of a device that are located in the same room and/or physically near the controllable device to be controlled are automatically selected as inputs. In various embodiments, the selection of inputs to be utilized is automatically determined based on machine learning.

and determining the current state of the area of the property using the response comprises determining that the current state of the area of the property is the first unique state of the two or more states. Saxena [0114] teaches in one illustrative example of the process of FIG. 6, in order to determine whether to automatically open the living room curtains that are motorized by a controllable network connected device motor, the candidate configuration settings of a curtain open setting and a curtain close setting are evaluated separately. The received input state indicates that there is a 70% chance that a user is in the living room (e.g., 70% chance of living room state) and a 30% chance that the user is in the bedroom (e.g., 30% chance of bedroom state). When the curtain open setting is evaluated, component evaluation values of −1 for the living room state and −11 for the bedroom state are combined to determine a combined evaluation value of −4 (e.g., (0.7)*(−1)+(0.3)*(−11)). For the living room state, this combined evaluation value of −4 is scaled/adjusted for each of three different output handling types to identify the total evaluation value of −10 for output handling type 1, −5 for output handling type 2, and −1 for output handling type 3. For the bedroom state, the combined evaluation value of −4 is scaled/adjusted for each of the three different output handling types to identify the total evaluation value of −20 for output handling type 1, −20 for output handling type 2, and −20 for output handling type 3. After combining the total evaluation values for the different states, the combined total evaluation values of −13 (e.g., (0.7)(−10)+(0.3)(−20)) for output handling type 1, −9.5 (e.g., (0.7)(−5)+(0.3)(−20)) for output handling type 2, and −6.7 (e.g., (0.7)(−1)+(0.3)(−20)) for output handling type 3 result for the curtain open setting. 
Applying a similar process to the curtain close setting, combined example total evaluation values of −21 for output handling type 1, −12 for output handling type 2, and −10 for output handling type 3 result for the curtain close setting. A cost optimization function to be utilized specifies that the configuration setting candidate with the lowest total evaluation value is to be selected and the curtain open setting with output handling type 3 (i.e., −6.7 is the lowest value) is selected as the selected output candidate to be implemented because it had the lowest total evaluation value.
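
For illustration only, the arithmetic of the Saxena [0114] example can be reproduced with the short sketch below. The function and variable names are hypothetical. The cited passage calls −6.7 the "lowest" value although it is numerically the value closest to zero, so the sketch assumes that "lowest" means smallest magnitude; that reading is an assumption and is not stated in the reference.

```python
def expected_value(state_probs, values_by_state):
    """Probability-weighted sum of per-state evaluation values."""
    return sum(p * values_by_state[state] for state, p in state_probs.items())


# Input state probabilities from the Saxena [0114] example.
state_probs = {"living_room": 0.7, "bedroom": 0.3}

# Combined evaluation value for the curtain open setting (about -4).
combined_open = expected_value(state_probs, {"living_room": -1, "bedroom": -11})

# Per-state totals for each output handling type, curtain open setting.
open_by_type = {
    1: {"living_room": -10, "bedroom": -20},
    2: {"living_room": -5, "bedroom": -20},
    3: {"living_room": -1, "bedroom": -20},
}
totals = {
    ("open", t): expected_value(state_probs, values)
    for t, values in open_by_type.items()
}

# Totals for the curtain close setting are given directly in the example.
totals.update({("close", 1): -21, ("close", 2): -12, ("close", 3): -10})

# ASSUMPTION: "lowest" is read as smallest magnitude (lowest cost), which
# reproduces the selection of curtain open with output handling type 3.
selected = min(totals, key=lambda key: abs(totals[key]))
```

With the figures quoted above, the weighted sums come out to approximately −13, −9.5, and −6.7 for the curtain open setting, and the selection falls on the curtain open setting with output handling type 3, consistent with the cited passage.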

Saxena [0154] teaches at 1104, a graph model of the environment is applied to identify nodes of the graph model that are related to the command. For example, the graph model specifies relationships between devices/sensors as well as physical regions and this relationship information is utilized to identify the related devices/sensors as well as their command related relationships. In some embodiments, the identified nodes are a subset of all nodes of the graph model. In some embodiments, the identifying the nodes includes identifying a node directly associated with the command (e.g., node corresponding to a device specified by the command) and then determining nodes that are connected to the identified node with a relationship that is associated with the command. 

Thus, at the time of the invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Zhang with those of Saxena [0001] to allow the user to be aware of where devices are specifically located, the capability/range of each device, and how each device will specifically impact the surrounding environment as it relates to a desired result. 

Claim(s) 12-13 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang in view of US 20200341436 B1 to Saxena et al., hereinafter, “Saxena”. 
Claim 12. Zhang is silent on claim 12, however, Saxena, in the field of monitoring an environment, teaches wherein: obtaining external data comprises obtaining sensor data from one or more electronic devices, and determining the current state of the area of the property using the external data comprises verifying a particular state of the two or more states as the current state of the area of the property using the sensor data. Saxena [0030] teaches machine learning (e.g., local and/or cloud-based) is utilized to integrate input from many different devices/sensors (e.g., devices 102) to build a unique model of a user's presence, activities, and behavior. For example, an environment such as home is installed with devices 102 that can be controlled remotely or locally. The sensors of devices 102 may provide data about the presence and motion of people in an environment, measurements of the environmental properties such as light, temperature and humidity of the house, motion of subjects, and video of different locations of the environment. The machine learning may take into account the graph model to account for locational, functional, and behavior properties of the devices/sensors and their data during the learning process.

Saxena [0065] teaches at 320, for each of the candidate states, an overall likelihood that the candidate state is the actual state is determined. For example, for each candidate state, the overall probability that the candidate state is the correct state of a subject is determined. In some embodiments, determining the overall state includes multiplying together one or more of the probabilities determined in 314, 316, and 318. For example, at least a first probability that the candidate state corresponds to a received sensor data, and a second probability that the candidate state is the next state after a previously identified state are multiplied together to obtain the overall likelihood. In some embodiments, the candidate states are sorted based on their overall likelihoods and the candidate state with the best overall likelihood is selected as the actual/correct state.
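
For illustration only, the multiplication and ranking of candidate states described in Saxena [0065] can be sketched as follows. The candidate states, the probability values, and the function name are hypothetical; the sketch multiplies only two of the probabilities mentioned in the cited paragraph.

```python
def overall_likelihoods(sensor_prob, transition_prob):
    """Multiply, per candidate state, the probability that the state matches
    the received sensor data by the probability that it follows the
    previously identified state."""
    return {state: sensor_prob[state] * transition_prob[state] for state in sensor_prob}


# Hypothetical probabilities for three candidate states.
sensor_prob = {"kitchen": 0.6, "hallway": 0.3, "bedroom": 0.1}
transition_prob = {"kitchen": 0.5, "hallway": 0.4, "bedroom": 0.1}

overall = overall_likelihoods(sensor_prob, transition_prob)

# Sort candidates by overall likelihood and take the best one as the actual state.
ranked = sorted(overall, key=overall.get, reverse=True)
actual_state = ranked[0]
```

Under these assumed values the kitchen candidate has the largest product and would be selected as the actual/correct state.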

Thus, at the time of the invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Zhang with those of Saxena [0001] to allow the user to be aware of where devices are specifically located, the capability/range of each device, and how each device will specifically impact the surrounding environment as it relates to a desired result. 


Claim 13. Saxena further teaches wherein: obtaining external data comprises: obtaining one or more new images; providing the one or more new images to the machine learning model; obtaining a new output of the machine learning model corresponding to the one or more new images; and determining a new state of the area of the property using the new output of the machine learning model. Saxena [0030] teaches machine learning (e.g., local and/or cloud-based) is utilized to integrate input from many different devices/sensors (e.g., devices 102) to build a unique model of a user's presence, activities, and behavior. For example, an environment such as home is installed with devices 102 that can be controlled remotely or locally. The sensors of devices 102 may provide data about the presence and motion of people in an environment, measurements of the environmental properties such as light, temperature and humidity of the house, motion of subjects, and video of different locations of the environment. The machine learning may take into account the graph model to account for locational, functional, and behavior properties of the devices/sensors and their data during the learning process.

Saxena [0033] teaches hub 104 and/or server 106 includes one or more inference engines that convert sensor data received from one or more devices of devices 102 into state representations (e.g., state of a person's behavior, location, etc.). For example, the inference engine utilizes machine learning algorithms that rely on statistical and/or deep learning techniques. In some embodiments, hub 104 and/or server 106 includes a “vision engine” (e.g., ML Inference) that receives images/video from one or more camera sensors and analyzes the images/video using vision algorithms to infer a subject's (e.g., human, pet, etc.) location, behavior, and activities (e.g., spatial and motion features). In some embodiments, camera video data is analyzed to learn hand gestures of a person that control connected devices to a desired state.

and determining the current state of the area of the property using the external data comprises verifying a particular state of the two or more states as the current state of the area of the property using the new state. Saxena [0063] teaches at 316, for each of the candidate states, a likelihood that the candidate state is the next state after a previously identified state is determined. For example, a probability that the candidate state is the actual state after a previously determined state of a subject is determined. In some embodiments, this likelihood is determined using machine learning. For example, statistical and/or deep learning processing has been utilized to analyze observed state transitions between different states to determine a transition model of probabilities for each potential candidate state given a previous state. In one example, a motion detector sensor has been installed in each room of a house. The relative locations of the rooms of the house may be automatically determined by using machine learning to observe the pattern of sensor triggers as subjects move from one room to another room. Once the connections between the rooms are known, given a current room location of a subject, the possible adjoining rooms are known and each likelihood that the subject will visit a next room of the possible connected rooms may be determined. For example, given the previous state that indicates a location of a subject, the next state is limited to adjoining rooms that are reachable given the determined/observed rate of movement of the subject and elapsed time between the sensor data of the states. In some embodiments, the likelihood that the candidate state is the next state is determined using the graph model.

Claim 18. Saxena further teaches comprising: determining a prior state for the area of the property; determining that the prior state does not match the current state; and based on the prior state not matching the current state, generating instructions for one or more external electronic devices to change a mode of the one or more electronic devices; and transmitting the instructions to the one or more electronic devices. Saxena [0059] teaches determining the candidate states includes identifying all possible states that can be associated with the received sensor data. For example, all possible predefined activities of a subject that can be identified using data from a camera are identified. In some embodiments, determining the candidate states includes identifying the most likely candidate states. For example, rather than identifying all possible states, the most likely candidate states are identified. In some embodiments, the most likely candidate states are identified by analyzing associated sensor data received in 202 of FIG. 2. In some embodiments, determining the candidate states includes identifying a subject associated with newly received sensor data and identifying the last determined state for the subject. In some embodiments, the most likely candidate states are identified based on a previous current state. For example, for a given previous state (e.g., a location of a subject), only certain states are eligible to become the new current state (e.g., only locations adjoining the previous location) and these states are identified.

Saxena [0063] teaches at 316, for each of the candidate states, a likelihood that the candidate state is the next state after a previously identified state is determined. For example, a probability that the candidate state is the actual state after a previously determined state of a subject is determined. In some embodiments, this likelihood is determined using machine learning. For example, statistical and/or deep learning processing has been utilized to analyze observed state transitions between different states to determine a transition model of probabilities for each potential candidate state given a previous state. In one example, a motion detector sensor has been installed in each room of a house. The relative locations of the rooms of the house may be automatically determined by using machine learning to observe the pattern of sensor triggers as subjects move from one room to another room. Once the connections between the rooms are known, given a current room location of a subject, the possible adjoining rooms are known and each likelihood that the subject will visit a next room of the possible connected rooms may be determined. For example, given the previous state that indicates a location of a subject, the next state is limited to adjoining rooms that are reachable given the determined/observed rate of movement of the subject and elapsed time between the sensor data of the states. In some embodiments, the likelihood that the candidate state is the next state is determined using the graph model.
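
For illustration only, a transition model of the kind described in Saxena [0063] can be approximated by counting observed room-to-room transitions. Simple frequency counting is used here as a stand-in for the statistical and/or deep learning processing named in the reference, and the room names, the observed sequence, and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def transition_model(observed_rooms):
    """Estimate P(next room | previous room) from a sequence of observed
    sensor triggers by counting consecutive pairs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(observed_rooms, observed_rooms[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for prev, c in counts.items()
    }


# Hypothetical sequence of motion sensor triggers as a subject moves around.
observed = ["hall", "kitchen", "hall", "bedroom", "hall", "kitchen"]
model = transition_model(observed)

# Rooms never observed directly after "kitchen" are treated as not adjoining,
# so they are absent from model["kitchen"] and are not candidate next states.
next_candidates = model["kitchen"]
```

In this assumed trace the hall leads to the kitchen in two of three observed transitions, and the kitchen leads only back to the hall, which limits the candidate next states in the manner the cited paragraph describes.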

Saxena [0065] teaches at 320, for each of the candidate states, an overall likelihood that the candidate state is the actual state is determined. For example, for each candidate state, the overall probability that the candidate state is the correct state of a subject is determined. In some embodiments, determining the overall state includes multiplying together one or more of the probabilities determined in 314, 316, and 318. For example, at least a first probability that the candidate state corresponds to a received sensor data, and a second probability that the candidate state is the next state after a previously identified state are multiplied together to obtain the overall likelihood. In some embodiments, the candidate states are sorted based on their overall likelihoods and the candidate state with the best overall likelihood is selected as the actual/correct state.

Saxena [0071] teaches at 502, a plurality of inputs that indicate states is received. For example, each input identifies an associated state of the input. In some embodiments, the received inputs correspond to automatically controlling a controllable device. For example, the states of the inputs are to be analyzed to determine a new control setting and an action is to be potentially performed to implement the new control setting on the controllable device. In some embodiments, the received inputs include one or more states identified in 204 of FIG. 2. In some embodiments, the received inputs include one or more states (e.g., vector of states) identified using the process of FIG. 3. Each received input may indicate corresponding states of one or more subjects (e.g., user location, user activity, user preference, user type, user profile, user category, user proficiency level, user knowledge level, user knowledge of system feature, etc.), devices (e.g., sensor data, controllable device setting, device configuration, etc.), and/or environments (e.g., time, date, weather, humidity, air quality, geographical location, etc.). In some embodiments, the received states include previous states (e.g., history of previous states of subjects, devices, sensors, etc.). In some embodiments, the received states include a state identifying a previous user provided indication received in response to an inquiry or indication of a user interactive qualification. In some embodiments, one or more of the inputs that indicate the states were determined by analyzing received sensor information. For example, sensor information from one or more devices of devices 102 of FIG. 1A and/or sensors 124 of FIG. 1B is received and utilized to determine the states. In some embodiments, at least one of the plurality of inputs indicates a scene mode. For example, various scene modes (e.g., breakfast scene, sleeping scene, reading scene, movie scene, away mode, etc.) define a desired group of controllable device settings for a particular activity or status of one or more users. The scene mode may be manually specified by a user and/or automatically determined.

Saxena [0072] teaches at least one of the plurality of inputs indicates a history of: states, outputs, output types, settings, and/or configurations. For example, the input identifies one or more previously determined outputs identifying controllable device property settings. In some embodiments, one or more of the inputs states are associated with a measure of confidence that the corresponding state is the correct state. For example, it may be difficult to determine the exact correct state based on received sensor information and candidate states are identified along with associated confidence score values.

Saxena [0118] teaches at 710, if a controllable device configuration setting of the selected output candidate is a change from a corresponding current setting of a property of a controllable device, the action is performed by providing an instruction to the controllable device to change to the specified device configuration setting of the selected output candidate. In some embodiments, the action is not associated with a persistent property state of the controllable device and the action is always performed in 710. For example, performing the action includes sending an alert/notification to a remote user based on information captured using one or more sensor devices. 
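
For illustration only, the conditional instruction described in Saxena [0118] (an instruction is provided only when the selected setting differs from the current setting) can be sketched as follows. The function name, the `send` callback, and the setting strings are hypothetical.

```python
def apply_selected_setting(current_setting, selected_setting, send):
    """Provide an instruction to the controllable device only when the
    selected setting is a change from the current setting.

    `send` is a hypothetical callback that transmits the instruction.
    Returns True when an instruction was sent.
    """
    if selected_setting != current_setting:
        send(selected_setting)
        return True
    return False


# Record transmitted instructions in a list instead of contacting a device.
sent = []
changed = apply_selected_setting("curtain_closed", "curtain_open", sent.append)
unchanged = apply_selected_setting("curtain_open", "curtain_open", sent.append)
```

Here only the first call transmits an instruction, because the second call requests the setting the device already has.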

Claim(s) 15-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2021/0027485 A1 to Zhang in view of US 10755543 B1 to Usie and in further view of US 2020/0359175 A1 to Schobel et al., hereinafter, “Schobel”.
Claim 15. Zhang is silent on claim 15, however, Schobel, in the field of monitoring a device in an environment, teaches wherein: determining the current state of the area of the property comprises determining a device is present in the area of the property, determining that the prior state for the area of the property comprises determining that device was previously not present in the area of the property, and notifying the user device of the change in the state of the area of the property comprises transmitting a notice to the user device indicating at least one of the following: a change in state of the area of the property has occurred, the device was previously not present in the area of the property, and the device is currently present in the area of the property. Schobel [Abstract] teaches a first computing device can send notifications at times that the first computing device is not in an expected location. A user of a second computing device can remotely configure an expected location for the first computing device, which may be a particular location for a certain period of time. During that time, the first computing device can monitor its own location and check whether it is within the expected location. If the first computing device unexpectedly leaves or fails to enter the expected location, the first computing device may transmit a notification to the second computing device. Similarly, if the first computing device loses connectivity with other devices, a server device may notify the second computing device that the location of the first computing device cannot be determined.

Schobel [0005] teaches particular implementations provide at least the following advantages. For example, particular implementations provide a parent with ways to define expected locations for a child's device and receive notifications if the child's device is not at the expected location. This improves the child's safety because the child's device can be used to ensure that the child is in an expected and safe location (e.g., a school) rather than an unsafe location. The notifications improve the parent's use of her computing device as the parent receives notifications rather than having to contact the child periodically or worry about the child's location. These implementations are quicker than known methods because the parent can immediately receive a notification rather than be notified by another person that the child is missing. Particular implementations are advantageous because the parent is notified even when the child's device is unreachable when it was expected to notify regarding its location. This again improves the parent's knowledge of the child's location as the parent can then use some other method to locate the child. Also, the child's device monitors its own location and does not exchange location information with, for example, a server. The child's device may perform low-power location checks relative to the device's surrounding location rather than making power-intensive location determinations that require communicating with external servers and making more complex location calculations. This reduces battery usage, data usage, and processor cycles for the child's device. It also increases privacy for the child as no other device tracks or otherwise processes the child's location.

Schobel [0058] teaches the parent may wish to receive a notification if the child's monitored device 138 enters the home between the hours of 4:10 PM and 8:00 AM. This may be a notification that the child's device has entered an expected location, but the entry is not at an expected time. For example, the parent may have set 4:10 PM as a deadline time before which the child must return from outside the home. If the child is late and returns after 04:10 PM, monitored device 138 may be configured to transmit a notification.
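
For illustration only, the time window in Schobel [0058] (4:10 PM to 8:00 AM) spans midnight, so a check for an entry inside that window has to handle the wrap-around. The sketch below shows one way to do so; the function names are hypothetical, and treating the window as including its start time and excluding its end time is an assumption.

```python
from datetime import time

WINDOW_START = time(16, 10)  # 4:10 PM, from Schobel [0058]
WINDOW_END = time(8, 0)      # 8:00 AM, from Schobel [0058]


def in_window(t, start=WINDOW_START, end=WINDOW_END):
    """Return True when time `t` falls in [start, end), allowing the
    window to wrap past midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def should_notify(entered_home, entry_time):
    """Notify when the monitored device enters the home inside the window."""
    return entered_home and in_window(entry_time)
```

For example, under these assumptions an entry at 5:00 PM or at 7:59 AM would trigger a notification, while an entry at noon would not.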

Thus, at the time of the invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Zhang with those of Schobel [0005] to allow the location of a device to be determined and reported. 

Claim 16. Schobel further teaches wherein: determining the current state of the area of the property comprises determining a device is connected to a second device, determining that the prior state for the area of the property comprises determining that device was previously not connected to the second device, and notifying the user device of the change in the state of the area of the property comprises transmitting a notice to the user device indicating at least one of the following: a change in state of the area of the property has occurred; the device was previously not connected to the second device in the area of the property; and the device is currently connected to the second device in the area of the property.  Schobel [Abstract] teaches  a first computing device can send notifications at times that the first computing device is not in an expected location. A user of a second computing device can remotely configure an expected location for the first computing device, which may be a particular location for a certain period of time. During that time, the first computing device can monitor its own location and check whether it is within the expected location. If the first computing device unexpectedly leaves or fails to enter the expected location, the first computing device may transmit a notification to the second computing device. Similarly, if the first computing device loses connectivity with other devices, a server device may notify the second computing device that the location of the first computing device cannot be determined.

Schobel [0058] teaches the parent may wish to receive a notification if the child's monitored device 138 enters the home between the hours of 4:10 PM and 8:00 AM. This may be a notification that the child's device has entered an expected location, but the entry is not at an expected time. For example, the parent may have set 4:10 PM as a deadline time before which the child must return from outside the home. If the child is late and returns after 04:10 PM, monitored device 138 may be configured to transmit a notification.

Schobel [0093-0095]

Schobel [claim 4] teaches generating, by the first device, a second notification configured to inform the second user of the second device that the first device has unexpectedly exited the expected location; determining, by the first device at a first time, that the first device cannot establish a connection with the second device; storing the second notification on the first device; determining, by the first device at a second time, that the first device has established a connection with the second device;

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2019/0221104 A1 to Hesford et al. 
Hesford [col. 5, lines 7-23] teaches the alarm 104 may be adapted to detect other conditions besides intrusion. Without limitation, a water sensor 116 may be provided to detect a flooding condition, particularly where some part of the structure 102 is below grade. Other alarm states may be triggered based upon other predefined environmental conditions. For example, environmental conditions may include one or more of: a threshold level of carbon dioxide, detection of propane or natural gas, a proximity sensor affixed to a person or animal going beyond a defined perimeter (such as a person with limited mental capacity, a child, a pet), a power outage, an ambient temperature falling below a threshold lower limit or exceeding an upper limit, or other condition that may be automatically ascertained. A threshold distance from a defined perimeter may be monitored via electronic timing signals transmitted from a local server and responded back from a device affixed to a person or pet.  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571)272-1681. The examiner can normally be reached 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DELOMIA L GILLIARD/Primary Examiner, Art Unit 2661