Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
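As an illustrative aid only (it is not part of MPEP § 2181 and carries no legal weight), the three-prong test above can be sketched as a small decision function. The class name, field names, and function name are hypothetical and were chosen for this sketch; each boolean stands in for a finding that must actually be made by analyzing the claim language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimLimitation:
    """Hypothetical record of the findings made about one claim limitation."""
    uses_means_or_step: bool               # literally recites "means" or "step"
    uses_generic_placeholder: bool         # nonce term such as "unit" or "module"
    modified_by_functional_language: bool  # e.g. "for ...", "configured to ..."
    recites_sufficient_structure: bool     # structure, material, or acts for the function


def invokes_112f(lim: ClaimLimitation) -> bool:
    """Return True only when prongs (A), (B), and (C) are all met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.modified_by_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# A "unit that captures an image" limitation: placeholder plus function, no structure.
capturing_unit = ClaimLimitation(False, True, True, False)
# The same limitation amended to recite sufficient structure fails prong (C).
amended_unit = ClaimLimitation(False, True, True, True)
```

Under these assumptions, `invokes_112f(capturing_unit)` evaluates to `True` and `invokes_112f(amended_unit)` evaluates to `False`, which mirrors the amendment option offered to applicant for avoiding interpretation under 35 U.S.C. 112(f).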
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  
The first and second image capturing units are described in the specification as cameras (see 0017). The data-for-learning generating unit (see 0018), learning unit (0018), person detecting unit (0018), and apparatus control unit (0023) are described in the cited paragraphs of the specification, which provide a sufficient description to disclose the acts for performing the claimed functions.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
    


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4, and 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Sanjay et al. (PG/PUB 20190042854) in view of Saxena (USPN 10270609).
Claim 1:
Sanjay teaches a facility apparatus control device (ABSTRACT, Figure 2-see sub-component combinations for analysis and control) but does not expressly teach the learning unit and controlling based on the learned result and image limitations as described below.  Saxena teaches the learning unit and controlling based on the learned result and image limitations as described below.
a first image capturing unit that captures an image of a first space (Sanjay, 0111 e.g. “Example 31 may include the subject matter of example 30, further including the set of sensors, where the set of sensor devices includes two or more of a camera device, a microphone, a heartbeat monitor, and a respiration monitor,” see also 0051 e.g. “image frames”)
a second image capturing unit that captures an image of a second space (Sanjay, 0040, 0049, 0057 e.g. see multiple spaces including a classroom, spaces, retail store, etc., where a camera/video sensor is associated per space)
a data-for-learning generating unit that generates location information on a person present in the first space by analyzing a first image and extracts, from the first image, an image of a determined range covering the person present in the first space, the first image being an image captured by the first image capturing unit (Sanjay, 0040 e.g. see localization system as reading on a data-for-learning generating unit and see predetermined range as a set of features/characteristics of the person , see “A localization system 215 may also make use of sensor data (e.g., 235 b) (or feature vectors generated from the sensor data)…. For instance, within a retail store, the locations and movement of particular consumers within the retail store environment may be detected and tracked, using sensor data 235b as inputs, such as video sensor data,” 0040,  see also “As illustrative examples, a set of features may be extracted from video data and provided as an input to a neural network based model to recognize emotion displayed by users captured in the video (e.g., based on characteristics such a facial expression, gait, posture, and gestures capable of being captured in video), see also location data generated from the data-for-learning generating unit, 0040.”))
a learning unit that learns (Saxena, see ML inference engine as a learning unit for determining am estimated state vector/learned result representing the claimed features based on learning the features present within an image, see Col 4 lines 36-67, see also applying training data for use in determining the features within the image, Col 10 lines 4-22, Col 8 lines 3-67  e.g. “In some embodiments, a state vector including the activity state (e.g., general activity state—reading, sleeping, cooking, etc. and detailed activity state—reading-opening-book, placing-book-down, etc.) of a subject is estimated. In one embodiment, the state vector includes other relevant information such as time-of-the-day, weather, number of people in the house, subject locations, and current activities being performed. In some embodiments, the state vector includes one or more controllable device states. In some embodiments, the state vector includes a listing of subjects with each region (e.g., room) of an environment. In some embodiments, a learning method learns the weight of a function that maps the values of the sensor data to the state vector. In some embodiments, a learning method utilizes deep learning to learn a multi-staged non-linear mapping from the sensor data to the state vector.”, see also training the inference engine, see  Col 10 lines 4-22 e.g. “For example, statistical and/or deep learning processing has been utilized to analyze a training data set of example associations between different sensor data and corresponding states to determine associated probabilities of association between different sensor data and different states,” (e.g. sensors include vision and provide occupant location and activity information for determining an association and user states.  
As understood, the sensor data, including vision data, provide location and user activity states that are correlated to known states and locations ), see also  Sanjay for providing extracted location data via image analysis for training), using the location information generated by the data-for-learning generating unit and a first extracted image, a feature of the person present in the first space and a staying location occupied by the person present in the first space (Saxena, see inference engine for determining a state vector based on training data, i.e., determining within an image a particular person and activity occurring within a  location, the state vector representing a location and activity of a person based upon training the inference engine to identify a location and activity of a user using within an image, and see Sanjay as providing training data using extracted images and location data), the first extracted image being an image extracted by the data-for-learning generating unit (Sanjay, 0042 e.g. see feature of the person present in the first space and a staying location as an emotional state at a particular location, see  “In some implementations, an example emotion detection engine 255 may utilize machine learning models to operate upon features or feature vectors (e.g., 260) derived from sensor data generated by sensors in the environment, see also feature vectors as comprising a first extracted image,  i.e., “ As illustrative examples, a set of features may be extracted from video data and provided as an input to a neural network based model to recognize emotion displayed by users captured in the video (e.g., based on characteristics such a facial expression, gait, posture, and gestures capable of being captured in video), see also location data generated from the data-for-learning generating unit, 0040)
     One of ordinary skill in the art before the effective filing date of the invention, given an inference engine that estimates a state vector representing a user identification, location, and activity based on training data, as per Saxena, where the training data includes extracted location information via image analysis and user features extracted via image analysis and from a second camera image, as per Sanjay, would achieve an expected and predictable result of providing training data comprising user location and extracted features (e.g. gait, posture, activity type, and location), from which an inference engine is trained to identify the activity and location within an image.  In other words, the training data of Sanjay is modified to include the extracted location data via image analysis such that given an image, the inference engine will recognize, given a camera image, the user activity occurring within the image at a location comprising learned features (e.g. sitting, sleeping, or standing, etc., within a location and/or user moving to a target location).

Sanjay as modified by Saxena teaches:
a person detecting unit that detects a person present in the second space by analyzing a second image and extracts, from the second image, an image of a determined range covering the detected person, the second image being an image captured by the second image capturing unit (Sanjay, 0040 e.g. see localization system as a person detecting unit , see “ For instance, within a retail store, the locations and movement of particular consumers within the retail store environment may be detected and tracked, using sensor data 235b as inputs, such as video sensor data (and other image data), infrared (IR) sensor data, voice sensor data, among other example information,” see also “A localization system 215 may also make use of sensor data (e.g., 235 b) (or feature vectors generated from the sensor data), see also “extracted features”  as reading on a determined range, see “ Such characteristics may include, for instance, the words or sounds spoked by the person, the voice inflections exhibited in the person's speech, gestures or posture exhibited by the person, the gait of the person, the facial expressions of the person, the heartbeat of the person, the respiration patterns of the person, among other example characteristics, see also “In some implementations, a localization engine 250 may employ localization functions, machine learning algorithms, and other techniques to identify individual persons within an environment and further identify these persons' positions within the environment. For instance, within a retail store, the locations and movement of particular consumers within the retail store environment may be detected and tracked, using sensor data 235b as inputs, such as video sensor data (and other image data), infrared (IR) sensor data, voice sensor data, among other example information. 
The localization engine 250 may determine the unique identities of the people within the store (while applying security measures to anonymize the data 235 b) using facial recognition, voice recognition, and other recognition techniques (e.g., based on the clothing of the person, the gait or other gestures used by the person, and other identifying characteristics captured in the sensor data 235 b). 
an apparatus control unit that controls a facility apparatus installed in the first space with use of a second extracted image and a learned result obtained by the learning unit, the second extracted image being an image extracted by the person detecting unit (Saxena, see controlling based upon learned result comprising learned user activities and location within a second extracted image, claim 1 (e.g. see automation rules associated with image analysis and what the user is doing, Col 8 lines 26-30, “identified states,” see also “image data to detect person and activity performed,” Col 7 lines 25-35); see also Sanjay as controlling based on emotion at a location, 0045 e.g. “In some cases, results described in an emotion heat map may be utilized to trigger one or more actuators (e.g., 115 a) to automate adjustments to environmental characteristics within an environment. For instance, based on trends described in the emotion heat map, the temperature within the environment may be controlled, music selection or volume may be changed (e.g., to try to counter or otherwise influence the emotions being experienced within the environment), change the type of images or video being displayed by a display device within the environment (to affect a change or reinforce emotions being detected within a location in the environment where the display is likely being seen and affecting emotions of viewers), among other examples. For instance, the type of emotion being experienced may be described in the emotion heat map, as well as the location in which the emotion is being experienced. Depending on the type of the described emotion and the location (and variable characteristics applicable to that location), the emotion heat map may be used directly to autonomously control characteristics within the environment”; see also Saxena, Figure 4-402, 404, 406 e.g. controlling based on correlated states of a user)
    One of ordinary skill in the art before the effective filing date of the claimed invention, controlling at least HVAC/actuator devices based upon a learned emotion or activity at a location, would achieve an improved invention by automatically controlling devices based upon determining what a user is doing at a particular location, where the system is trained to recognize which image features match recognized state features in any one captured image (e.g., emotion at a location in an image, activity of a user at a location, reading, moving to another location, on the bed, etc.).  Sanjay teaches controlling based on a learned emotion corresponding to a location.  Saxena teaches controlling based upon identifying features (e.g. activity type and location) within an image.  Accordingly, one of ordinary skill in the art, given a system trained to identify occupant features within an image as a basis for control, would realize an improved invention by automatically controlling a system based on identifying at least occupant location and corresponding activities.  Saxena is reasonably pertinent to the problem of controlling based on image analysis, as described in its Background, and would have yielded an improved invention by automatically determining control rules based on learned user activity in combination with controlling based on learned emotional states of a user at a particular location, as per Sanjay.


Claim 4. 
The cited combination of prior art teaches the facility apparatus control device according to claim 1 but does not expressly teach displaying the learned result.  Saxena teaches a learned result as a vector depicting at least occupant activity and location, and Sanjay teaches a display.
     One of ordinary skill in the art adapting the display of Sanjay to display the learned result of Saxena would achieve an expected and predictable result of apprising a user of at least a recognized activity per location for visual confirmation.  Sanjay teaches displaying visual confirmation of a learned result (emotion) at a location.  Saxena teaches determining a learned result as at least a recognized user activity per location.  Accordingly, adapting the display of Sanjay would have yielded an improved invention by providing visual feedback to a user.

Claim 6. 
 The cited combination of prior art teaches the facility apparatus control device according to claim 1, wherein the facility apparatus is an air-conditioning device (Sanjay, 0016 e.g. “HVAC appliance.”)


Claim 7.
  The cited combination of prior art teaches the facility apparatus control device according to claim 1 but does not expressly teach the external computer limitation as described.  Sanjay teaches an external computer as a cloud-based server as described below. Saxena teaches using a cloud computer to perform the analysis (Figure 1C-130).
    wherein the learning unit utilizes an external computer system to learn a feature of the person present in the first space and a staying location in the first space, occupied by the person present in the first space (Saxena, see the cloud computer as an outside computer that performs the image-based learning, supra claim 1, Col 5 lines 15-17, Col 8 lines 3-18; see also Sanjay, see cloud-based server as an external computer and see the method for learning a feature of a person present and a staying location, see emotion map determination, Figure 5, supra claim 1 analysis for an adapted model configured to receive location and extracted images to determine user emotional states at a location)
    One of ordinary skill in the art before the effective filing date of the claimed invention adapting a cloud server of Sanjay to execute the analysis in place of or in addition to a local processor would achieve an expected and predictable result of off-loading processing to a server. Saxena teaches using a cloud server to determine a person’s staying location and feature via cloud computing.  Sanjay teaches the application of cloud computing for analysis.  Accordingly, adapting the cloud server of Sanjay to determine the learned results of Saxena would provide an improved invention for the same reasons Sanjay and Saxena are combinable as set forth in claim 1.

Claim 8. 
    The cited combination of prior art teaches a facility apparatus control method comprising:
a first image capturing step of capturing an image of a first space by a facility apparatus control device; supra claim 1
a data-for-learning generating step of, by the facility apparatus control device, generating location information on a person present in the first space by analyzing a first image and extracting, from the first image, an image of a determined range covering the person present in the first space, the first image being an image captured at the first image capturing step; supra claim 1
a learning step of, by the facility apparatus control device, learning, with use of the location information generated at the data-for-learning generating step and a first extracted image, a feature of the person present in the first space and a staying location occupied by the person present in the first space, the first extracted image being an image extracted at the data-for-learning generating step; supra claim 1
a second image capturing step of capturing an image of a second space by the facility apparatus control device; supra claim 1
a person detecting step of, by the facility apparatus control device, detecting a person present in the second space by analyzing a second image and extracting, from the second image, an image of a determined range including the detected person, the second image being an image captured at the second image capturing step; supra claim 1 
an apparatus controlling step of, by the facility apparatus control device, controlling a facility apparatus installed in the first space with use of a second extracted image and a learned result obtained at the learning step, the second extracted image being an image extracted at the person detecting step, supra claim 1

Claim 8 is rejected under the same combination and prior art set forth in claim 1.

Claim 2:
Sanjay as modified by Saxena teaches the facility apparatus control device according to claim 1 wherein the apparatus control unit determines whether or not a feature of the person present in the second space matches a feature of a person already learned by the learning unit with use of the second extracted image and the learned result (Saxena, see learning what the occupant is doing at a space within a second extracted image, i.e., reading a book, for example, based on training data that has been utilized to recognize the activity based on feature extraction, supra claim 1)
    when the features match each other, the apparatus control unit controls a facility apparatus installed in the first space (Sanjay, as modified, supra claim 1, where, based on identifying what a user is doing at a particular location, an automation rule is then implemented.  As interpreted, control follows training the system to identify the activity via image analysis and training, supra claim 1.)


Claim 3. 
  The cited combination of prior art teaches the facility apparatus control device according to claim 2, wherein the apparatus control unit controls, among two or more facility apparatuses installed in the first space, a facility apparatus associated with a staying location where the person present in the second space stays in the first space (Saxena, Figure 1A, see light, thermostat, door lock, speaker, fan, etc.)
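For orientation only, the control flow recited in claims 1-3 (learn a person's feature together with a staying location, check whether a feature taken from the second extracted image matches a learned feature, and on a match control the apparatus associated with that staying location) can be sketched as below. This is a toy sketch under stated assumptions, not the applicant's disclosed implementation and not the method of Sanjay or Saxena: features are plain numeric tuples, matching uses cosine similarity with an arbitrary 0.9 threshold, and every name is hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Feature = Tuple[float, ...]  # toy stand-in for features taken from an extracted image


def cosine(a: Feature, b: Feature) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class LearnedResult:
    """Pairs each learned person feature with that person's staying location."""
    entries: List[Tuple[Feature, str]] = field(default_factory=list)

    def learn(self, feature: Feature, staying_location: str) -> None:
        self.entries.append((feature, staying_location))

    def match(self, feature: Feature, threshold: float = 0.9) -> Optional[str]:
        """Return the staying location of the closest learned feature, if close enough."""
        best = max(self.entries, key=lambda e: cosine(e[0], feature), default=None)
        if best is not None and cosine(best[0], feature) >= threshold:
            return best[1]
        return None


def select_apparatus(learned: LearnedResult, second_feature: Feature,
                     apparatus_by_location: Dict[str, str]) -> Optional[str]:
    """Pick the first-space apparatus tied to the matched person's staying location."""
    location = learned.match(second_feature)
    return apparatus_by_location.get(location) if location is not None else None


learned = LearnedResult()
learned.learn((1.0, 0.0), "desk A")   # from the first extracted image
learned.learn((0.0, 1.0), "desk B")
apparatuses = {"desk A": "air conditioner 1", "desk B": "air conditioner 2"}
chosen = select_apparatus(learned, (0.9, 0.1), apparatuses)  # second extracted image
```

With these toy values the feature from the second space is closest to the person learned at "desk A", so `chosen` is "air conditioner 1"; a feature that matches no learned person, such as `(0.5, 0.5)`, selects no apparatus.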


Alternative Interpretation of claim 2
Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Sanjay et al. (PG/PUB 20190042854) in view of Saxena (USPN 10270609) in view of Jin et al. (USPN 10489690).

Claim 2. 
The combination of prior art teaches the facility apparatus control device according to claim 1 but does not expressly teach determining the features of a person present limitations as described below.   
    Jin et al. teaches wherein the apparatus control unit determines whether or not a feature of the person present in the second space matches a feature of a person already learned by the learning unit with use of the second extracted image and the learned result, and when the features match each other, the apparatus control unit controls a facility apparatus installed in the first space (Jin et al., see a learned result/“facial expressions corresponding to emotions” used as training data for use in training the model to identify a feature of a person in an extracted image of a space after being trained to identify the feature, Col 14 lines 19-50, and see a learned result of Sanjay as the determined emotional state data at a location (e.g. as applied, training data) of a user including facial expressions, supra claim 1, and see Sanjay, as modified by Lee, as controlling a system based on recognizing an image of a user, supra claim 1)
    One of ordinary skill in the art before the effective filing date of the claimed invention using the learned result/emotional state of a user, as per Sanjay, as training data for identifying a feature of a person present in the second extracted image, as per Jin et al., would achieve an expected and predictable result of identifying that a facial expression at a location corresponds to a particular emotion.  The use of learning methods based on training data provides a means for automatically classifying image features into respective emotions from which to control the system to optimize user comfort.
   


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Sanjay et al. (PG/PUB 20190042854) in view of Saxena (USPN 10270609) in view of Pavlidis (USPN 6996256).
Claim 5. 
 The cited combination of prior art teaches the facility apparatus control device according to claim 1 but does not expressly teach the thermal image limitation as described below.  Pavlidis teaches the thermal image limitation as described below.
    wherein the first image and the second image are thermal images (Pavlidis, ABSTRACT, see also Sanjay, 0040 e.g. infrared (IR) sensor data, see also Matsuoka, 0051 e.g. “camera also includes infrared LEDs for night vision,” see also Lee as providing multiple cameras, Figure 5)
    One of ordinary skill in the art before the effective filing date of the claimed invention, adapting each camera to use thermal image analysis or providing an additional camera that uses thermal image analysis to identify occupant states, would achieve an expected and predictable result of identifying emotions. Sanjay teaches determining emotional states of a user at a particular location using image analysis, while Pavlidis teaches determining user emotional states via thermal images based on employing learning methods.  Accordingly, Pavlidis is reasonably pertinent to the problem of learning user emotional states and would have yielded an improved invention by expanding upon the user emotions that are identified.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Image analysis for recognition purposes
10387716  20190205625  11371739  20210285671 (see location as input)  11424028  20210398032
HVAC control based on learning
10871302  7298871  20030227439  7663502  20210279475  11043090  2021035616  11301779
20200334831  10748024  10380429  20190122065  20190102812  10205891  9791872  9477215  20210056161  202102799475  10775064
Comfort constraints
20200355391  11371739

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARRIN D DUNN whose telephone number is (571)270-1645. The examiner can normally be reached M-Sat (10-8) PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rocio Del Mar Perez-Velez can be reached on 571-270-5935. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DARRIN D DUNN/Patent Examiner, Art Unit 2117