Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on October 11, 2022 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 5-17, 19-21, 23-24, and 28-37 are rejected under 35 U.S.C. 103 as being unpatentable over US 2009/0261979 A1 to Breed et al. (hereinafter, “Breed”) in view of US 2015/0009010 A1 to Biemer, US 2014/0368626 A1 to John Archibald et al. (hereinafter, “John Archibald”), “Compression Independent Reversible Encryption for Privacy in Video Surveillance” to Carrillo et al. (hereinafter, “Carrillo”), and US 2016/0065906 A1 to Boghossian et al. (hereinafter, “Boghossian”).
Claim 1. A driver monitoring system for monitoring a driver of a vehicle, the system including: Breed [0039] teaches the present invention relates generally to systems and methods for monitoring a driver of a vehicle to determine whether the driver is falling asleep or otherwise unable to operate the vehicle

Breed [0175] teaches when this system is used to monitor the driver as shown in FIG. 5, with appropriate circuitry and a microprocessor, the behavior of the driver can be monitored.

Breed [0311] teaches the use of multispectral imaging can be a significant aid in recognizing objects inside and outside of a vehicle.

Breed further teaches (iv) a driver behaviour sub-system connected to the computer vision subsystem and configured to gather information on the driving behaviour of the driver. Breed [0081] teaches the determination of these rules is important to the pattern recognition techniques used in at least one of the inventions disclosed herein. In general, three approaches have been useful, artificial intelligence, fuzzy logic and artificial neural networks (including cellular and modular or combination neural networks and support vector machines--although additional types of pattern recognition techniques may also be used, such as sensor fusion) 

Breed [0350] teaches Considering now FIG. 9, the normalized data from the ultrasonic transducers 6, 8, 9 and 10, the seat track position detecting sensor 74, the reclining angle detecting sensor 57, from the weight sensor(s) 7, 76, from the heartbeat sensor 71, the capacitive sensor 78 and the motion sensor 73 are input to the neural network 65, and the neural network 65 is then trained on this data.

Breed [0175] teaches when this system is used to monitor the driver as shown in FIG. 5, with appropriate circuitry and a microprocessor, the behavior of the driver can be monitored.

Breed [0188] teaches the position of the driver, and particularly of the driver's head, can be monitored over time and any behavior, such as a drooping head, indicative of the driver falling asleep or of being incapacitated by drugs, alcohol or illness can be detected and appropriate action taken

Breed [0382] teaches other training of the processor or pattern recognition technique used thereby can involve motion statistics that lead to an expectation as to what a particular driver does when he is alert. If the driver passes the test, then the thresholds can be modified. In particular, as a person begins to fall asleep, he can execute some jerking motions or other telltale motions that will be different from his normal alert behavior. Any such out-of-the-ordinary movements can evoke the test of his response time. Such unordinary movements can be programmed into the trained pattern recognition technique in the processor, i.e., embodied on computer-readable medium accessible by the processor, so that once one of these movements is detected by the pattern recognition technique in the processor, the processor would control the reactive component(s) accordingly, or otherwise monitor detection of a response to cue or other required response by the driver indicating alertness.

Breed [0389] teaches the trained pattern recognition technique applied by the processor may be trained on all measures of occupant behavior that might indicate driver attentiveness or lack thereof while driving, i.e., whether the vehicle is moving or not. For example, vehicle parameters may also be analyzed such as acceleration, steering wheel angle, angular motion of the vehicle, etc.

While Breed teaches acquiring internal and external images, Breed fails to explicitly teach a forward facing camera and a rear facing camera. Biemer, in the field of computer vision systems (vehicle vision) for detecting a driver, teaches (i) a forward facing camera arranged to capture an image of the external environment; Biemer [0004] teaches the present invention provides a collision avoidance system or vision system or imaging system for a vehicle that utilizes one or more cameras (preferably one or more CMOS cameras) to capture image data representative of images interior and/or exterior of the vehicle, and provides a driver's head detection and recognition system, which, upon detection and recognition of the driver's head and face

Biemer [0008] teaches vehicle vision system and/or driver assist system and/or object detection system and/or alert system operates to capture images exterior of the vehicle and may process the captured image data to display images and to detect objects at or near the vehicle and in the predicted path of the vehicle, such as to assist a driver of the vehicle in maneuvering the vehicle in a rearward direction. The vision system includes an image processor or image processing system that is operable to receive image data from one or more cameras and provide an output to a display device for displaying images representative of the captured image data.

Biemer [0009] teaches referring now to the drawings and the illustrative embodiments depicted therein, a vehicle 10 includes an imaging system or vision system 12 that includes at least one exterior facing imaging sensor or camera, such as a rearward facing imaging sensor or camera 14a (and the system may optionally include multiple exterior facing imaging sensors or cameras, such as a forwardly facing camera 14b at the front (or at the windshield) of the vehicle, and a sidewardly/rearwardly facing camera 14c, 14d at respective sides of the vehicle), which captures images exterior of the vehicle, with the camera having a lens for focusing images at or onto an imaging array or imaging plane or imager of the camera (FIG. 1).

Biemer [0104] teaches the vision system (utilizing the forward facing camera and a rearward facing camera and other cameras disposed at the vehicle with exterior fields of view) may be part of or may provide a display of a top-down view or birds-eye view system of the vehicle or a surround view at the vehicle

(ii) a rear facing camera arranged to capture an image of the driver; 
Biemer [0009] teaches the vision system 12 includes an interior camera 22, which may be operable to capture images of the driver's head area so that the system may detect and recognize the head and face of the driver of the vehicle, as discussed below. The camera 22 may be disposed at or in the mirror head of the mirror assembly (and may be adjusted with adjustment of the mirror head) or may be disposed elsewhere within the interior cabin of the vehicle, with its field of view encompassing an area that is typically occupied by a driver of the vehicle

(iii) a computer vision sub-system connected to the cameras and programmed to track objects, in which the computer vision sub-system is an edge computing based sub-system that is located in the vehicle, and in which the computer vision sub-system includes at least some of an edge layer; Biemer [Abstract] teaches a vision system of a vehicle includes an interior camera disposed in an interior cabin of a vehicle and having a field of view interior of the vehicle that encompasses an area typically occupied by a head of a driver of the vehicle.

Biemer [0104] teaches the vision system (utilizing the forward facing camera and a rearward facing camera and other cameras disposed at the vehicle with exterior fields of view) may be part of or may provide a display of a top-down view or birds-eye view system of the vehicle or a surround view at the vehicle

Per specification [0106] the edge layer processes raw sensor data or video data at an ASIC embedded in a sensor or at a gateway/hub; [0107] The edge layer includes a computer-vision system or engine that (a) generates from a pixel stream a digital representation of a person or other object and (b) determines attributes or characteristics of the person or object from that digital representation and (c) enables one or more networked devices or sensors to be controlled. [0108] The edge layer can detect multiple people in a scene and continuously track or detect one or more of their: trajectory, pose, gesture, identity. [0109] The edge layer can infer or describe a person's behaviour or intent by analysing one or more of the trajectory, pose, gesture, identity of that person. 

John Archibald [0009] teaches as a non-limiting example of computer vision application-specific processing, the application processor may extract a feature (e.g., a visual descriptor) or a set of features as needed, instead of all at once.

John Archibald [0062] teaches a low-power device (not shown in FIG. 1) within a processing system may generate image statistics for each frame 102-106 in the video stream after each frame 102-106 is captured. For example, the low-power device may generate image statistics for each frame 102-106 based on the corresponding pixel representations 112-116.

John Archibald [0164] teaches memory 1408 may include threshold data 1434, selection criteria 1438, classification model(s) 1428, user preference(s) 1422, timestamp(s) 1494, feature set data 1432, context data 1490, intermediate context data 1492, an application 1444, or any combination thereof. In a particular embodiment, at least a portion of the memory 1408 may correspond to the memory 306 of FIG. 3, the memory 420 of FIG. 4, the memory 520 of FIG. 5, the DDR 914 of FIG. 9, or a combination thereof. As used herein, a "context" of an image may include information inferred or determined from the image and/or feature(s) extracted therefrom. For example, a context of an image may include a specific location, a specific person, a specific object, a specific activity, or any combination thereof. To illustrate, a context of an image of a particular person attending a meeting in a particular conference room may include the particular person (e.g., "John"), the activity of attending the meeting (e.g., "attended group meeting"), the particular conference room (e.g., "Room 2.134"), objects in the conference room (e.g., "whiteboard"), or any combination thereof.

John Archibald [0011-0014] teaches in another particular embodiment, an apparatus includes means for generating a control signal based on a change amount between first sensor data captured by a sensor and second sensor data captured by the sensor, where the means for generating the control signal is included in a first processing path. The apparatus also includes means for performing computer vision application-specific processing on the second sensor data based on the control signal, where the means for performing the computer vision application-specific processing is included in a second processing path. 

John Archibald [0096] teaches the control signal 416 may be provided to the application processor 418. The control signal 416 may indicate whether to "wake up" the application processor 418 to perform application-specific processing (e.g., computer vision application-specific processing) on the second frame 104. For example, if the change detection circuit 414 determines that the change amount between the first frame 102 and the second frame 104 does not satisfy a threshold, the control signal 416 may keep the application processor 418 in a "sleep" state to conserve power. If the change detection circuit 414 determines that the change amount between the first frame 102 and the second frame 104 satisfies the threshold, the control signal 416 may wake up the application processor 418 to perform application-specific processing on the second frame 104. Thus, the change detection circuit 414 may also provide sensor data 430 (e.g., the second frame 104) to the application processor 418 for computer vision application-specific processing.
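
For illustration only, the frame-gating behavior quoted from John Archibald [0096] can be sketched as follows. This is a hypothetical sketch and not code from the reference: the function names, the mean-absolute-difference metric, the ">=" reading of "satisfies the threshold", and the sample pixel values are all assumptions.

```python
from typing import Sequence


def change_amount(prev: Sequence[int], curr: Sequence[int]) -> float:
    """Mean absolute pixel difference between two equally sized frames.

    The metric is an assumption; the reference only speaks of a
    "change amount" between a first frame and a second frame.
    """
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    if not prev:
        return 0.0
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def control_signal(prev: Sequence[int], curr: Sequence[int], threshold: float) -> bool:
    """Return True to wake the application processor, False to keep it asleep."""
    # Assumption: "satisfies the threshold" is read as "greater than or equal to".
    return change_amount(prev, curr) >= threshold


# Hypothetical 4-pixel frames.
frame1 = [10, 10, 10, 10]
frame2 = [10, 11, 10, 10]   # almost unchanged
frame3 = [10, 90, 80, 10]   # large change

stay_asleep = control_signal(frame1, frame2, threshold=5.0)
wake_up = control_signal(frame1, frame3, threshold=5.0)
```

Under these assumptions, the second frame is forwarded for computer vision processing only when the control signal is true, which corresponds to the power-saving purpose described in the quoted paragraph.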

Carrillo [page 3] teaches Survive Recoding and Transcoding. Video surveillance can span large areas and videos captured are typically distributed over networks. Network distribution may require using different video formats or changing the video bitrate to meet the network and receiver constraints.

Carrillo [page 3] teaches (1) Provides Complete Privacy. A surveillance system should provide complete privacy by hiding portions of the video that reveal the identity of individuals. These features include faces, license tags on cars, and textual information/markings. Assuming the identifying features in a video can be detected, a surveillance system should then hide these features. A few ways of hiding such features are: (1) removing/replacing the corresponding pixels from the frame and (2) encrypt the corresponding pixels.

Carrillo [Background] teaches Obfuscation. The system presented in [1] describes a privacy preserving video console that uses a rendering face images technique in the pixel domain and leaves the face unrecognizable by identification software. Based on computer vision techniques, the video console determines the interesting components of a video and then obscures that piece of information, or its components, such that face recognition software cannot recognize the faces. With this method the privacy is maintained but the surveillance and security needs are not met due to the irreversibility of the obfuscation process…

Carrillo [Compression Independent Reversible Encryption] teaches the video is also fed to monitoring stations that play the video in realtime. The monitors use a standard video decoder to match the encoder used. The security personnel will be able to see the video and observe but with all the identifying features hidden. … The system can be configured to automatically decrypt and display the live video while keeping the identifying features encrypted in all recorded videos. When recorded video is played, all the identifying features are obscured through encryption. Since the proposed system is compression independent, this encrypted video can be played back on any standard video player such as a standard Media Player. However, when there is a legitimate need to decrypt and reveal the identifying features, for example to aid a criminal investigation, the video has to be played in a special player/security console that has the ability to decrypt the regions.

Carrillo [Conclusion] teaches this paper presents a system for encrypting selected regions in videos. The system can be used for ensuring privacy in video surveillance by hiding the identity revealing regions in the video. The encrypted video can be transcoded and or decrypted at a later time with the right decryption keys…
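
For illustration only, the idea of reversibly hiding a selected region of a frame with a key, as quoted from Carrillo above, can be sketched as follows. This is not Carrillo's algorithm: the XOR keystream built from SHA-256, the `Frame` and `Region` types, and the function names are assumptions chosen for brevity, the scheme is not a vetted cipher, and it ignores the compression independence that the reference emphasizes.

```python
import hashlib
from typing import List, Tuple

Frame = List[List[int]]             # rows of 8-bit pixel intensities (assumed layout)
Region = Tuple[int, int, int, int]  # top, left, height, width (assumed layout)


def _keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:n])


def xor_region(frame: Frame, region: Region, key: bytes) -> Frame:
    """Return a copy of the frame with the region XOR-ed against the keystream.

    XOR is its own inverse, so calling this again with the same key and
    region restores the original pixels.
    """
    top, left, height, width = region
    stream = _keystream(key, height * width)
    result = [row[:] for row in frame]
    i = 0
    for r in range(top, top + height):
        for c in range(left, left + width):
            result[r][c] = frame[r][c] ^ stream[i]
            i += 1
    return result


# Hypothetical 3x4 frame with a 2x2 "identifying" region at row 0, column 1.
frame = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]
region = (0, 1, 2, 2)
hidden = xor_region(frame, region, b"example-key")
restored = xor_region(hidden, region, b"example-key")
```

Pixels outside the region are left untouched, and only a holder of the key can restore the hidden region, which mirrors the selective, reversible hiding described in the quoted passages.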

Carrillo [Figure 4]: Original and encrypted frames from a test sequence.

Carrillo [Figure 5]: (a) Decoded and decrypted H.264 video with Quantization Parameter (QP) of 35; (b) decoded and decrypted H.264 video with QP of 26.

Boghossian, in the field of analyzing a video, teaches the limitation “and is configured to send metadata relating to tracked objects to a metadata database located on the server”. Boghossian [Abstract] teaches an apparatus is disclosed which is operative to analyze a sequence of video frames of a camera view field to track an object in said view field and determine start and end points of said track in said view field.
Boghossian [0013] teaches video image data in the sequence of video frames to reduce the behaviour or path of an object such as a person or vehicle present in the sequence of video frames to a metadata format that is lower bandwidth, for example just four simple data points, which allows for easy searching. 
Boghossian [0062] teaches the video content analysis module 26 analyses video image data to identify foreground objects such as vehicles and pedestrians in the video images and assigns to those objects attributes identifying them and describing their behavior and path in the camera view field. Such attributes may be regarded as “object metadata” since they comprise information about the objects. 
Boghossian [0065] teaches these attributes are stored in the metadata database 20 which does not store the video image data but only metadata in terms of attributes assigned to foreground objects. The video image data is stored in the video server 24.
Boghossian [0070] teaches the video content analyzer 26 is coupled to a metadata database 28 and sends metadata to the metadata database 28 for storage. Various modules operate on or use the metadata stored in the metadata database 28 to further identify behavior or track foreground objects characterized by the metadata.
Boghossian [0079] teaches the metadata record formed by the metadata attributes is then sent to the metadata database at step 126. Process flow then returns to assigning an object ID to the next object in the frame. If there are no more objects to generate metadata for the metadata records are sent to the metadata database 28 at step 102. Optionally, metadata records could be sent to the metadata database 28 at step 126 as each metadata record is completed.
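
For illustration only, the reduction of a tracked object's path to a compact metadata record that is sent to a separate metadata database, as quoted from Boghossian above, can be sketched as follows. This is a hypothetical sketch and not code from the reference: the table layout, the choice of start and end coordinates as the stored data points, and the in-memory SQLite database are all assumptions.

```python
import sqlite3
from typing import List, Tuple

Point = Tuple[int, int]
Record = Tuple[int, int, int, int, int]  # object_id, start_x, start_y, end_x, end_y


def track_to_record(object_id: int, track: List[Point]) -> Record:
    """Reduce a full track to its start and end points (assumed metadata format)."""
    if not track:
        raise ValueError("track must contain at least one point")
    (start_x, start_y), (end_x, end_y) = track[0], track[-1]
    return (object_id, start_x, start_y, end_x, end_y)


def open_metadata_db() -> sqlite3.Connection:
    """Create an in-memory database that stores metadata only, never video."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE object_metadata ("
        "object_id INTEGER PRIMARY KEY, "
        "start_x INTEGER, start_y INTEGER, end_x INTEGER, end_y INTEGER)"
    )
    return conn


def send_metadata(conn: sqlite3.Connection, record: Record) -> None:
    """Store one completed metadata record."""
    conn.execute("INSERT INTO object_metadata VALUES (?, ?, ?, ?, ?)", record)
    conn.commit()


# Hypothetical track of one object across three frames.
conn = open_metadata_db()
send_metadata(conn, track_to_record(1, [(0, 5), (3, 6), (9, 7)]))
rows = conn.execute("SELECT * FROM object_metadata").fetchall()
```

The record is far smaller than the video frames it summarizes, which corresponds to the lower-bandwidth, easily searchable metadata format described in Boghossian [0013].
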
Hence the prior art includes each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. Thus, it would have been obvious to one of ordinary skill in the art to modify the monitoring of a driver of a vehicle taught by Breed and Biemer with the teachings of John Archibald and Carrillo regarding the function of an edge layer. Boghossian teaches analyzing a sequence of video frames of a camera view field to track an object. One would have been motivated to perform this combination because it allows one to accurately capture and analyze an image of the driver. In combination, Breed is not altered in that Breed continues to detect a driver’s behavior. Biemer's teachings of using computer vision perform the same as they do separately. In combination, John Archibald is not altered in that John Archibald continues to use an application-specific integrated circuit (ASIC) based product including a computer vision system. Carrillo's teachings of obfuscating identities of objects in video streams perform the same as they do separately. Boghossian continues to analyze a video to track an object.
Therefore, one of ordinary skill in the art, such as an individual working in the field of detecting a driver’s behavior using image data, could have combined the elements as claimed by known methods, and in combination, each element merely performs the same function as it does separately. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to claim 1.

Claim 3. The combination of Breed and Biemer further teaches in which the computer vision sub-system monitors the inside of the vehicle. Breed [0311] teaches the use of multispectral imaging can be a significant aid in recognizing objects inside and outside of a vehicle.

Biemer [0009] teaches the vision system 12 includes an interior camera 22, which may be operable to capture images of the driver's head area so that the system may detect and recognize the head and face of the driver of the vehicle, as discussed below. The camera 22 may be disposed at or in the mirror head of the mirror assembly (and may be adjusted with adjustment of the mirror head) or may be disposed elsewhere within the interior cabin of the vehicle, with its field of view encompassing an area that is typically occupied by a driver of the vehicle

Claim 5. Biemer further teaches in which the computer vision sub-system monitors road conditions. Biemer [0056] teaches additional to the attention the driver pays to the traffic (measuring the eye fixation on an area, event, sign, or hint) also the ratio of signaling or intervention measure (later referred as "signaling/intervention ratio measure") in between very prudent and less prudent levels may be controlled driver specifically, driver drowsiness specifically, stress specifically (driving paired with distractions) and/or context specifically. [0059] Context related conditions may include:… [0071] lane assisted driving [0072] traffic jam driving/approaching [0073] snow condition driving [0074] icy condition driving [0075] dusty/foggy condition driving [0076] rain condition driving [0077] night condition driving [0078] off road condition driving

Claim 6. The combination of Breed and Biemer further teaches in which the computer vision sub-system monitors the proximity of other cars to the vehicle. Breed [0263] teaches an alternate method is to use a lens with a short focal length. In this case, the lens is mechanically focused, e.g., automatically, directly or indirectly, by the control circuitry 20, to determine the clearest image and thereby obtain the distance to the object.

Biemer [0097] teaches system includes an image processor operable to process image data captured by the camera or cameras, such as for detecting objects or other vehicles or pedestrians or the like in the field of view of one or more of the cameras

Claim 7. The combination of Breed and Biemer further teaches in which the computer vision sub-system monitors information in case of a collision. Breed [0085] teaches one or more of the transducers 6, 8, 10 can also be image-receiving devices, such as cameras, which take images of the interior of the passenger compartment. These images can be transmitted to a remote facility to monitor the passenger compartment or can be stored in a memory device for use in the event of an accident, i.e., to determine the status of the occupant(s) of the vehicle prior to the accident. In this manner, it can be ascertained whether the driver was falling asleep, talking on the phone, etc. See also Breed [0095].

Breed [0096] teaches the processor or processors associated with the transducers can be trained to determine the location of the life forms, either periodically or continuously or possibly only immediately before, during and after a crash. See also Breed [0127].

Biemer [0004] teaches the present invention provides a collision avoidance system or vision system or imaging system for a vehicle that utilizes one or more cameras (preferably one or more CMOS cameras) to capture image data representative of images interior and/or exterior of the vehicle,

Claim 8. Biemer further teaches in which the computer vision sub-system does not output live continuous or streaming video to an external server. 
Biemer [0014] teaches the system may store one or more reference images or key markers of the driver's face and/or eyes.

Biemer [0032] teaches the system may always store the actual classified image via the learning algorithm

Claim 9. Breed further teaches in which the computer vision sub-system outputs video to an external server only if a predetermined event occurs. Breed [0085] teaches one or more of the transducers 6, 8, 10 can also be image-receiving devices, such as cameras, which take images of the interior of the passenger compartment. These images can be transmitted to a remote facility to monitor the passenger compartment or can be stored in a memory device for use in the event of an accident, i.e., to determine the status of the occupant(s) of the vehicle prior to the accident. In this manner, it can be ascertained whether the driver was falling asleep, talking on the phone, etc.

Claim 10. Breed further teaches in which the computer vision sub-system has been trained using machine learning. Breed [0228] teaches Information relating to the space behind the driver can be obtained by processing the data obtained by the sensors 126, 127, 128 and 129, which data would be in the form of images if optical sensors are used as in the preferred embodiment. Such information can be the presence of a particular occupying item or occupant, e.g., a rear facing child seat 2 as shown in FIG. 12, as well as the location or position of occupying items… Processing of the images obtained by the sensors to determine the presence, position and/or identification of any occupants or occupying item can be effected using a pattern recognition algorithm in any of the ways discussed herein, e.g., a trained neural network

Breed [0332] teaches in basic embodiments of the inventions, wave or energy-receiving transducers are arranged in the vehicle at appropriate locations, associated algorithms are trained, if necessary depending on the particular embodiment, and function to determine whether a life form, or other object, is present in the vehicle and if so, how many life forms or objects are present. See also Breed [0349].

Breed [0369] teaches the use of trainable pattern recognition technologies such as neural networks is an important part of some of the inventions disclosed herein particularly for the automobile occupancy case, although other non-trained pattern recognition systems such as fuzzy logic, correlation, Kalman filters, and sensor fusion can also be used. These technologies are implemented using computer programs to analyze the patterns of examples to determine the differences between different categories of objects. These computer programs are derived using a set of representative data collected during the training phase, called the training set. After training, the computer programs output a computer algorithm containing the rules permitting classification of the objects of interest based on the data obtained after installation in the vehicle.

Claim 11. The combination of Breed and Biemer further teaches in which the computer vision sub-system applies feature extraction and classification to find objects of known characteristics in each video frame. Breed [0296] teaches permits easy segmentation of objects in the captured image and thus the rapid classification using, for example, a modular neural network or combination neural network system.

Breed [0316] teaches it would be possible to determine the contour of an object in the rear seat and thus using pattern recognition techniques, the classification or identification of the object. Motion of objects in the rear seat can also be determined using radar sensors.

Breed [0369] teaches the use of trainable pattern recognition technologies such as neural networks is an important part of some of the inventions disclosed herein particularly for the automobile occupancy case, although other non-trained pattern recognition systems such as fuzzy logic, correlation, Kalman filters, and sensor fusion can also be used. These technologies are implemented using computer programs to analyze the patterns of examples to determine the differences between different categories of objects. These computer programs are derived using a set of representative data collected during the training phase, called the training set. After training, the computer programs output a computer algorithm containing the rules permitting classification of the objects of interest based on the data obtained after installation in the vehicle.

Biemer [0028] teaches requires feature extracting and tracking. Methods for extracting and tracking features, such as the iris, are known.

Claim 12. The combination of Breed and Biemer further teaches in which the computer vision sub-system applies deep learning feature extraction and classification techniques to find objects of known characteristics in each video frame. Breed [0296] teaches permits easy segmentation of objects in the captured image and thus the rapid classification using, for example, a modular neural network or combination neural network system.

Breed [0316] teaches it would be possible to determine the contour of an object in the rear seat and thus using pattern recognition techniques, the classification or identification of the object. Motion of objects in the rear seat can also be determined using radar sensors.

Breed [0369] teaches the use of trainable pattern recognition technologies such as neural networks is an important part of some of the inventions disclosed herein particularly for the automobile occupancy case, although other non-trained pattern recognition systems such as fuzzy logic, correlation, Kalman filters, and sensor fusion can also be used. These technologies are implemented using computer programs to analyze the patterns of examples to determine the differences between different categories of objects. These computer programs are derived using a set of representative data collected during the training phase, called the training set. After training, the computer programs output a computer algorithm containing the rules permitting classification of the objects of interest based on the data obtained after installation in the vehicle.

Biemer [0087] teaches there may be an adaption or learning algorithm involved additionally or alternatively. The adaption or learning algorithm may use the above mentioned look up table parameter set as starting parameters. This adaption or learning algorithm may be a supervised as like Adaboost, preferably an unsupervised classifying, segmentation or clustering method as like a Markov model, a Bayesian classifier, K-Means, neural net or combinations of it or otherwise statistically knowledge based.

Claim 13. Breed further teaches in which the computer vision sub-system monitors the inside of the vehicle in real time for the purpose of providing care and protection for any passengers. Breed [0102] teaches any sensor which determines the presence and health state of an occupant can also be integrated into the vehicle interior monitoring system in accordance with the invention. For example, a sensitive motion sensor can determine whether an occupant is breathing and a chemical sensor can determine the amount of carbon dioxide, or the concentration of carbon dioxide, in the air in the passenger compartment of the vehicle which can be correlated to the health state of the occupant(s). The motion sensor and chemical sensor can be designed to have a fixed operational field situated where the occupant's mouth is most likely to be located. In this manner, detection of carbon dioxide in the fixed operational field could be used as an indication of the presence of a human occupant in order to enable the determination of the number of occupants in the vehicle. In the alternative, the motion sensor and chemical sensor can be adjustable and adapted to adjust their operational field in conjunction with a determination by an occupant position and location sensor which would determine the location of specific parts of the occupant's body, e.g., his or her chest or mouth. Furthermore, an occupant position and location sensor can be used to determine the location of the occupant's eyes and determine whether the occupant is conscious, i.e., whether his or her eyes are open or closed or moving.

Claim 14. The combination of Breed and Biemer further teaches in which the computer vision sub-system analyses of one or more of the trajectory, pose, gesture, identity of the driver or any passenger in the vehicle. Breed [0194] teaches other occupant sensing systems can also be provided that monitor the breathing or other motion of the driver, for example, including the driver's heartbeat, eye blink rate, gestures, direction or gaze and provide appropriate responses including the control of a vehicle component including any such components listed herein.

Breed [0188] teaches the position of the driver, and particularly of the driver's head, can be monitored over time and any behavior, such as a drooping head, indicative of the driver falling asleep or of being incapacitated by drugs, alcohol or illness can be detected and appropriate action taken.

Breed [0378] teaches a system for monitoring a driver includes an optical imaging system which obtains images of the driver including the driver's head and monitors the head of the driver, i.e., monitors its position and change in position, and determines whether he is paying attention to driving or is incapacitated, i.e., dead, falling asleep, sleeping, drunk, unconscious. Optical imagers or cameras may be arranged around the driver, spaced apart from one another, and to obtain images of the front of the driver and including the driver's face, of a side of the driver or of a back of the driver.

Biemer [0033] teaches the driver habits or typical actions or movements, such as typical gestures, way of looking (for example such as if the driver raises his or her eye brows in specific kinds of situations), how the driver puts his or her hands on the steering wheel, how the driver blinks (speed and closing time of the eye lids), the way the driver opens his or her eye lids, scratches his or her head, shakes his or her hair, licks his or her lips, how a double chin wobbles, how the driver chews and the like, may come into use as dynamic markers or parameters. Such markers occur over a (typically or repeatedly shown) period of time or sequence (and may be captured as a sequence of images or sequential frames of image data by the interior camera) and not just in a single frame of captured image data. The driver identification system may utilize a classification model in which the key features of a sequence are entered while the less relevant features of a sequence may diminish over consecutive times of (driver identification) learnings. Thus, the system may recognize willingly entered gestures (such as gestures for identifying the driver such as typing (in a master access code) with the hand in the air on a virtual keyboard or the like), and may utilize learning and identification of the unwilling style of acting or habitual actions or just looking to provide enhanced learning to the identification system of the present invention.

Biemer [0051] teaches the system's eye tracking may be utilized for controlling, waking up or switching devices on and off or adjusting illumination, responsive to identification of the authorized driver or user and/or to identification of a movement or position or gesture or gaze or movement of the authorized driver or user.

Claim 15. The combination of Breed and Biemer further teaches in which the computer vision sub-system infers intent of the driver or any passenger in the vehicle through an analysis of one or more of the trajectory, pose, gesture, identity of the driver or any passenger. Breed [0188] teaches the position of the driver, and particularly of the driver's head, can be monitored over time and any behavior, such as a drooping head, indicative of the driver falling asleep or of being incapacitated by drugs, alcohol or illness can be detected and appropriate action taken.

Biemer [0033] teaches the driver habits or typical actions or movements, such as typical gestures, way of looking (for example such as if the driver raises his or her eye brows in specific kinds of situations), how the driver puts his or her hands on the steering wheel, how the driver blinks (speed and closing time of the eye lids), the way the driver opens his or her eye lids, scratches his or her head, shakes his or her hair, licks his or her lips, how a double chin wobbles, how the driver chews and the like, may come into use as dynamic markers or parameters. Such markers occur over a (typically or repeatedly shown) period of time or sequence (and may be captured as a sequence of images or sequential frames of image data by the interior camera) and not just in a single frame of captured image data. The driver identification system may utilize a classification model in which the key features of a sequence are entered while the less relevant features of a sequence may diminish over consecutive times of (driver identification) learnings. Thus, the system may recognize willingly entered gestures (such as gestures for identifying the driver such as typing (in a master access code) with the hand in the air on a virtual keyboard or the like), and may utilize learning and identification of the unwilling style of acting or habitual actions or just looking to provide enhanced learning to the identification system of the present invention.

Biemer [0051] teaches the system's eye tracking may be utilized for controlling, waking up or switching devices on and off or adjusting illumination, responsive to identification of the authorized driver or user and/or to identification of a movement or position or gesture or gaze or movement of the authorized driver or user.

Claim 16. The combination of Breed and Biemer further teaches in which the computer vision sub-system detects people by extracting independent characteristics including one or more of the following: the head, head & shoulders, hands and full body, each in different orientations, to enable an individual’s head orientation, shoulder orientation and full body orientation to be independently evaluated. Breed [0045] teaches the processor may use a trained pattern recognition technique when monitoring the driver's head or part thereof, e.g., when locating the head of the driver or part of the driver's head in the obtained images.

Breed [0049] teaches monitoring the driver's head or a part thereof over time may entail training a pattern recognition technique to locate the head of the driver or other part of the driver. Monitoring the driver's head or a part thereof over time may entail determining a position of the head of the driver in images obtained at different times and analyzing the determined position of the head of the driver in the different images.

Breed [0378] teaches a system for monitoring a driver includes an optical imaging system which obtains images of the driver including the driver's head and monitors the head of the driver, i.e., monitors its position and change in position, and determines whether he is paying attention to driving or is incapacitated, i.e., dead, falling asleep, sleeping, drunk, unconscious. Optical imagers or cameras may be arranged around the driver, spaced apart from one another, and to obtain images of the front of the driver and including the driver's face, of a side of the driver or of a back of the driver.

Biemer [0004] teaches the present invention provides a collision avoidance system or vision system or imaging system for a vehicle that utilizes one or more cameras (preferably one or more CMOS cameras) to capture image data representative of images interior and/or exterior of the vehicle, and provides a driver's head detection and recognition system, which, upon detection and recognition of the driver's head and face

Claim 17. Biemer further teaches in which the computer vision sub-system uses data from multiple camera sensors, each capturing different parts of an environment, to track and show an object moving through that environment and to form a global representation that is not limited to the object when imaged from a single camera sensor. Biemer [0009] teaches referring now to the drawings and the illustrative embodiments depicted therein, a vehicle 10 includes an imaging system or vision system 12 that includes at least one exterior facing imaging sensor or camera, such as a rearward facing imaging sensor or camera 14a (and the system may optionally include multiple exterior facing imaging sensors or cameras, such as a forwardly facing camera 14b at the front (or at the windshield) of the vehicle, and a sidewardly/rearwardly facing camera 14c, 14d at respective sides of the vehicle), which captures images exterior of the vehicle, with the camera having a lens for focusing images at or onto an imaging array or imaging plane or imager of the camera (FIG. 1). The cameras may be arranged so as to point substantially or generally horizontally away from the vehicle. The lens system's vertically opening angles α may be, for example, around 180 degrees, and the horizontally opening angles β may be, for example, around 180 degrees, such as shown in FIG. 2.

Claim 18. The combination of Breed and Biemer further teaches in which the driver behaviour sub-system is configured to gather information on how rapid or measured the vehicle's acceleration is. Breed [0389] teaches the trained pattern recognition technique applied by the processor may be trained on all measures of occupant behavior that might indicate driver attentiveness or lack thereof while driving, i.e., whether the vehicle is moving or not. For example, vehicle parameters may also be analyzed such as acceleration, steering wheel angle, angular motion of the vehicle, etc.

Biemer [0010] teaches some of these types of systems may determine driver drowsiness or attentiveness by observing the time intervals the acceleration paddle position is changed by the driver's foot, while some systems may determine driver drowsiness or attentiveness by observing the time intervals between changes in the steering wheel position made by the driver, and some systems may determine driver drowsiness or attentiveness by monitoring the closing time and repetition rate of the driver's eye lids and eye movement.

(Although not relied on, Mimar [0110] teaches FIG. 6 embodiment of present invention includes an accelerometer and GPS, using which SoC calculates the current speed and acceleration data and continuously stores it together with audio-video data for viewing at a later time.)

Claim 19. Breed further teaches in which the driver behaviour sub-system is configured to gather information on how harsh or smooth the vehicle's braking is. Breed [0389] teaches the trained pattern recognition technique applied by the processor may be trained on all measures of occupant behavior that might indicate driver attentiveness or lack thereof while driving, i.e., whether the vehicle is moving or not. For example, vehicle parameters may also be analyzed such as acceleration, steering wheel angle, angular motion of the vehicle, etc.

Claim 20.  Breed further teaches in which the driver behaviour sub-system is configured to gather information on how hard or gentle the vehicle's cornering is. Breed [0389] teaches the trained pattern recognition technique applied by the processor may be trained on all measures of occupant behavior that might indicate driver attentiveness or lack thereof while driving, i.e., whether the vehicle is moving or not. For example, vehicle parameters may also be analyzed such as acceleration, steering wheel angle, angular motion of the vehicle, etc.

Claim 21. Breed further teaches in which the driver behaviour sub-system is configured to enable a driver profile or driver rating to be generated. Breed [0382] teaches other training of the processor or pattern recognition technique used thereby can involve motion statistics that lead to an expectation as to what a particular driver does when he is alert. If the driver passes the test, then the thresholds can be modified. In particular, as a person begins to fall asleep, he can execute some jerking motions or other telltale motions that will be different from his normal alert behavior.

Claim 23. Breed further teaches in which the driver behaviour sub-system is configured to gather information on passenger behavior. Breed [0102] teaches any sensor which determines the presence and health state of an occupant can also be integrated into the vehicle interior monitoring system in accordance with the invention. For example, a sensitive motion sensor can determine whether an occupant is breathing and a chemical sensor can determine the amount of carbon dioxide, or the concentration of carbon dioxide, in the air in the passenger compartment of the vehicle which can be correlated to the health state of the occupant(s). The motion sensor and chemical sensor can be designed to have a fixed operational field situated where the occupant's mouth is most likely to be located. In this manner, detection of carbon dioxide in the fixed operational field could be used as an indication of the presence of a human occupant in order to enable the determination of the number of occupants in the vehicle. In the alternative, the motion sensor and chemical sensor can be adjustable and adapted to adjust their operational field in conjunction with a determination by an occupant position and location sensor which would determine the location of specific parts of the occupant's body, e.g., his or her chest or mouth. Furthermore, an occupant position and location sensor can be used to determine the location of the occupant's eyes and determine whether the occupant is conscious, i.e., whether his or her eyes are open or closed or moving.

Claim 24. The combination of Breed and Biemer further teaches in which the rear-facing camera is positioned on or in relation to the internal vehicle rear view mirror. Breed [0120] teaches an infrared receiver 56 is located attached to the rear view mirror assembly 55, as shown in FIG. 8E.

Biemer [0009] teaches the vision system 12 includes a control or electronic control unit (ECU) or processor 18 that is operable to process image data captured by the cameras and may provide displayed images at a display device 16 for viewing by the driver of the vehicle (although shown in FIG. 1 as being part of or incorporated in or at an interior rearview mirror assembly 20 of the vehicle

Claim 28. Breed further teaches in which the computer vision sub-system generates a digital representation that relates to one or more of: animals, pets, inanimate objects, dynamic or moving objects, moving vehicles. Breed [0323] teaches exploiting this phenomena, it is possible to detect the presence of an adult, child, baby or pet that is in the field of the detection circuit.

Claim 29. It differs from claim 1 in that it is a passenger monitoring system instead of the driver monitoring system of claim 1. Therefore, claim 29 has been analyzed and reviewed in the same way as claim 1. See the above analysis and Breed [0227].

Claim 30. It differs from claim 1 in that it is a method performed by the system of claim 1. Therefore, claim 30 has been analyzed and reviewed in the same way as claim 1. See the above analysis.

Claim 31. (New) The driver monitoring system of Claim 1, including an ASIC embedded in a sensor, in which the at least some of the edge layer is configured to process raw sensor data at the ASIC embedded in the sensor.  John Archibald [0320] teaches the method 3100 of FIG. 31 may be implemented by a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), a processing unit such as a central processing unit (CPU), a digital signal processor (DSP), a controller, another hardware device, firmware device, or any combination thereof. As an example, the method 3100 of FIG. 31 can be performed by a processor that executes instructions.

Claim 32. (New) The driver monitoring system of Claim 1, including an ASIC at a gateway, in which the at least some of the edge layer is configured to process raw sensor data or video data at the ASIC at the gateway. John Archibald [0320] teaches the method 3100 of FIG. 31 may be implemented by a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), a processing unit such as a central processing unit (CPU), a digital signal processor (DSP), a controller, another hardware device, firmware device, or any combination thereof. As an example, the method 3100 of FIG. 31 can be performed by a processor that executes instructions.

Claim 33. (New) The driver monitoring system of Claim 1, including an ASIC at a hub, in which the at least some of the edge layer is configured to process raw sensor data or video data at the ASIC at the hub. John Archibald [0320] teaches the method 3100 of FIG. 31 may be implemented by a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC), a processing unit such as a central processing unit (CPU), a digital signal processor (DSP), a controller, another hardware device, firmware device, or any combination thereof. As an example, the method 3100 of FIG. 31 can be performed by a processor that executes instructions.

Claim 34. (New) The driver monitoring system of Claim 1, in which the at least some of the edge layer is configured to (a) generate from a pixel stream a digital representation of a person or other object. John Archibald [0009] teaches as a non-limiting example of computer vision application-specific processing, the application processor may extract a feature (e.g., a visual descriptor) or a set of features as needed, instead of all at once. For example, the application processor may extract a first subset of features (e.g., visual descriptors) of the second frame to identify the context (e.g., a location) of the second frame. See also John Archibald [0010-0011].

John Archibald [0060] teaches the video stream may be subject to application-specific processing (e.g., computer vision application processing). For example, in the particular illustrative embodiment, the video stream may be subject to a hand recognition application (e.g., subject to processing that detects whether a hand is in a field of view). However, in other embodiments, the video stream may be subject to other computer vision applications. For example, the video stream may be subject to security applications (e.g., surveillance, intrusion detection, object detection, facial recognition, etc.), environmental-use applications (e.g., lighting control), object detection and tracking applications, etc.

John Archibald [0064] teaches a high-power device (not shown in FIG. 1) within the processing system may perform application-specific processing on particular frames in the video stream. For example, in the particular illustrative embodiment, the application-specific processing may include determining whether a particular object (e.g., a hand) is in the selected frame. In other embodiments, the application-specific processing may include determining whether an alert event is triggered. An alert event may correspond to a change in condition between frames. As illustrative, non-limiting examples, an alert event may correspond to a patient falling out of a bed, a house intrusion, a car pulling into a driveway, a person walking through a door, etc.

John Archibald [0338] teaches the method 3300 of FIG. 33 may be implemented by a field-programmable gate array (FPGA) device, an application-specific integrated circuit (ASIC)…

John Archibald [0384] teaches the processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal.

and (b) determine attributes or characteristics of the person or object from that digital representation. John Archibald [0009] teaches as a non-limiting example of computer vision application-specific processing, the application processor may extract a feature (e.g., a visual descriptor) or a set of features as needed, instead of all at once.

John Archibald [0062] teaches a low-power device (not shown in FIG. 1) within a processing system may generate image statistics for each frame 102-106 in the video stream after each frame 102-106 is captured. For example, the low-power device may generate image statistics for each frame 102-106 based on the corresponding pixel representations 112-116.

John Archibald [0164] teaches memory 1408 may include threshold data 1434, selection criteria 1438, classification model(s) 1428, user preference(s) 1422, timestamp(s) 1494, feature set data 1432, context data 1490, intermediate context data 1492, an application 1444, or any combination thereof. In a particular embodiment, at least a portion of the memory 1408 may correspond to the memory 306 of FIG. 3, the memory 420 of FIG. 4, the memory 520 of FIG. 5, the DDR 914 of FIG. 9, or a combination thereof. As used herein, a "context" of an image may include information inferred or determined from the image and/or feature(s) extracted therefrom. For example, a context of an image may include a specific location, a specific person, a specific object, a specific activity, or any combination thereof. To illustrate, a context of an image of a particular person attending a meeting in a particular conference room may include the particular person (e.g., "John"), the activity of attending the meeting (e.g., "attended group meeting"), the particular conference room (e.g., "Room 2.134"), objects in the conference room (e.g., "whiteboard"), or any combination thereof.

Carrillo [page 3] teaches (1) Provides Complete Privacy. A surveillance system should provide complete privacy by hiding portions of the video that reveal the identity of individuals. These features include faces, license tags on cars, and textual information/markings. Assuming the identifying features in a video can be detected, a surveillance system should then hide these features. A few ways of hiding such features are: (1) removing/replacing the corresponding pixels from the frame and (2) encrypt the corresponding pixels.

and (c) enable one or more networked devices or sensors to be controlled. John Archibald [0011-0014] teaches in another particular embodiment, an apparatus includes means for generating a control signal based on a change amount between first sensor data captured by a sensor and second sensor data captured by the sensor, where the means for generating the control signal is included in a first processing path. The apparatus also includes means for performing computer vision application-specific processing on the second sensor data based on the control signal, where the means for performing the computer vision application-specific processing is included in a second processing path. 

John Archibald [0096] teaches the control signal 416 may be provided to the application processor 418. The control signal 416 may indicate whether to "wake up" the application processor 418 to perform application-specific processing (e.g., computer vision application-specific processing) on the second frame 104. For example, if the change detection circuit 414 determines that the change amount between the first frame 102 and the second frame 104 does not satisfy a threshold, the control signal 416 may keep the application processor 418 in a "sleep" state to conserve power. If the change detection circuit 414 determines that the change amount between the first frame 102 and the second frame 104 satisfies the threshold, the control signal 416 may wake up the application processor 418 to perform application-specific processing on the second frame 104. Thus, the change detection circuit 414 may also provide sensor data 430 (e.g., the second frame 104) to the application processor 418 for computer vision application-specific processing.

Carrillo [page 3] teaches Survive Recoding and Transcoding. Video surveillance can span large areas and videos captured are typically distributed over networks. Network distribution may require using a different video formats or changing the video bitrate to meet the network and receiver constraints.

Claim 35. (New) The driver monitoring system of Claim 1, in which the at least some of the edge layer is configured to detect multiple people in a scene and to continuously track or detect one or more of their: trajectory, pose, gesture, identity. John Archibald [0174] teaches the first image statistics and the second image statistics may be generated based on application-specific processing. The application-specific processing may include determining whether sensory data (e.g., the first sensory data 1470 and/or second sensory data corresponding to the first frame 102) indicates that a particular object (e.g., a traffic stop sign) is in a corresponding image, indicates that an alert event is triggered (e.g., a particular gesture is detected), indicates that an object of a particular color is in the image, or a combination thereof. The application-specific processing may include at least one of activity recognition, person recognition, object recognition, location recognition, or gesture recognition.

John Archibald [0183] teaches the feature set classifier 1418 may generate a first classified subset of features (e.g., classified subset(s) of features 1474) by classifying the first clustered subset of features (e.g., the clustered subset(s) of features 1476) based on a first classification model (e.g., the classification model(s) 1428). The first classification model may indicate that the first clustered subset of features corresponds to a specific location, a specific person, a specific object, a specific activity, or any combination thereof. The first classification model may also indicate confidence levels associated with the correspondence. As a result, the feature set classifier 1418 may generate the first classified subset of features indicating that the first clustered subset of features corresponds to the location, the person, the object, the activity, or any combination thereof.

Claim 36. (New) The driver monitoring system of Claim 1, in which the at least some of the edge layer is configured to infer or describe a person's behaviour or intent by analysing one or more of the trajectory, pose, gesture, identity of that person. John Archibald [0174] teaches the first image statistics and the second image statistics may be generated based on application-specific processing. The application-specific processing may include determining whether sensory data (e.g., the first sensory data 1470 and/or second sensory data corresponding to the first frame 102) indicates that a particular object (e.g., a traffic stop sign) is in a corresponding image, indicates that an alert event is triggered (e.g., a particular gesture is detected), indicates that an object of a particular color is in the image, or a combination thereof. The application-specific processing may include at least one of activity recognition, person recognition, object recognition, location recognition, or gesture recognition.

John Archibald [0183] teaches the feature set classifier 1418 may generate a first classified subset of features (e.g., classified subset(s) of features 1474) by classifying the first clustered subset of features (e.g., the clustered subset(s) of features 1476) based on a first classification model (e.g., the classification model(s) 1428). The first classification model may indicate that the first clustered subset of features corresponds to a specific location, a specific person, a specific object, a specific activity, or any combination thereof. The first classification model may also indicate confidence levels associated with the correspondence. As a result, the feature set classifier 1418 may generate the first classified subset of features indicating that the first clustered subset of features corresponds to the location, the person, the object, the activity, or any combination thereof.

Claim 37. It differs from claim 1 in that it is a road vehicle including the system of claim 1. Therefore, claim 37 has been analyzed and reviewed in the same way as claim 1. See the above analysis.

Claims 4 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over US 2009/0261979 A1 to Breed et al., hereinafter “Breed,” in view of US 2015/0009010 A1 to Biemer, US 2014/0368626 A1 to John Archibald et al., hereinafter “John Archibald,” in view of Compression Independent Reversible Encryption for Privacy in Video Surveillance to Carrillo et al., hereinafter “Carrillo,” and US 2016/0065906 A1 to Boghossian et al., hereinafter “Boghossian,” and in further view of US 2014/0139655 A1 to Mimar.
Claim 4. The driver monitoring system of Claim 1 in which the computer vision sub-system monitors the inside of the vehicle in real time for the purpose of providing care and protection for the driver. Breed [0102] teaches any sensor which determines the presence and health state of an occupant can also be integrated into the vehicle interior monitoring system in accordance with the invention. For example, a sensitive motion sensor can determine whether an occupant is breathing and a chemical sensor can determine the amount of carbon dioxide, or the concentration of carbon dioxide, in the air in the passenger compartment of the vehicle which can be correlated to the health state of the occupant(s). The motion sensor and chemical sensor can be designed to have a fixed operational field situated where the occupant's mouth is most likely to be located. In this manner, detection of carbon dioxide in the fixed operational field could be used as an indication of the presence of a human occupant in order to enable the determination of the number of occupants in the vehicle. In the alternative, the motion sensor and chemical sensor can be adjustable and adapted to adjust their operational field in conjunction with a determination by an occupant position and location sensor which would determine the location of specific parts of the occupant's body, e.g., his or her chest or mouth. Furthermore, an occupant position and location sensor can be used to determine the location of the occupant's eyes and determine whether the occupant is conscious, i.e., whether his or her eyes are open or closed or moving.

Breed fails to explicitly teach monitoring in real time. However, Mimar [0023], in the field of detecting driver behavior, teaches the goal of this endeavor is to provide the vehicle with the capacity to assess in real-time the visual behavior of the driver.

Mimar [0216] teaches the present system calculates the face gaze direction and level of eyes closed at least 20 times per second, and later systems will increase this to real-time at 30 frames-per-second (fps).

Hence the prior art includes each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. Thus, it would have been obvious to one of ordinary skill in the art to modify the monitoring of a driver of a vehicle taught by Breed and Biemer with Mimar’s teaching of acquiring image data in real time. One would have been motivated to make this combination because it allows an image of the driver to be accurately captured and analyzed. In combination, Breed is not altered in that Breed continues to detect a driver’s behavior. Biemer's teachings of using computer vision perform the same as they do separately. In combination, John Archibald is not altered in that John Archibald continues to use an application-specific integrated circuit (ASIC) based product including a computer vision system. Carrillo's teachings of obfuscating identities of objects in video streams perform the same as they do separately. Mimar continues to use image data to assess the driver’s behavior in real time.
Therefore one of ordinary skill in the art, such as an individual working in the field of detecting a driver’s behavior using image data could have combined the elements as claimed by known methods, and that in combination, each element merely performs the same function as it does separately. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to claim 4.

Claim 22. Mimar further teaches in which the driver behaviour sub-system is configured to gather information on how long the driver has been driving the vehicle. Mimar [0162] teaches based on the total driving time after the last stop, the driver will be tired, and the total distraction condition can be reduced accordingly, for example, for every additional hour after 4 hours of non-stop driving, the total distraction distance can be reduced by 5 percent, as shown by 3002 and 3005.
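
For illustration only, the arithmetic quoted from Mimar [0162] can be sketched as follows. This is not Mimar's implementation. The function name, the parameter names, the linear (non-compounding) reading of "5 percent for every additional hour", and the counting of whole additional hours only are all assumptions made for this sketch; the cited paragraph does not settle any of them.

```python
import math


def reduced_distraction_distance(base_distance, nonstop_hours,
                                 threshold_hours=4.0, reduction_per_hour=0.05):
    """Hypothetical helper: scale a distraction distance by non-stop driving time.

    Assumptions (not taken from Mimar):
      * the 5 percent reduction is applied linearly, not compounded;
      * only whole hours beyond the 4-hour threshold are counted;
      * the result is never allowed to go below zero.
    """
    # Whole hours driven beyond the threshold; zero if still under it.
    extra_hours = max(0, math.floor(nonstop_hours - threshold_hours))
    # Linear reduction factor, clamped at zero.
    factor = max(0.0, 1.0 - reduction_per_hour * extra_hours)
    return base_distance * factor
```

Under these assumptions, a driver at or below 4 hours of non-stop driving keeps the full distance, and each further whole hour removes 5 percent of the original value.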

Claim 25. The combination of Breed and Mimar further teaches in which the computer vision sub-system performs real-time virtualisation of a scene, generating a virtualised or digital representation that defines the appearance of a generalized driver or any passenger, and not the specific driver or any passenger, in which a person is represented as one of the following: a standardised shape, a flat or 2-dimensional shape including head, body, arms and legs, a symbolic or simplified representation of a person, or an avatar. Breed [0045] teaches the processor may use a trained pattern recognition technique when monitoring the driver's head or part thereof, e.g., when locating the head of the driver or part of the driver's head in the obtained images.

Breed [0049] teaches monitoring the driver's head or a part thereof over time may entail training a pattern recognition technique to locate the head of the driver or other part of the driver. Monitoring the driver's head or a part thereof over time may entail determining a position of the head of the driver in images obtained at different times and analyzing the determined position of the head of the driver in the different images.

Breed [0378] teaches a system for monitoring a driver includes an optical imaging system which obtains images of the driver including the driver's head and monitors the head of the driver, i.e., monitors its position and change in position, and determines whether he is paying attention to driving or is incapacitated, i.e., dead, falling asleep, sleeping, drunk, unconscious. Optical imagers or cameras may be arranged around the driver, spaced apart from one another, and to obtain images of the front of the driver and including the driver's face, of a side of the driver or of a back of the driver.

Mimar [0023] teaches the goal of this endeavor is to provide the vehicle with the capacity to assess in real-time the visual behavior of the driver.

Mimar [0216] teaches the present system calculates the face gaze direction and level of eyes closed at least 20 times per second, and later systems will increase this to real-time at 30 frames-per-second (fps).

Claims 25-27 are rejected under 35 U.S.C. 103 as being unpatentable over US 2009/0261979 A1 to Breed et al., hereinafter, “Breed”, in view of US 2015/0009010 A1 to Biemer, hereinafter, “Biemer”, in view of US 2014/0368626 A1 to John Archibald et al., hereinafter, “John Archibald”, in view of Compression Independent Reversible Encryption for Privacy in Video Surveillance to Carrillo et al., hereinafter, “Carrillo”, in view of US 2016/0065906 A1 to Boghossian et al., hereinafter, “Boghossian”, in further view of Mixed Reality Participants in Smart Meeting Rooms and Smart Home Environments to Nijholt et al., hereinafter, “Nijholt”.

Claim 26. The driver monitoring system of Claim 1 in which the computer vision sub-system generates a digital representation in which symbolic or simplified representations of different people are distinguished using different colours. Biemer [0098] teaches the vehicle may include any type of sensor or sensors, such as imaging sensors or radar sensors or lidar sensors or ladar sensors or ultrasonic sensors or the like…The imaging array may capture color image data, such as via spectral filtering at the array, such as via an RGB (red, green and blue) filter or via a red/red complement filter or such as via an RCC (red, clear, clear) filter or the like. The logic and control circuit of the imaging sensor may function in any known manner, and the image processing and algorithmic processing may comprise any suitable means for processing the images and/or image data.

While Biemer teaches color images, Biemer fails to explicitly teach that representations of different people are distinguished using different colours. Nijholt, in the field of smart environments, teaches this feature; see Nijholt [Fig. 1].

Nijholt [Capturing Meeting Activity] teaches this allows us to display animated representations of meeting participants in a (3D) virtual reality environment. The 3D positions of head, elbows and hands can reasonably be calculated [12]. 3D technology based upon portable standards, like VRML/X3D and H-Anim avatars is used.

Nijholt [Fig. 3]

Hence the prior art includes each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. Thus, it would have been obvious to one of ordinary skill in the art to modify monitoring a driver of a vehicle by Breed and Biemer with Nijholt’s teaching that representations of different people are distinguished using different colours. One would have been motivated to perform this combination due to the fact that it allows one to accurately distinguish the target of interest, i.e., the driver. In combination, Breed is not altered in that Breed continues to detect a driver’s behavior. Biemer's teachings of using computer vision perform the same as they do separately. In combination, John Archibald is not altered in that John Archibald continues to use an application-specific integrated circuit (ASIC) based product including a computer vision system. Carrillo's teachings of obfuscating identities of objects in video streams perform the same as they do separately. Nijholt continues to use different colors to distinguish objects in image data.
Therefore, one of ordinary skill in the art, such as an individual working in the field of detecting a driver’s behavior using image data, could have combined the elements as claimed by known methods, and in combination, each element merely performs the same function as it does separately. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to claim 26.

Claim 27. Nijholt further teaches in which the computer vision sub-system is switchable between (i) a first mode in which it generates a digital representation that is not a photographic image or video image and does not enable a photographic or video image of a person to be created from which that person can be recognised and (ii) a second mode in which a photographic image or video image is generated. Nijholt [5.2 VMR for distributed meeting assistance] teaches the same hold true for the other participants: they will see the remote person at his or her virtual position, making the movements and gestures of the real person. The technology is based upon simple consumer web cams, together with image recognition technology that extracts key features, like body position and gestures. This process is illustrated in Fig. 3.

Nijholt [4.2 Visualizing meetings and meeting events] teaches there is a level of available speech and image processing techniques that allows us to map captured (through microphones and cameras) meeting events (verbal and nonverbal interaction, identifying participants, and tracking of participants in the meeting environment) to multimedia online and off-line presentations of these events.

Nijholt [Capturing Meeting Activity] teaches this allows us to display animated representations of meeting participants in a (3D) virtual reality environment. The 3D positions of head, elbows and hands can reasonably be calculated [12]. 3D technology based upon portable standards, like VRML/X3D and H-Anim avatars is used. For some meetings to be recorded electromagnetic sensors were mounted on the heads of the participants for tracking their head movements. Especially in meetings this allows us to record and real-time display head orientations of the represented meeting participants. Although there can be differences in head orientation and gaze direction, it nevertheless allows a sufficiently realistic representation of focus of attention behavior (addressing persons, looking at a speaker, looking at notes or looking at the white board in the meeting room).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571)272-1681.  The examiner can normally be reached on 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/DELOMIA L GILLIARD/Primary Examiner, Art Unit 2661