Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because it contains an embedded hyperlink and/or other form of browser-executable code. Applicant is required to delete the embedded hyperlink and/or other form of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.
There is a hyperlink in paragraph 0059 of the published application, US PGPub: US 2020/0210688 A1, Jul. 2, 2020.
Applicant is requested to remove the hyperlink from the specification in response to this Office action.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
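A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
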
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 5 – 7, 9, 11, 16 – 19, 22, 23 and 35 are rejected under 35 U.S.C. 103 as being unpatentable over
Rodenas US PGPub: US 2018/0300557 A1 Oct. 18, 2018 in view of
Tsai US PGPub: US 2002/0076088 A1 Jun. 20, 2002.

Regarding claims 1, 18, Rodenas discloses,

a method of recognising human characteristics from image data of a subject and a system for recognising human characteristics from image data of a subject, said system comprising an input unit (input device - Fig. 10/1012), an output unit (output device 1008 – a display, printer or speaker – Fig. 10/1006, paragraph 0090), a processor (processor 1002 – Fig. 10/1002) and memory (memory 1004 – Fig. 10/1004), wherein said memory has stored thereon processor executable instructions which when executed on the processor (the device can include many types of memory, data storage or computer-readable media, such as a first data storage for program instructions for execution by the at least one processor 1002 – paragraph 0085) control the processor to (object analysis in live video content, where frames of video data from a surveillance system can be analyzed in near real time to allow for action to be taken based on the analysis. Each frame to be analyzed can be processed using at least one recognition algorithm to detect objects of interest, which can also be compared against corresponding data from earlier frames to determine relevant behaviors, moods, actions, or patterns of use. Each determination can have a corresponding confidence value. Information about the determinations and confidence levels can be analyzed to determine whether an action should be taken, as well as the type of action to take – ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079. The monitoring platform 316 provides the appropriate notifications or instructions, whether to the base station 302, the device 328 of a security person on site or a third party security provider 324, such as a security company or police department, that indicates an action to be taken in response to a particular alert or notification – paragraph 0028), said method comprising: 
extracting a sequence of images of the subject from the image data (extract and analyze a set of facial features, as well as objects or states such as a presence of glasses, sunglasses, open eyes, a smile, an open mouth, a mustache, a beard, etc., - paragraph 0024. Video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083); 
from each image estimating an emotion feature metric (video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083) and a facial mid-level feature metric for the subject (extract and analyze a set of facial features, as well as objects or states such as a presence of glasses, sunglasses, open eyes, a smile, an open mouth, a mustache, a beard, etc., - paragraph 0024); 

for each image, combining the associated estimated emotion metric and estimated facial mid-level feature metric to form a feature vector, thereby forming a sequence of feature vectors (video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083), each feature vector associated with an image of the sequence of images (extract and analyze a set of facial features, as well as objects or states such as a presence of glasses, sunglasses, open eyes, a smile, an open mouth, a mustache, a beard, etc., - paragraph 0024); and 

inputting the sequence of feature vectors to a human characteristic recognising network, wherein said human characteristic recognizing network is adapted to process the sequence of feature vectors and generate output data corresponding to at least one human characteristic derived from the sequence of feature vectors (machine learning or other training approaches can be used to improve the accuracy of the determinations over time, and help to identify behaviors, actions, emotions, or occurrences that should be identified, associated with different threat levels or scores, that lead to specific actions, etc. - Fig. 7B, paragraph 0079),

but does not disclose that the human characteristic recognising network is a "neural network".

Tsai teaches, a method of multi-level facial image recognition, where the sub-images are passed through self-organizing map neural networks for performing a non-supervisory classification learning (ABSTRACT, Figs. 6 – 8, paragraphs 0007, 0022).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the object analysis in live video content of Rodenas (Rodenas, ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079) to incorporate the multi-level facial image recognition method of Tsai, in which the sub-images are passed through self-organizing map neural networks for performing a non-supervisory classification learning (Tsai, ABSTRACT, Figs. 6 – 8, paragraphs 0007, 0022), in order to reduce the amount of data to be compared in the recognition process and thereby greatly increase the recognition speed (Tsai, paragraph 0006).
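For illustration only, the data flow recited in claims 1 and 18 (a per-image emotion metric and facial mid-level feature metric combined into one feature vector per image, with the resulting sequence passed to a recognising network) can be sketched as below. This is a minimal sketch and is not taken from the application, Rodenas or Tsai: the function names, the vector sizes, the stand-in estimators and the averaging "network" are all hypothetical assumptions chosen to keep the example runnable.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def build_feature_sequence(
    images: Sequence[object],
    estimate_emotion: Callable[[object], Vector],
    estimate_mid_level: Callable[[object], Vector],
) -> List[Vector]:
    """Return one combined feature vector per image, in image order."""
    sequence: List[Vector] = []
    for image in images:
        emotion = estimate_emotion(image)      # emotion feature metric
        mid_level = estimate_mid_level(image)  # e.g. gaze, head position
        # Combining is done here by simple concatenation (an assumption).
        sequence.append(list(emotion) + list(mid_level))
    return sequence


def recognise_characteristics(
    sequence: List[Vector],
    network: Callable[[List[Vector]], Vector],
) -> Vector:
    """Pass the whole sequence of feature vectors to the recognising network."""
    return network(sequence)


# Hypothetical stand-ins so the sketch runs without any trained model.
def fake_emotion(image: object) -> Vector:
    return [0.7, 0.3]


def fake_mid_level(image: object) -> Vector:
    return [0.1, 0.0, 1.0]


def mean_network(sequence: List[Vector]) -> Vector:
    # Placeholder for a trained network: averages each component over time.
    count = len(sequence)
    return [sum(column) / count for column in zip(*sequence)]


frames = ["frame0", "frame1", "frame2"]
features = build_feature_sequence(frames, fake_emotion, fake_mid_level)
output = recognise_characteristics(features, mean_network)
```

In this sketch each image contributes exactly one vector, so the order of the feature vectors follows the order of the extracted images; a trained recurrent or convolutional network would take the place of the placeholder `mean_network`.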

Regarding claims 2, 19, Rodenas discloses,

a method according to claim 1, wherein the image data is video data, the extracted sequence of images are facial images of a face of the subject, and the face of the subject is a human face (extract and analyze a set of facial features, as well as objects or states such as a presence of glasses, sunglasses, open eyes, a smile, an open mouth, a mustache, a beard, etc., - paragraph 0024. Video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083).

Regarding claims 5, 22, Rodenas discloses,

a method according to claim 2, wherein the emotion metric is estimated by an emotion recognising neural network trained to recognise a plurality of predetermined emotions from images of human faces (video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083).

Regarding claims 6, 23, Rodenas discloses,

a method according to claim 5, wherein the emotion metric is associated with a human emotion of one or more of anger (anger - Fig. 2B, paragraph 0023), contempt, disgust, fear (a mood of fear, surprise or apprehension – Fig. 2C, paragraph 0023), happiness (video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083), sadness and surprise. 

Regarding claim 7, Rodenas discloses,

a method according to claim 5, comprising outputting by the emotion recognising neural network an n-dimensional vector (the base station can send selected frames of video – either all or a subset of captured frames – paragraph 0026), wherein each component of the vector corresponds to one of the predetermined emotions, and a magnitude of each component of the vector corresponds to a confidence with which the emotion recognising neural network has recognised the emotion (a monitoring console can provide information such as the overall mood of the people in a location, any changes in the overall mood, an indication of people with substantially different or suspicious moods, and the like – paragraph 0018. The result can be any appropriate result data in any appropriate format as discussed elsewhere herein, as may include information about a type of object of interest, a threat level, a confidence level, location data, and the like – paragraph 0082).
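For illustration only, the n-dimensional output recited in claim 7, in which each component corresponds to one predetermined emotion and its magnitude to a recognition confidence, can be sketched as below. The emotion names are those listed in claim 6; the use of a softmax to turn raw scores into confidences, the function names and the sample scores are assumptions made for the example and are not drawn from the application or from Rodenas.

```python
import math
from typing import Dict, List

# Emotion labels as listed in claim 6; the ordering is an assumption.
EMOTIONS = [
    "anger", "contempt", "disgust", "fear",
    "happiness", "sadness", "surprise",
]


def softmax(scores: List[float]) -> List[float]:
    """Map raw scores to non-negative values that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def emotion_vector(raw_scores: List[float]) -> List[float]:
    """Return an n-dimensional confidence vector, one component per emotion."""
    if len(raw_scores) != len(EMOTIONS):
        raise ValueError("expected one raw score per predetermined emotion")
    return softmax(raw_scores)


def label(vector: List[float]) -> Dict[str, float]:
    """Pair each component of the vector with its emotion name."""
    return dict(zip(EMOTIONS, vector))


# Hypothetical raw scores for a single facial image.
vector = emotion_vector([0.2, -1.0, -0.5, 0.1, 2.5, -0.3, 0.4])
labelled = label(vector)
top_emotion = max(labelled, key=labelled.get)
```

Here n is 7, and the largest component identifies the emotion recognised with the highest confidence for that image.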

Regarding claim 9, Rodenas discloses,

a method according to claim 1, wherein the facial mid-level feature metric of the human face is estimated based on an image recognition algorithm, and the facial mid-level feature metric is one or more of gaze, head position and eye closure (any appropriate detectable features or objects, as may relate to expressions, poses, glasses, weapons, merchandise, and the like – paragraph 0027. Feature detection, object detection, or facial analysis can be used to determine various aspects of a person's face, body language, or movements. For facial analysis, for example, the relative locations and shapes of things like a person's lips, eyebrows, eyelids, and other such features can be indicative of the mood or sentiment of a user – Figs. 2A – 2E, paragraph 0023).
Regarding claim 11, Rodenas discloses,
a method according to claim 1, wherein the human characteristic recognising neural network is trained from video data classified to contain human faces associated with one or more of the plurality of the predetermined human characteristics (extract and analyze a set of facial features, as well as objects or states such as a presence of glasses, sunglasses, open eyes, a smile, an open mouth, a mustache, a beard, etc., - paragraph 0024. Video data can be captured and analyzed to determine an overall mood of those viewers, as well as how many were happy or angry, or had other specific emotions with respect to the content - Figs. 2A – 2E, 9/906 - 914, paragraphs 0022, 0083. Each frame to be analyzed can then be processed using at least one recognition algorithm trained or configured to detect objects, people, faces, expressions, features, and the like – paragraph 0014).  
Regarding claim 16, Rodenas discloses,
a method according to claim 1, wherein the output data of the human characteristic recognising neural network (video monitoring device 308 – Fig. 3/308, paragraph 0026) comprises an n-dimensional vector (the base station can send selected frames of video – either all or a subset of captured frames – paragraph 0026), wherein each component of the vector corresponds to a human characteristic, and a magnitude of each component of the vector corresponds to an intensity with which that characteristic is detected (a monitoring console can provide information such as the overall mood of the people in a location, any changes in the overall mood, an indication of people with substantially different or suspicious moods, and the like – paragraph 0018. The result can be any appropriate result data in any appropriate format as discussed elsewhere herein, as may include information about a type of object of interest, a threat level, a confidence level, location data, and the like – paragraph 0082).
Regarding claim 17, Rodenas discloses,
a method according to claim 1, wherein the plurality of predetermined characteristics includes one or more of passion, confidence (each determination can have a corresponding confidence value – ABSTRACT. The suspicion score, which also come with an associated confidence score, can be compared against one or more thresholds to determine an action to be taken – paragraph 0025. The result can be any appropriate result data in any appropriate format as discussed elsewhere herein, as may include information about a type of object of interest, a threat level, a confidence level, location data, and the like – paragraph 0082), honesty, nervousness (a person is detected to be unusually angry or nervous – paragraphs 0021, 0024), curiosity, judgment and disagreement.  
Regarding claim 35, Rodenas discloses,
a non-transitory computer readable storage medium (memory 1004 – Fig. 10/1004), comprising computer readable instructions stored thereon, wherein the computer readable instructions, when executed on a suitable computer processor (processor 1002 – Fig. 10/1002), control the computer processor to perform a method according to claim 1 (the device can include many types of memory, data storage or computer-readable media, such as a first data storage for program instructions for execution by the at least one processor 1002 – paragraph 0085).
Claims 12 – 15 are rejected under 35 U.S.C. 103 as being unpatentable over
Rodenas US PGPub: US 2018/0300557 A1 Oct. 18, 2018 in view of
Tsai US PGPub: US 2002/0076088 A1 Jun. 20, 2002 and further in view of
Liang US PGPub: US 2018/0114116 A1 Apr. 16, 2018.

Regarding claim 12, Rodenas and Tsai in combination disclose all the claimed features,

but do not disclose a method according to claim 1, wherein the human characteristic recognising neural network is a recurrent neural network.
Liang teaches a cooperative evolution deep neural network structure, where the supermodule structures include a plurality of modules. The modules are neural networks (ABSTRACT). A blueprint 600 identifies five supermodules, namely a fully convolutional network supermodule without a fully-connected neural network, a recurrent or recursive neural network RNN supermodule, a fully-connected neural network supermodule, a convolutional neural network CNN supermodule with a fully-connected neural network, and a hybrid supermodule that combines a RNN with a CNN into a single RNN-CNN supermodule (Fig. 6, paragraphs 0028, 0054, 0058). Liang also refers to a Long Short-Term Memory LSTM network, a convolutional neural network and a neural network (paragraph 0028), and to WaveNet (paragraphs 0149, 0194).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the object analysis in live video content of combined Rodenas and Tsai (combined Rodenas and Tsai, ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079) to incorporate the cooperative evolution deep neural network structure of Liang (Liang, ABSTRACT, Fig. 6, paragraphs 0028, 0149, 0194), in order to provide improved systems and methods for cooperatively evolving deep neural network structures (Liang, paragraph 0009).
Regarding claim 13, Rodenas and Tsai in combination disclose all the claimed features,

but do not disclose a method according to claim 12, wherein the human characteristic recognising neural network is a Long Short-Term Memory network.

Liang teaches a cooperative evolution deep neural network structure, where the supermodule structures include a plurality of modules. The modules are neural networks (ABSTRACT). A blueprint 600 identifies five supermodules, namely a fully convolutional network supermodule without a fully-connected neural network, a recurrent or recursive neural network RNN supermodule, a fully-connected neural network supermodule, a convolutional neural network CNN supermodule with a fully-connected neural network, and a hybrid supermodule that combines a RNN with a CNN into a single RNN-CNN supermodule (Fig. 6, paragraphs 0028, 0054, 0058). Liang also refers to a Long Short-Term Memory LSTM network, a convolutional neural network and a neural network (paragraph 0028), and to WaveNet (paragraphs 0149, 0194).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the object analysis in live video content of combined Rodenas and Tsai (combined Rodenas and Tsai, ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079) to incorporate the cooperative evolution deep neural network structure of Liang (Liang, ABSTRACT, Fig. 6, paragraphs 0028, 0149, 0194), in order to provide improved systems and methods for cooperatively evolving deep neural network structures (Liang, paragraph 0009).
Regarding claim 14, Rodenas and Tsai in combination disclose all the claimed features,

but do not disclose a method according to claim 1, wherein the human characteristic recognising neural network is a convolutional neural network.

Liang teaches a cooperative evolution deep neural network structure, where the supermodule structures include a plurality of modules. The modules are neural networks (ABSTRACT). A blueprint 600 identifies five supermodules, namely a fully convolutional network supermodule without a fully-connected neural network, a recurrent or recursive neural network RNN supermodule, a fully-connected neural network supermodule, a convolutional neural network CNN supermodule with a fully-connected neural network, and a hybrid supermodule that combines a RNN with a CNN into a single RNN-CNN supermodule (Fig. 6, paragraphs 0028, 0054, 0058). Liang also refers to a Long Short-Term Memory LSTM network, a convolutional neural network and a neural network (paragraph 0028), and to WaveNet (paragraphs 0149, 0194).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the object analysis in live video content of combined Rodenas and Tsai (combined Rodenas and Tsai, ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079) to incorporate the cooperative evolution deep neural network structure of Liang (Liang, ABSTRACT, Fig. 6, paragraphs 0028, 0149, 0194), in order to provide improved systems and methods for cooperatively evolving deep neural network structures (Liang, paragraph 0009).
Regarding claim 15, Rodenas and Tsai in combination disclose all the claimed features,

but do not disclose a method according to claim 14, wherein the human characteristic recognising neural network is a WaveNet based neural network.

Liang teaches a cooperative evolution deep neural network structure, where the supermodule structures include a plurality of modules. The modules are neural networks (ABSTRACT). A blueprint 600 identifies five supermodules, namely a fully convolutional network supermodule without a fully-connected neural network, a recurrent or recursive neural network RNN supermodule, a fully-connected neural network supermodule, a convolutional neural network CNN supermodule with a fully-connected neural network, and a hybrid supermodule that combines a RNN with a CNN into a single RNN-CNN supermodule (Fig. 6, paragraphs 0028, 0054, 0058). Liang also refers to a Long Short-Term Memory LSTM network, a convolutional neural network and a neural network (paragraph 0028), and to WaveNet (paragraphs 0149, 0194).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the object analysis in live video content of combined Rodenas and Tsai (combined Rodenas and Tsai, ABSTRACT, Figs. 2A – 2E, 7B, 9/906 - 914, paragraphs 0022, 0083, 0079) to incorporate the cooperative evolution deep neural network structure of Liang (Liang, ABSTRACT, Fig. 6, paragraphs 0028, 0149, 0194), in order to provide improved systems and methods for cooperatively evolving deep neural network structures (Liang, paragraph 0009).
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Bhattacharya US PGPub: US 2018/0053364 A1 Feb. 22, 2018.
The one or more physiological and/or behavioural characteristics of the first user may correspond to facial features, a voice sample, a clothing pattern, an emotional state, and/or a current activity of the first user (Figs. 6/606, paragraphs 0015, 0037, 0055).

Avital US Patent: US 9,547,763 B1 Jan. 17, 2017.
Generating a set of one or more face images, each face image having a facial expression score for a particular emotion associated with that face image, the facial expression scores being specific to the user (ABSTRACT, Figs. 1/56(a…n), 2/56(a….n), 3/256, 4 – 6/750).

Irie US PGPub: US 2013/0329970 A1 Dec. 12, 2013.

A numerical value indicating a degree of smiling face may be cited as an example of the continuous value. That is, the numerical value takes a small value for the expressionless face, and takes a large value as the face changes from a smile to the smiling face. Hereinafter the numerical value is also referred to as the "smile intensity". As to the classification, the smile intensity is divided by a predetermined range into divisions, and "expressionless", "smile", and "smiling face" are allocated to the divisions (Figs. 2/Facial expression, 8, paragraphs 0068, 0163 – 0165).
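For illustration only, dividing a continuous smile intensity into the three divisions named above can be sketched as below. The numeric boundaries are hypothetical; Irie's actual ranges are not reproduced here, and the function name is an assumption made for the example.

```python
from bisect import bisect_right

# Hypothetical upper boundaries between divisions (not taken from Irie).
BOUNDARIES = [0.33, 0.66]
LABELS = ["expressionless", "smile", "smiling face"]


def classify_smile(intensity: float) -> str:
    """Allocate a smile intensity to one of the predetermined divisions."""
    # bisect_right counts how many boundaries are less than or equal to
    # the intensity, which is the index of the division it falls into.
    return LABELS[bisect_right(BOUNDARIES, intensity)]


low = classify_smile(0.10)
middle = classify_smile(0.50)
high = classify_smile(0.90)
```

A small intensity maps to "expressionless" and a large one to "smiling face", matching the ordering described in the passage above.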

Song US PGPub: US 2008/0201144 A1 Aug. 21, 2008.
The method for recognizing emotion by setting different weights to at least of two kinds of unknown information, such as image and audio information, based on their recognition reliability respectively. The weights are determined by the distance between test data and hyperplane and the standard deviation of training data and normalized by the mean distance between training data and hyperplane, representing the classification reliability of different information. The method is capable of recognizing the emotion according to the unidentified information having higher weights while the at least two kinds of unidentified information have different result classified by the hyperplane and correcting wrong classification result of the other unidentified information so as to raise the accuracy while emotion recognition (ABSTRACT, all Figs. paragraphs 0006, 0007).

Bhat US PGPub: US 2017/0372505 A1 Dec. 28, 2016.
(Figs. 33 – 35).

Yehezkel US PGPub: US 2016/0086088 A1 Mar. 24, 2016.
(Figs. 3C, 3D).

Moon US PGPub: US 2009/0285456 A1 Nov. 19, 2009.
A method and system for measuring human emotional response to visual stimulus, based on the person's facial expressions (ABSTRACT, Figs. 1, 6, 10 – 13, 18, paragraphs 0037 – 0039).
Allowable Subject Matter
Claims 8 and 24 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIMESH PATEL whose telephone number is (571)270-1228.  The examiner can normally be reached on Monday thru Friday: 6:30 AM - 3:30 PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 


If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rafael Perez-Gutierrez, can be reached on 571-272-7915. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NIMESH PATEL/Primary Examiner, Art Unit 2642