DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 1 is objected to because of the following informalities: Examiner suggests replacing “detecting-presence” in line 17 with --detecting presence--. Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5 and 9-13 are rejected under 35 U.S.C. 103 as being unpatentable over Krupat et al., US 2018/0303397 in view of Pinter et al., US 2018/0308565.
 	Regarding claim 1, Krupat discloses a device configured to transcribe an appearance of a human being (fig.2; para 0054; a system for facial analysis and metric/output generation), said device comprising: 
 	a common housing holding (fig. 2): 
 	an image capturing sensor (fig. 2, element 222; a camera); 
 	a computing device comprising a data processor (fig. 2, element 230; a processing component) and a computer program product comprising: 
 	a first machine learning model trained for detecting and labeling human beings in at least one image (fig. 19; para 0092-0093 and 0108; supervised machine learning models are based on support vector machines (SVMs); an SVM can be trained using “known good” data that is labeled as belonging to one of two categories (e.g., smile and no-smile); the label is used to indicate that a particular facial expression has been detected in the one or more images or video frames which constitute the image that was received); 
 	a second machine learning model trained for detecting appearances of human beings in at least one image (fig. 19; para 0092-0093 and 0107-0108; the choice of classifiers used is based on the training of a supervised learning technique to identify facial expressions);
 	a transcription module configured to transcribe the detected appearances of human beings to text (para 0136-0137; generating a graphical representation of a facial expression for the user based on the threshold value having been met. The graphical representation can include text, images, a webpage, and so on), wherein said computer program product when running on said data processor causes said computing device to: 
 	retrieve at least one image from said image capturing sensor (para 0048 and 0104; the image data including the facial images is collected using a camera); 
 	analyze said at least one image, wherein analyzing comprises inputting said at least one image to said first machine learning model said first machine learning model detecting-presence of a human being in said at least one image and said first machine learning model labeling the detected human being in said at least one image using a label (fig. 19; para 0092-0093 and 0108; supervised machine learning models are based on support vector machines (SVMs); an SVM can be trained using “known good” data that is labeled as belonging to one of two categories (e.g., smile and no-smile); the label is used to indicate that a particular facial expression has been detected in the one or more images or video frames which constitute the image that was received); 
 	input at least a part of said at least one image to said second machine learning model, said part of said at least one image comprising the labeled human being, and said second machine learning model providing said appearance of said labeled human being as an output (fig. 19; para 0092-0093 and 0107-0108; the choice of classifiers used is based on the training of a supervised learning technique to identify facial expressions; the label is used to indicate that a particular facial expression has been detected in the one or more images or video frames which constitute the image that was received); and 
 	apply said transcription module to transcribe the retrieved appearance of said labeled human being to text and outputs said text, wherein the transcription to text in said transcription module involves creating a record and output said text into said record (para 0136-0138; generating a graphical representation of a facial expression for the user based on the threshold value having been met. The graphical representation can include text, images, a webpage, and so on; the graphical representation attached to the representation of the media presentation can be rendered on a display device; the summary emotional intensity metric can be represented, the graphical representation can be attached, and so on).
 	Krupat discloses claim 1 as enumerated above, but Krupat does not explicitly disclose a medical record as claimed.
 	However, Pinter discloses that the outputs can be automatically incorporated into the appropriate electronic medical record (EMR) (para 0015).
 	Therefore, taking the combined disclosures of Krupat and Pinter as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a medical record as taught by Pinter into the invention of Krupat for the benefit of greatly increasing the time a physician has available to be with patients (Pinter: para 0012).
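 	The data flow recited in claim 1, as mapped above, can be sketched as follows. This is a minimal illustration under stated assumptions: every name in it (Detection, Record, process_image, stub_detector, stub_classifier) is hypothetical and is taken from neither Krupat, Pinter, nor the application, and the two "models" are simple stand-in functions rather than trained machine learning models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


@dataclass
class Detection:
    label: str  # label assigned by the first model
    box: Box    # part of the image containing the labeled human being


@dataclass
class Record:
    entries: List[str] = field(default_factory=list)


def crop(image: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def process_image(
    image: Image,
    detect: Callable[[Image], Optional[Detection]],  # stand-in for the first model
    classify: Callable[[Image], str],                # stand-in for the second model
    record: Record,
) -> Optional[str]:
    detection = detect(image)             # detect presence and label
    if detection is None:
        return None                       # no human being detected
    part = crop(image, detection.box)     # part of the image with the labeled human
    appearance = classify(part)           # appearance as output of the second model
    text = f"{detection.label}: {appearance}"  # transcription to text
    record.entries.append(text)           # output the text into a record
    return text


def stub_detector(image: Image) -> Optional[Detection]:
    # Assumption for the sketch: any nonzero pixel belongs to one person.
    coords = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return Detection("person-1", (min(rows), min(cols), max(rows) + 1, max(cols) + 1))


def stub_classifier(part: Image) -> str:
    # Assumption for the sketch: a tall region means "standing".
    return "standing" if len(part) > len(part[0]) else "lying down"
```

 	In this sketch the record is a plain list of strings; the medical record of the claim would take its place as the output destination.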
 	Regarding claim 2, the device according to claim 1, Krupat in the combination further discloses wherein said device is configured to transcribe said appearance of at least one human being (para 0136-0138) within a plurality of human beings (para 0068, 0074, and 0078; the data collected on the user or on a plurality of users can be in the form of one or more videos, video frames, and still images), wherein said analyzing comprises: 
 	said first machine learning model detecting presence of said plurality of human beings in said at least one image (fig. 19; para 0092-0093 and 0108); 
 	said first machine learning model labeling said at least one human being within the detected plurality of human beings in said at least one image using a label (fig. 19; para 0092-0093 and 0108); 
 	retrieving at least a part of said at least one image (para 0048 and 0104), said part of said at least one image comprising the labeled at least one human being within said detected plurality of human beings, resulting in at least one labeled image (para 0093 and 0108); 
 	inputting said at least one labeled image to said second machine learning model (fig. 19; para 0092-0093 and 0107-0108), and 
 	retrieving said appearance of said labeled at least one human being within said detected plurality of human beings from said second machine learning model as output (fig. 19; para 0092-0093 and 0107-0108).
 	Regarding claim 3, the device according to claim 1, Krupat in the combination further discloses wherein said device is configured to transcribe multiple appearances (para 0103-0104 and 0108; detecting a wide range of facial expressions) of said labeled human being (para 0136), and said computer program product when running on said data processor causes said computing device to: 
 	receive multiple images from said image capturing sensor (para 0048 and 0104); 
 	analyze said multiple images (para 0092-0093 and 0108), the analyzing comprises: 
 	input said multiple images to said first machine learning model; said first machine learning model detecting presence of said human being in a first image of said multiple images; said first machine learning model labeling the detected human being in said first image of said multiple images with said label (fig. 19; para 0092-0093 and 0108); 
 	retrieve at least a part of said first image of said multiple images, said part of said first image of said multiple images comprising the labeled human being, resulting in a labeled first image; said first machine learning model detecting presence of said labeled human being in every further image of said multiple images; said first machine learning model labeling said detected human being in every further image of said multiple images with said label; retrieving at least a part of said every further image of said multiple images, said part of said every further image of said multiple images comprising said labeled human being, resulting in a labeled set of further images (fig. 19; para 0092-0093 and 0108); 
 	input said labeled first image and said labeled set of further images to said second machine learning model (fig. 19; para 0092-0093 and 0107-0108), 
 	retrieve said multiple appearances of said labeled human being from said second machine learning model (fig. 19; para 0092-0093 and 0107-0108); 
 	apply said transcription module to transcribe the retrieved multiple appearances, of said labeled human being, to text (para 0136), and 
 	output said text (para 0136-0138).
 	Regarding claim 4, the device according to claim 3, Krupat in the combination further discloses wherein said analyzing comprises: 
 	inputting a first image of said multiple images (para 0048 and 0104) to said first machine learning model; said first machine learning model detecting presence of said human being in said first image of said multiple images; said first machine learning model labeling the detected human being in said first image of said multiple images with said label (fig. 19; para 0092-0093 and 0108);  
 	retrieving at least a part of said first image of said multiple images, said part of said first image of said multiple images comprising the labeled human being, resulting in a labeled first image (fig. 19; para 0092-0093 and 0108); 
 	inputting a further image of said multiple images to said first machine learning model; said first machine learning model detecting presence of said labeled human being in said further image of said multiple images; said first machine learning model labeling said detected human being in said further image of said multiple images with said label (fig. 19; para 0092-0093 and 0108); 
 	retrieving at least a part of said further image of said multiple images, said part of said further image of said multiple images comprising said labeled human being, resulting in a labeled further image (fig. 19; para 0092-0093 and 0108); 
 	inputting said labeled first image and said labeled further image to said second machine learning model (fig. 19; para 0092-0093 and 0107-0108), and 
 	retrieving said multiple appearances of said labeled human being from said second machine learning model (fig. 19; para 0092-0093 and 0107-0108).
 	Regarding claim 5, the device of claim 1, Krupat in the combination further discloses wherein said device is configured to transcribe said appearance of each human being (para 0136-0138) within a plurality of human beings (para 0068, 0074, and 0078; the data collected on the user or on a plurality of users can be in the form of one or more videos, video frames, and still images), wherein said analyzing comprises: 
 	a) said first machine learning model detecting presence of a plurality of human beings in said at least one image (fig. 19; para 0092-0093 and 0108); 
 	b) said first machine learning model labeling the detected plurality of human beings in said at least one image using a label for each detected human being (fig. 19; para 0092-0093 and 0108); 
 	c) retrieving at least one of the labeled human beings, resulting in a set of retrieved human beings (para 0093 and 0108); 
 	d) inputting at least a part of said at least one image, said part of said at least one image comprising at least one being of said set of retrieved human beings, to said second machine learning model (fig. 19; para 0092-0093 and 0107-0108), 
 	e) retrieving said appearance of said labeled human beings in said set of retrieved human beings from said second machine learning model (fig. 19; para 0092-0093 and 0107-0108), and 
 	f) repeating said c), d) and e) until said appearance of each human being within a plurality of human beings is retrieved (fig. 19; para 0092-0093 and 0107-0108), wherein said computer program product when running on said data processor causes said computing device to: 
 	apply said transcription module to transcribe the retrieved appearances of said each human being within a plurality of living beings to text (para 0136), and 
 	output said text (para 0136-0138).
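 	Steps c) through f) of claim 5 describe a loop over the labeled human beings, which can be sketched as follows. This is an illustration only: the function names (appearances_for_all, to_text), the box format, and the stand-in classifier are assumptions and do not appear in the cited references or the application.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def appearances_for_all(
    image: Image,
    labeled_boxes: Dict[str, Box],     # result of steps a) and b): one label per person
    classify: Callable[[Image], str],  # stand-in for the second model
) -> Dict[str, str]:
    pending = list(labeled_boxes)      # labels whose appearance is not yet retrieved
    appearances: Dict[str, str] = {}
    while pending:                                # f) repeat until every person is handled
        label = pending.pop(0)                    # c) retrieve one labeled human being
        part = crop(image, labeled_boxes[label])  # d) part of the image with that person
        appearances[label] = classify(part)       # e) retrieve the appearance
    return appearances


def to_text(appearances: Dict[str, str]) -> str:
    # Transcribe the retrieved appearances to a single line of text.
    return "; ".join(f"{label}: {value}" for label, value in appearances.items())
```

 	The loop processes one person per pass; the claim language also permits retrieving several labeled human beings per pass, which this sketch does not show.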
 	Regarding claim 9, the device of claim 1, Krupat in the combination further discloses wherein said multiple images comprise a time base, in an embodiment said multiple images comprise a part of a video recording or a series of time-laps images (para 0136 and 0140).
 	Regarding claim 10, the device of claim 1, Krupat in the combination further discloses wherein said multiple images comprise a real-time processed video recording (para 0136 and 0140).
 	Regarding claim 11, the device of claim 1, Krupat in the combination further discloses wherein said appearance comprises a pose (para 0063 and 0104-0105; different facial expressions).
 	Regarding claim 12, the device of claim 1, Krupat in the combination further discloses wherein said appearances comprises a series of poses or a change of poses, said series of poses or change of poses defining at least one action (para 0063 and 0104-0105).
 	Regarding claim 13, this claim recites substantially the same limitations that are performed by claim 1 above, and it is rejected for the same reasons.

Allowable Subject Matter
Claims 6-8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter:  
 	The prior art made of record and considered pertinent to the applicant's disclosure, taken individually or in combination, does not teach the claimed invention having the following limitations, in combination with the remaining claimed limitations.
 	Regarding dependent claim 6, the prior art does not teach or suggest the claimed invention having “wherein said second machine learning model comprising: a first deep neural network which captures the skeleton data of said human being in said at least a part of said at least one image, said first deep neural network using said at least a part of said at least one image as an input and outputs said skeleton data; a second deep neural network which captures a first appearance of said human being, said second deep neural network using said skeleton data from said first deep neural network as an input and outputs said first appearance in first appearance data; a third deep neural network which captures a second appearance of said human being in said at least a part of said at least one image, said third deep neural network using said at least a part of said at least one image as an input and outputs said second appearance in second appearance data, and a fourth deep neural network which captures a third appearance of said human being using said first and second appearance data as an input and outputs third appearance data, said third appearance data comprising a prediction of probabilities of said appearance”, and a combination of other limitations thereof as recited in the claims.         
 	Regarding claims 7-8, the claims are allowable due to their dependency on claim 6 above.
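 	The arrangement of four networks quoted above for claim 6 can be sketched as a data flow. The sketch assumes toy stand-in functions in place of deep neural networks; all names (second_model, toy_skeleton_net, toy_pose_net, toy_image_net, toy_fusion_net, CLASSES) and the two example classes are hypothetical and serve only to show which output feeds which input.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[int]]
Skeleton = List[Tuple[int, int]]  # toy keypoints as (row, col)
CLASSES = ("standing", "lying down")  # assumed example appearances


def softmax(scores: Sequence[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def second_model(
    part: Image,
    skeleton_net: Callable[[Image], Skeleton],          # first network
    pose_net: Callable[[Skeleton], List[float]],        # second network
    image_net: Callable[[Image], List[float]],          # third network
    fusion_net: Callable[[List[float], List[float]], Dict[str, float]],  # fourth network
) -> Dict[str, float]:
    skeleton = skeleton_net(part)       # image part -> skeleton data
    first = pose_net(skeleton)          # skeleton data -> first appearance data
    second = image_net(part)            # image part -> second appearance data
    return fusion_net(first, second)    # both -> third appearance data (probabilities)


def toy_skeleton_net(part: Image) -> Skeleton:
    return [(r, c) for r, row in enumerate(part) for c, v in enumerate(row) if v]


def toy_pose_net(skeleton: Skeleton) -> List[float]:
    if not skeleton:
        return [0.0, 0.0]
    rows = [r for r, _ in skeleton]
    cols = [c for _, c in skeleton]
    return [float(max(rows) - min(rows)), float(max(cols) - min(cols))]


def toy_image_net(part: Image) -> List[float]:
    return [float(len(part)), float(len(part[0]))]


def toy_fusion_net(first: List[float], second: List[float]) -> Dict[str, float]:
    scores = [a + b for a, b in zip(first, second)]
    return dict(zip(CLASSES, softmax(scores)))
```

 	Only the wiring is meaningful here: the first and third networks both take the image part as input, the second takes the skeleton data, and the fourth combines the first and second appearance data into a prediction of probabilities.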

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
 	Wexler et al., US 2017/0064363 discloses a system for selecting content for a user of a wearable apparatus based on the user's behavior.
 	Chen, US 2014/0247343 discloses a method and apparatus that capture an image of a second person with a camera on a frame worn by a user, transmit the image to a second device, receive a recommendation for the user to interact with the second person in response to a behavior of the second person, and display the recommendation to the user on how to interact with the second person in response to the second person's behavior.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN D HUYNH whose telephone number is (571)270-1937. The examiner can normally be reached 8AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward F Urban, can be reached at (571) 272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VAN D HUYNH/Primary Examiner, Art Unit 2665