DETAILED ACTION
Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the Republic of Korea on May 4, 2018. It is noted, however, that applicant has not filed a certified copy of the KR10-2018-0051931 application as required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on April 15, 2019 and October 15, 2019 were filed on or after the mailing date of the instant application on April 15, 2019.  The submissions are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections
Claim 1 is objected to because of the following informalities: it recites “receiving” on line 16, which should be “receive” and “reproducing” on line 17, which should be “reproduce”. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim 1 recites the limitation "the communication module" in lines 14-15.  There is insufficient antecedent basis for this limitation in the claim. Accordingly, Claims 2-5 are rejected due to their dependency on Claim 1.
Therefore, for the purpose of examination, Examiner will interpret Claim 1 to recite “… a communication module.” This interpretation appears to be more consistent with the rest of the specification.
With regard to Claim 3, it is rejected because it recites, inter alia, “… for an electronic appliance.” It is unclear whether the recited “an electronic appliance” corresponds to the previously recited “at least one electronic appliance” in Claim 2, from which Claim 3 depends.
Therefore, for the purpose of examination, Examiner will interpret Claim 3 to recite “… the at least one electronic appliance.” This interpretation appears to be more consistent with the rest of the specification.

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-9, 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chandrasekaran et al. (US Patent 10,522,143 B2), with an effective filing date of February 27, 2018, in view of Kim et al. (US Patent 9,582,245 B2).
With regard to Claims 1 and 18, Chandrasekaran teaches (a) an electronic device (client device 610), and (b) a computer program product (CPP) included in a non-transitory computer-readable storage medium (software components, code embodied on a machine-readable medium) comprising: an audio module (module 614, interaction skill); a communication circuitry (communication components 764); a microphone (microphone 38); a memory storing programming instructions (memory device); and a processor (processing circuitry), wherein the computer program product is configured to include one or more instructions that, when executed by the electronic device, cause the processor of the electronic device to:
… “machine-readable medium” means a device able to store instructions (e.g., instructions 716) and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., Erasable Programmable Read-Only Memory (EEPROM)), and/or any suitable combination thereof. The term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store the instructions 716. The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 716) for execution by a machine (e.g., machine 700), such that the instructions, when executed by one or more processors of the machine (e.g., processors 710), cause the machine to perform any one or more of the methodologies described herein. Accordingly, a “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems or storage networks that include multiple storage apparatus or devices. The term “machine-readable medium” as used herein excludes signals per se.

[Col 1, lines 58-63] Embodiments described herein generally relate to a personal virtual assistant system, comprising processing circuitry, a speaker, at least one sensing device that collects user state information relating to an emotional state of a user from at least the user's voice, and a memory device having instructions stored therein. 

[Col 13, lines 51-65] The module(s) 614 may include native modules which are provided on the client device 610 by the developer/manufacturer of the client device 610, and third-party modules that are installed by the user after the user's acquisition of the client device 610 … such modules may also include the user state sensing devices 36 such as a microphone 38, camera 40, or sensors 44 described above with respect to FIG. 3.  Each module 614 may include a skill of a smart speaker device … as appropriate.

[Col 11, lines 35-37] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill,

[Col 13, lines 32-36] The network 660 allows the client devices 610, module servers 620, virtual personal assistant server 630, inference store 640, and user consent data store 650 to communicate with one another.
[Col 20, lines 39-46] Communication may be implemented using a wide variety of technologies. The I/O components 750 may include communication components 764 operable to couple the machine 700 to a network 780 or devices 770 via a coupling 782 and a coupling 772, respectively. For example, the communication components 764 may include a network interface component or other suitable device to interface with the network 780. 

receive a voice command from a user (see Fig. 3, current conversational state information 20) via the microphone;
[Col 11, lines 35-40] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill, allowing a user to issue a command, such as “Reserve a table at Mario's Italian Restaurant,” or “Order a Coffee from Fourth Coffee Company.” 

[Col 5, lines 53-58] For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place

request situation information (inferred emotional state) from a first external electronic device (client device 610 comprising module 614 requests inferred emotional state from backend module server 620, which is communicatively coupled with signal processor 46 and cognitive/AI services 48) based on device information (see Fig. 3, inputs to ML classifier 30 comprising emotional state history 53 and user state 36 captured using a variety of sensing devices) and the voice command, 
[Col 5, lines 46-58] The system described herein personalizes the PVA's responses and adapts the responses over time based on the user's reactions. The system uses multi-modal sensing and a feedback loop to constantly adjust both the content and the delivery of the PVA's responses so that the responses are more empathetic and designed to improve the user's mood. For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place.

[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice

[Col 14, lines 18-23] Similarly, the modules 614 of the client device 610 may provide the user state features 36, user preference features data 32, and cultural reference features data 34 to one or more module servers 620 for implementation of the cognitive/AI services 48 and machine learning services 30 described above.

[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions. 

receiving the situation information (module 614 receives inferred emotional state 24 from backend module server 620, which is communicatively coupled with signal processor 46 and cognitive/AI services 48), transmit the situation information to a second external electronic device (module 614 receives specified inferences 24 from backend PVA server 630, which is communicatively coupled with TTS subsystem 28) via the communication module (coupling 772); and
[Col 20, lines 39-46] Communication may be implemented using a wide variety of technologies. The I/O components 750 may include communication components 764 operable to couple the machine 700 to a network 780 or devices 770 via a coupling 782 and a coupling 772, respectively. For example, the communication components 764 may include a network interface component or other suitable device to interface with the network 780.  

[Col 14, lines 42-57] According to some implementations, the PVA server 630 determines, based on interaction of the user with the PVA 612 at one or more client devices 610 associated with an account of the user, multiple inferences about the user (e.g., about the user's mood). The PVA server 630 stores the multiple inferences in the inference store 640. The PVA server 630 stores, in the user consent data store 650, user consent data representing whether the user provided consent for the module(s) 614 to access at least a portion of the inferences in the inference store 640. The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data. 
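
For illustration only, the consent-verified request flow described in the passage quoted above may be sketched as follows. This is a minimal sketch and not code from Chandrasekaran: the class name, method names, and the in-memory dictionary and set standing in for the inference store 640 and the user consent data store 650 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PVAServer:
    """Hypothetical stand-in for PVA server 630; not code from the reference."""

    # Stand-in for inference store 640: (user, inference name) -> value.
    inference_store: dict = field(default_factory=dict)
    # Stand-in for user consent data store 650: (user, module, inference name).
    consent_store: set = field(default_factory=set)

    def store_inference(self, user: str, name: str, value: str) -> None:
        # Store an inference about the user (e.g., about the user's mood).
        self.inference_store[(user, name)] = value

    def record_consent(self, user: str, module: str, name: str) -> None:
        # Record that the user consented to this module accessing this inference.
        self.consent_store.add((user, module, name))

    def request_inference(self, user: str, module: str, name: str) -> str:
        # Verify consent before providing the specified inference to the module.
        if (user, module, name) not in self.consent_store:
            raise PermissionError(f"{module} lacks consent for {name!r}")
        return self.inference_store[(user, name)]


server = PVAServer()
server.store_inference("user-1", "mood", "happy")
server.record_consent("user-1", "module-614", "mood")
granted = server.request_inference("user-1", "module-614", "mood")
```

In this sketch, a request from a module without recorded consent raises an error rather than returning the inference, which mirrors the verify-then-provide ordering in the quoted passage.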

[Col 13, lines 46-48] Each client device 610 includes a PVA 612 (e.g., Apple Siri®, Microsoft Cortana®, or Ok Google®) and module(s) 614. 

[Col 14, lines 24-26] The PVA server 630 implements the PVA 612. For example, PVA server 630 may be coupled with a web searching interface to answer the user's questions,

[Col 6, line 64 - Col 7, line 6] FIG. 2B illustrates a PVA system as described herein that maintains a conversational state 20 and infers the emotional state of a conversation 24 and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. TTS subsystem 28 may be any of a number of available TTS subsystems including the Bing Speech API available through Microsoft Azure, for example.

[Col 9, lines 58-66] FIG. 4 illustrates how, given a conversational state 20 and inferred emotional state 24, response selector 26 may vary the text response sent to the TTS subsystem 28 whereby responses may be varied based on rules or lookups against pre-selected responses or encoded as responses generated by a learned model such as a deep neural network to generate conversationally and emotionally appropriate responses having the appropriately contextualized text and tone according to an example embodiment

receiving content (see Fig. 4, contextualized text, contextualized tone) corresponding to the situation information from the second external electronic device and  
[Col 14, lines 42-57] According to some implementations, the PVA server 630 determines, based on interaction of the user with the PVA 612 at one or more client devices 610 associated with an account of the user, multiple inferences about the user (e.g., about the user's mood). The PVA server 630 stores the multiple inferences in the inference store 640. The PVA server 630 stores, in the user consent data store 650, user consent data representing whether the user provided consent for the module(s) 614 to access at least a portion of the inferences in the inference store 640. The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

reproducing the received content (contextualized response and tone).  
[Col 13, lines 46-48] Each client device 610 includes a PVA 612 (e.g., Apple Siri®, Microsoft Cortana®, or Ok Google®) and module(s) 614. 
[Col 6, line 64 - Col 7, line 3] FIG. 2B illustrates a PVA system … and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. 

Chandrasekaran does not explicitly state that the request for situation information is made upon receiving the voice command through the client device 610. In this regard, Chandrasekaran teaches that the PVA system continuously adjusts based on multi-modal inputs (see Col 5, lines 46-58). Further, Chandrasekaran does not explicitly recite which operations are performed locally and which are performed via a remote computing device. However, Chandrasekaran teaches that the PVA system can be implemented exclusively on a single computing device or in combination with a remote computing system (see Col 12, lines 3-21):
In particular, some aspects of the described process (such as the command and control service) may take place on a different processing system (e.g., in a computer in a cloud-hosted data center) ... Similarly, operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.

[Col 18, lines 5-27] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which according to some example embodiments is able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein … the machine 700 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 700 may comprise, but not be limited to, a server computer, a client computer,

[Col 21, lines 30-38] The instructions 716 may be transmitted or received over the network 780 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 764) and utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Similarly, the instructions 716 may be transmitted or received using a transmission medium via the coupling 772 (e.g., a peer-to-peer coupling) to the devices 770.

[Col 21, line 60 – Col 22, line 5] Those skilled in the art will further appreciate that the personal virtual assistant described herein may be implemented in an embodiment where … the emotional intelligence could live on the PVA device (for performance or privacy reasons) or in the cloud (or a combination of both).

Kim teaches an electronic device (electronic device 1) comprising: an audio module (first command receiver 121); a communication circuitry (communication unit 13); a microphone (first command receiver 121 comprises a microphone); a memory storing programming instructions (storage unit 15); and a processor (controller 14), wherein the programming instructions are executable by the processor to cause the electronic device to:
[Col 7, lines 4-13] The command receiver 12 receives a user's voice command. For example, the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal. The command receiver 12 may further include a second command receiver 122 to receive a user's manipulation command. The second command receiver 122 may be implemented as a remote control signal receiver which receives a remote control signal including key input information corresponding to a user's manipulation command from a remote controller (not shown) or as a manipulation panel which is provided in the electronic device 1 and generates key input information corresponding to a user's manipulation.

[Col 6, lines 30-38] FIG. 2 is a block diagram of an electronic device 1 according to an exemplary embodiment. The electronic device 1 may include an operation performer 11, a command receiver 12, a communication unit 13 (e.g., communicator such as a wired and/or wireless interface, port, card, dongle, etc.), and a controller 14. The electronic device 1 may further include a storage unit 15 (e.g., a storage such as RAM, ROM, flash memory, a hard disk drive, etc.). The operation performer 11 performs operations of the electronic device 1. For example, if the electronic device 1 includes a display apparatus such as a TV, the operation performer 11 may include a signal receiver 111, an image processor 112, and a display unit 113 (e.g., a display such as a liquid crystal display panel, a plasma display panel, an organic light emitting diode display, etc.). However, it is understood that the operation performer 11 

receive a voice command from a user (see Fig. 12, user’s voice command) via the microphone;  
[Col 7, lines 1-4] the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal.

request, upon receiving the voice command, situation information (see Fig. 12, converted text) from a first external electronic device (speech-to-text server 4) based on the voice command,  
[Col 2, lines 39-42] The controller may transmit the user's voice command to a second server, receive a text into which the voice command has been converted, from the second server, and transmits the received text to the first server.

[Col 9, line 56-Col 10, line 6] FIG. 12 illustrates an example of a speech-to-text (STT) server 4 according to an exemplary embodiment. The electronic device 1 may process the information regarding the user's voice command, i.e., the voice made by the user, into a text. For example, the electronic device 1 transmits the received user's voice command to the STT server 4. The STT server 4 includes an STT converter 41 which converts the user's voice command transmitted by the electronic device 1 into a corresponding text. The STT server 4 transmits the text into which the user's voice command has been converted, to the electronic device 1. The electronic device 1 may determine, on the basis of the text transmitted by the STT server 4, whether the user's voice command corresponds to the voice recognition command included in the stored voice recognition command list. The electronic device 1 may transmit the text provided by the STT server 4 to the server 1 and request the server 1 to analyze the user's voice command.
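
For illustration only, the relay described in the passage quoted above (voice command to the STT server, text back to the electronic device, then text on to the analysis server) may be sketched as follows. This is a minimal sketch and not code from Kim: the function names, the lookup table, and the command strings are hypothetical, and the two servers are modeled as local functions rather than network endpoints.

```python
from typing import Optional


def stt_server_convert(voice: bytes) -> str:
    """Hypothetical stand-in for STT server 4: converts a voice signal to text."""
    return voice.decode("utf-8").strip().lower()


# Hypothetical analysis results held by the stand-in for analysis server 2.
ANALYSIS_TABLE = {"volume up": "CMD_VOLUME_UP"}


def analysis_server_analyze(text: str) -> Optional[str]:
    """Hypothetical stand-in for analysis server 2: text -> control command."""
    return ANALYSIS_TABLE.get(text)


class ElectronicDevice:
    """Hypothetical stand-in for electronic device 1."""

    def __init__(self, stored_commands: dict):
        # Stand-in for the stored voice recognition command list.
        self.stored_commands = dict(stored_commands)

    def handle_voice_command(self, voice: bytes) -> Optional[str]:
        # Step 1: send the voice command to the STT server and receive text.
        text = stt_server_convert(voice)
        # Step 2: check the text against the locally stored command list.
        if text in self.stored_commands:
            return self.stored_commands[text]
        # Step 3: otherwise forward the text to the analysis server.
        return analysis_server_analyze(text)


device = ElectronicDevice({"power off": "CMD_POWER_OFF"})
local = device.handle_voice_command(b"Power Off")
remote = device.handle_voice_command(b"Volume up ")
unknown = device.handle_voice_command(b"sing a song")
```

The electronic device sits between the two servers in this sketch, so the text produced by the first server reaches the second server only through the device.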


[Col 10, lines 4-5] The electronic device 1 may transmit the text provided by the STT server 4 to the server [2] 

[Col 7, lines 14-19] The communication unit 13 communicates with the analysis server 2 through the network 3. The communication unit 13 exchanges the user's voice command and the information regarding the analysis result with the analysis server 2 under a control of the controller 14.

receiving content (see Fig. 12, control command information; Fig. 13 stored command list 131) corresponding to the situation information from the second external electronic device and   
[Col 10, lines 7-22] At operation S71, the electronic device 1 transmits a user's voice command to the analysis server 2. At operation S72, the electronic device 1 identifies whether the control command information corresponding to the user's voice command has been received from the analysis server 2. If the electronic device 1 has received the control command information corresponding to the user's voice command from the analysis server 2, the electronic device 1 operates according to the control command information transmitted by the analysis server 2 at operation S73

reproducing the received content:   
[Col 11, lines 54-67] user interface (UI) 131 which shows a list of voice commands stored according to an exemplary embodiment. The electronic device 1 stores therein the voice command said by a user, and upon a user's request, may display the list of the stored voice commands as a UI 131. As shown in FIG. 13, the list of the stored voice commands displayed as the UI 131 shows voice commands 132 which have been said by a user. The electronic device 1 may store the voice commands per user, and show the stored voice commands 132 per user (reference numeral 133). The electronic device 1 may display the list of the stored voice commands in which the voice commands 132 are sorted in order of how many times the voice commands 132 have been said by a user.

As noted above, Kim teaches a configuration wherein data is transmitted and received between a first external electronic device and a second external electronic device via an electronic device (see Fig. 12), and wherein the electronic device receives content from the second external electronic device and reproduces the content. Since all the claimed elements would continue to operate in the same manner, specifically the network, the electronic device, PVA module 612, PVA server 630, module 614, and the module server 620 of Chandrasekaran, it would have been obvious to one of ordinary skill in the art before the effective filing date to utilize the compatible server-client architecture taught in Kim to support the transmission and retrieval of (a) device information from the electronic device to the module server 620 and (b) situation information from the first module server 620 to the second PVA server 630 through the client device 610. As taught by Kim, the architecture of the electronic device, the analysis server and the STT server establishes server-client communications which are initialized based on a voice command and respond with content. As an artisan of ordinary skill in the art would appreciate, this combination of the devices of Chandrasekaran configured in the manner set forth in Kim would be no more "than the
With regard to Claims 2 and 19, the combination of Chandrasekaran and Kim teaches all the limitations of Claims 1 and 18, respectively. Furthermore, Chandrasekaran teaches (a) an electronic device (client device 610), and (b) a computer program product (CPP) included in a non-transitory computer-readable storage medium (software components, code embodied on a machine-readable medium), wherein the device information includes information obtained from at least one electronic appliance (at least one sensing device (e.g. smart home device, a smart appliance)) disposed in a location where the electronic device is located, and wherein the at least one electronic appliance includes at least one of an Internet of Things (IoT) device and a smart device:
[Col 22, lines 40-44] Example 4 is an example as in Example 1 wherein the at least one sensing device further senses the user's context from at least one of the user's location, local time, schedule, and surroundings and the user's prior interactions with the personal virtual assistant system or other devices.
[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions.  

[Col 18, lines 5-40] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which … perform any one or more of the methodologies discussed herein. … The machine 700 may comprise, but not be limited to, a server computer, a client computer, PC, a tablet computer, a laptop computer, a netbook, an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices ... Further, while only a single machine 700 is illustrated, the term “machine” shall also be taken to include a collection of machines 700 that individually or jointly execute the instructions 716 to perform any one or more of the methodologies discussed herein.

With regard to Claims 3 and 20, the combination of Chandrasekaran and Kim teaches all the limitations of Claims 2 and 19, respectively. Furthermore, Chandrasekaran teaches (a) an electronic device (client device 610), and (b) a computer program product (CPP) included in a non-transitory computer-readable storage medium (software components, code embodied on a machine-readable medium), wherein the device information is configured to include at least one of operation level information (tactile input components 754), time information (local time, time spent exercising 756), illumination information (illumination sensor components 760), and arrangement information (position components 762) for an electronic appliance:
[Col 18, lines 5-6] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user.
 
[Col 19, line 32 – Col 20, line 38] The I/O components 750 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. … In various example embodiments, the I/O components 750 may include output components 752 and input components 754. The output components 752 may include … acoustic components (e.g., speakers), … and improve the user's mood to an optimal state as described herein. The input components 754 may include … audio input components (e.g., a microphone), and the like. In further example embodiments, the I/O components 750 may include biometric components 756, motion components 758, environmental components 760, or position components 762, among a wide array of other components. For example, the biometric components 756 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure bio-signals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), measure exercise-related metrics (e.g., distance moved, speed of movement, or time spent exercising) identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram based identification), and the like. The motion components 758 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth. 
The environmental components 760 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), … or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment. The position components 762 may include location sensor components (e.g., a Global Position System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), 

With regard to Claim 4, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 2. Furthermore, Chandrasekaran teaches an electronic device (client device 610), wherein the first external electronic device includes a server (backend module server 620 is communicatively coupled with signal processor 46 and cognitive/AI services 48) storing the device information (collected/captured on storage device),
[Col 5, lines 46-58] The system described herein personalizes the PVA's responses and adapts the responses over time based on the user's reactions. The system uses multi-modal sensing and a feedback loop to constantly adjust both the content and the delivery of the PVA's responses so that the responses are more empathetic and designed to improve the user's mood. For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place.

[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

[Col 14, lines 18-23] Similarly, the modules 614 of the client device 610 may provide the user state features 36, user preference features data 32, and cultural reference features data 34 to one or more module servers 620 for implementation of the cognitive/AI services 48 and machine learning services 30 described above.

[Col 15, lines 47-52] operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.

[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions. 

and wherein the first external electronic device is configured to update the device information (see Fig. 3, emotional state history) upon detecting a reception of a voice command (interaction data):
[Col 8, lines 48-59] As also depicted in FIG. 3, the PVA system may include a data store 53 that stores a history of past user interactions with the PVA and/or emotional states for use with the heuristics 50. The data store 53 is updated with interaction data, emotional state inferences, etc. over time. As described herein, the historical data is useful for establishing a baseline emotional state for each user. Also, the history of past interactions (defined at the session/conversation level, for each dialog turn, or for each user action (in a non-conversational instantiation)) can be used to normalize feature values in the ML classifier 30, among other things that will be apparent to those skilled in the art.

700) comprising: an audio module (interaction skill); a communication circuitry (communication components 764); a microphone (audio input components (e.g., a microphone) 754); a memory storing programming instructions (memory 730); and a processor (processor 710), wherein the programming instructions are executable by the processor to cause the electronic device to:
[Col 18, lines 11-21] FIG. 7 shows a diagrammatic representation of the machine 700 in the example form of a computer system, within which instructions 716 … the machine 700 operates as a standalone device
[Col 18, lines 41-52] The machine 700 may include processors 710, memory/storage 730, and I/O components 750, which may be configured to communicate with each other such as via a bus 702. In an example embodiment, the processors 710 … may include, for example, a processor 712 and a processor 714 that may execute the instructions 716. 
[Col 11, lines 35-37] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill,

[Col 20, lines 39-46] Communication may be implemented using a wide variety of technologies. The I/O components 750 may include communication components 764 operable to couple the machine 700 to a network 780 or devices 770 via a coupling 782 and a coupling 772, respectively. For example, the communication components 764 may include a network interface component or other suitable device to interface with the network 780. 

receive a voice command from a user (see Fig. 3, current conversational state information 20) via the microphone;
[Col 11, lines 35-40] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill, allowing a user to issue a command, such as “Reserve a table at Mario's Italian Restaurant,” or “Order a Coffee from Fourth Coffee Company.”

[Col 5, lines 53-58] For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place

upon receiving the voice command, receive device information (see PVA system of Fig. 3, which inputs device information into ML classifier 30, the device information comprising emotional state history 53 and user state information 36 captured using a variety of sensing devices) from a first external electronic device (one or more I/O components 750), obtain situation information (PVA system requests inferred emotional state from backend module server 620, which is communicatively coupled with signal processor 46 and cognitive/AI services 48) based on the device information and the voice command;
[Col 5, lines 46-58] The system described herein personalizes the PVA's responses and adapts the responses over time based on the user's reactions. The system uses multi-modal sensing and a feedback loop to constantly adjust both the content and the delivery of the PVA's responses so that the responses are more empathetic and designed to improve the user's mood. For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place.

[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice

[Col 14, lines 18-23] Similarly, the modules 614 of the client device 610 may provide the user state features 36, user preference features data 32, and cultural reference features data 34 to one or more module servers 620 for implementation of the cognitive/AI services 48 and machine learning services 30 described above.


[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions. 


retrieving content (module 614 requests specified inferences 24 from backend PVA server 630, see Fig. 4, contextualized text, contextualized tone 28) corresponding to the situation information;
[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice
 
[Col 6, line 64 - Col 7, line 6] FIG. 2B illustrates a PVA system as described herein that maintains a conversational state 20 and infers the emotional state of a conversation 24 and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. TTS subsystem 28 may be any of a number of available TTS subsystems including the Bing Speech API available through Microsoft Azure, for example.

[Col 9, lines 58-66] FIG. 4 illustrates how, given a conversational state 20 and inferred emotional state 24, response selector 26 may vary the text response sent to the TTS subsystem 28 whereby responses may be varied based on rules or lookups against pre-selected responses or encoded as responses generated by a learned model such as a deep neural network to generate conversationally and emotionally appropriate responses having the appropriately contextualized text and tone according to an example embodiment

receiving the content (see Fig. 4, contextualized text, contextualized tone); and reproducing the received content:
[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice
 
[Col 6, line 64 - Col 7, line 6] FIG. 2B illustrates a PVA system as described herein that maintains a conversational state 20 and infers the emotional state of a conversation 24 and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. TTS subsystem 28 may be any of a number of available TTS subsystems including the Bing Speech API available through Microsoft Azure, for example.

[Col 6, line 64 - Col 7, line 3] FIG. 2B illustrates a PVA system … and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. 

Chandrasekaran does not explicitly state that the situation information is directly propagated to the second external device 630 through transmissions and requests from the machine 700/PVA 612, thus causing the second external electronic device 630 to retrieve content. However, Chandrasekaran teaches that the PVA server 630 implements instructions received from machine 700/PVA module 612, and processes requests from module 614. Chandrasekaran also teaches that PVA 612 and module 614 can be processed on a single computing device 610 (Col 13, lines 46-48) with backend servers 620 (see Col 14, lines 13-14) and 630 (see Col 14, lines 52-57). Moreover, Chandrasekaran teaches a request for a specified inference from the inference store 640 coupled to backend PVA server 630 (see Col 14, lines 52-57). In this regard, Chandrasekaran teaches the PVA system can be performed exclusively on a single computing device or in combination with a remote computing system (see Col 12, lines 3-21):
In particular, some aspects of the described process (such as the command and control service) may take place on a different processing system (e.g., in a computer in a cloud-hosted data center) ... Similarly, operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.
 
[Col 18, lines 5-27] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which according to some example embodiments is able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein … the machine 700 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 700 may comprise, but not be limited to, a server computer, a client computer

[Col 21, lines 30-38] The instructions 716 may be transmitted or received over the network 780 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 764) and utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Similarly, the instructions 716 may be transmitted or received using a transmission medium via the coupling 772 (e.g., a peer-to-peer coupling) to the devices 770.

[Col 21, line 60 – Col 22, line 5] Those skilled in the art will further appreciate that the personal virtual assistant described herein may be implemented in an embodiment where the personal virtual assistant system includes sensors but the AI and machine learning features are implemented on the server side via internet communication. For instance, the communications may be sent up to the cloud and the adjustment/retraining of the machine learning model might be the emotional intelligence could live on the PVA device (for performance or privacy reasons) or in the cloud (or a combination of both).

However, Chandrasekaran does not explicitly teach all possible variations of data storage/communications to the respective components between local and remote computing devices. Kim, on the other hand, teaches an electronic device (electronic device 1) comprising: an audio module (first command receiver 121); a communication circuitry (communication unit 13); a microphone (first command receiver 121 comprises a microphone); a memory storing programming instructions (storage unit 15); and a processor (controller 14), wherein the programming instructions are executable by the processor to cause the electronic device to:
[Col 7, lines 4-13] The command receiver 12 receives a user's voice command. For example, the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal. The command receiver 12 may further include a second command receiver 122 to receive a user's manipulation command. The second command receiver 122 may be implemented as a remote control signal receiver which receives a remote control signal including key input information corresponding to a user's manipulation command from a remote controller (not shown) or as a manipulation panel which is provided in the electronic device 1 and generates key input information corresponding to a user's manipulation.

[Col 6, lines 30-38] FIG. 2 is a block diagram of an electronic device 1 according to an exemplary embodiment. The electronic device 1 may include an operation performer 11, a command receiver 12, a communication unit 13 (e.g., communicator such as a wired and/or wireless interface, port, card, dongle, etc.), and a controller 14. The electronic device 1 may further include a storage unit 15 (e.g., a storage such as RAM, ROM, flash memory, a hard disk drive, etc.). The operation performer 11 performs operations of the electronic device 1. For example, if the electronic device 1 includes a display apparatus such as a TV, the operation performer 11 may include a signal receiver 111, an image processor 112, and a display unit 113 (e.g., a display such as a liquid crystal display panel, a plasma display panel, an organic light emitting diode display, etc.). However, it is understood that the operation performer 11 corresponds to operations of the product which realizes the electronic device 1, and is not limited to the example shown in FIG. 2

receive a voice command from a user (see Fig. 12, user’s voice command) via the microphone;  
[Col 7, lines 1-4] the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal.

obtain situation information (see Fig. 12, converted text) based on the voice command,
[Col 2, lines 39-42] The controller may transmit the user's voice command to a second server, receive a text into which the voice command has been converted, from the second server, and transmits the received text to the first server.

[Col 9, line 56-Col 10, line 6] FIG. 12 illustrates an example of a speech-to-text (STT) server 4 according to an exemplary embodiment. The electronic device 1 may process the information regarding the user's voice command, i.e., the voice made by the user, into a text. For example, the electronic device 1 transmits the received user's voice command to the STT server 4. The STT server 4 includes an STT converter 41 which converts the user's voice command transmitted by the electronic device 1 into a corresponding text. The STT server 4 transmits the text into which the user's voice command has been converted, to the electronic device 1. The electronic 

transmit the situation information to a second external electronic device (see Fig. 12, the electronic device transmits the text provided by STT server 4 to analysis server 2) via the communication module; and
[Col 10, lines 4-5] The electronic device 1 may transmit the text provided by the STT server 4 to the server [2] 

[Col 7, lines 14-19] The communication unit 13 communicates with the analysis server 2 through the network 3. The communication unit 13 exchanges the user's voice command and the information regarding the analysis result with the analysis server 2 under a control of the controller 14.

receive content (see Fig. 12, control command information; Fig. 13 stored command list 131) corresponding to the situation information from the second external electronic device and   
[Col 10, lines 7-22] At operation S71, the electronic device 1 transmits a user's voice command to the analysis server 2. At operation S72, the electronic device 1 identifies whether the control command information corresponding to the user's voice command has been received from the analysis server 2. If the electronic device 1 has received the control command information corresponding to the user's voice command from the analysis server 2, the electronic device 1 operates according to the control command information transmitted by the analysis server 2 at operation S73

reproduce the received content:   
stores therein the voice command said by a user, and upon a user's request, may display the list of the stored voice commands as a UI 131. As shown in FIG. 13, the list of the stored voice commands displayed as the UI 131 shows voice commands 132 which have been said by a user. The electronic device 1 may store the voice commands per user, and show the stored voice commands 132 per user (reference numeral 133). The electronic device 1 may display the list of the stored voice commands in which the voice commands 132 are sorted in order of how many times the voice commands 132 have been said by a user.

As noted above, Kim teaches a configuration wherein situation information is transmitted from an electronic device to a second external electronic device via an electronic device (see Fig. 12), and content is received by the electronic device from the second external electronic device for reproducing the content. Since all the claimed elements would continue to operate in the same manner, specifically the network, the electronic device, machine 700/PVA module 612, PVA server 630, module 614, and the module server 620 of Chandrasekaran, it would have been obvious to one of ordinary skill in the art before the effective filing date to utilize the compatible server-client architecture taught in Kim to support the transmission and retrieval of (a) device information from the electronic device to the module server 620 and (b) situation information from the first module server 620 to the second PVA server 630 through the client device 610. As taught by Kim, the architecture of the electronic device, the analysis server and the STT server establishes server-client communications
With regard to Claim 7, Chandrasekaran furthermore teaches the electronic device, wherein the device information includes information obtained from at least one electronic appliance (at least one sensing device (e.g., smart home device, a smart appliance)) disposed in a location where the electronic device is located, and wherein the at least one electronic appliance includes at least one of an Internet of Things (IoT) device and a smart device:
[Col 22, lines 40-44] Example 4 is an example as in Example 1 wherein the at least one sensing device further senses the user's context from at least one of the user's location, local time, schedule, and surroundings and the user's prior interactions with the personal virtual assistant system or other devices.
[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions.  

[Col 18, lines 5-40] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which … perform any one or more of the methodologies discussed herein. … The machine 700 may comprise, but not be limited to, a server computer, a client computer, PC, a tablet computer, a laptop computer, a netbook, a personal digital assistant (PDA), an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices ... Further, while only a single machine 700 is illustrated, the term “machine” shall also be taken to include a collection of machines 700 that individually or jointly execute the instructions 716 to perform any one or more of the methodologies discussed herein.

With regard to Claim 8, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 7. Furthermore, Chandrasekaran teaches the electronic device, wherein the device information is configured to include at least one of operation level information (tactile input components 754), time information (local time, time spent exercising 756), illumination information (illumination sensor components 760), arrangement information (position components 762) for an electronic appliance:
[Col 18, lines 5-6] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user.
 
and location information of the user (location sensor components 762):
[Col 19, line 32 – Col 20, line 38] The I/O components 750 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. … In various example embodiments, the I/O components 750 may include output components 752 and input components 754. The output components 752 may include … acoustic components (e.g., speakers), … and improve the user's mood to an optimal state as described herein. The input components 754 may include … audio input components (e.g., a microphone), and the like. In further example embodiments, the I/O components 750 may include biometric components 756, motion components 758, environmental components 760, or position components 762, among a wide array of other components. For example, the biometric components 756 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure bio-signals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), measure exercise-related metrics (e.g., distance moved, speed of movement, or time spent exercising) identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram based identification), and the like. The motion components 758 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth. 
The environmental components 760 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), … or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment. The position components 762 may include location sensor components (e.g., a Global Position System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.

With regard to Claim 9, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 7. Furthermore, Chandrasekaran teaches the electronic device, wherein the first external electronic device includes a server (backend module server 620 is communicatively coupled with signal processor 46 and cognitive/AI services 48) storing the device information (collected/captured on storage device),
[Col 5, lines 46-58] The system described herein personalizes the PVA's responses and adapts the responses over time based on the user's reactions. The system uses multi-modal sensing and a feedback loop to constantly adjust both the content and the delivery of the PVA's responses so that the responses are more empathetic and designed to improve the user's mood. For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place.

[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

[Col 14, lines 18-23] Similarly, the modules 614 of the client device 610 may provide the user state features 36, user preference features data 32, and cultural reference features data 34 to one or more module servers 620 for implementation of the cognitive/AI services 48 and machine learning services 30 described above.

[Col 15, lines 47-52] operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.

[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions. 

and wherein the first external electronic device is configured to update the device information (see Fig. 3, emotional state history) upon detecting a reception of a voice command (interaction data):
[Col 8, lines 48-59] As also depicted in FIG. 3, the PVA system may include a data store 53 that stores a history of past user interactions with the PVA and/or emotional states for use with the heuristics 50. The data store 53 is updated with interaction data, emotional state inferences, etc. over time. As described herein, the historical data is useful for establishing a baseline emotional state for each user. Also, the history of past interactions (defined at the session/conversation level, for each dialog turn, or for each user action (in a non-conversational instantiation)) can be used to normalize feature values in the ML classifier 30, among other things that will be apparent to those skilled in the art.

With regard to Claims 11 and 15, Chandrasekaran teaches (a) an electronic device (machine 700) comprising: an audio module (interaction skill); a communication circuitry (communication components 764); a microphone (audio input components (e.g., a microphone) 754); a memory storing programming instructions (memory 730); and a processor (processor 710), and (b) a method for an electronic device (instructions 716), wherein the programming instructions are executable by the processor to cause the electronic device to:
[Col 18, lines 11-21] FIG. 7 shows a diagrammatic representation of the machine 700 in the example form of a computer system, within which instructions 716 … the machine 700 operates as a standalone device
[Col 18, lines 41-52] The machine 700 may include processors 710, memory/storage 730, and I/O components 750, which may be configured to communicate with each other such as via a bus 702. In an example embodiment, the processors 710 … may include, for example, a processor 712 and a processor 714 that may execute the instructions 716. 
[Col 11, lines 35-37] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill,

[Col 20, lines 39-46] Communication may be implemented using a wide variety of technologies. The I/O components 750 may include communication components 764 operable to couple the machine 700 to a network 780 or devices 770 via a coupling 782 and a coupling 772, respectively. For example, the communication components 764 may include a network interface component or other suitable device to interface with the network 780. 

receive a voice command from a user (see Fig. 3, current conversational state information 20) via the microphone;
[Col 11, lines 35-40] PVAs may use skills or similar functions to complete tasks and perform certain actions. A brief example of a skill might include a restaurant interaction skill, allowing a user to issue a command, such as “Reserve a table at Mario's Italian Restaurant,” or “Order a Coffee from Fourth Coffee Company.” 

[Col 5, lines 53-58] For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured 

requesting, in response to receiving the voice command, content (module 614 requests specified inferences 24 from backend PVA server 630, see Fig. 4, contextualized text, contextualized tone is derived from inferences and sent to TTS subsystem 28) from the PVA system (PVA server 630) wherein the content corresponds to situation information (PVA system obtains inferred emotional state from backend module server 620, which is communicatively coupled with signal processor 46 and cognitive/AI services 48):
[Col 5, lines 46-58] The system described herein personalizes the PVA's responses and adapts the responses over time based on the user's reactions. The system uses multi-modal sensing and a feedback loop to constantly adjust both the content and the delivery of the PVA's responses so that the responses are more empathetic and designed to improve the user's mood. For example, audio and video of the user's response is captured and processed with artificial intelligence (AI) services that infer the user's emotional state (happy/sad, energetic/lethargic, etc.). Additionally, input from wearables and the surrounding environment is also captured and processed to further assess the conditions under which the interaction with the user is taking place.

[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice

[Col 6, line 64 - Col 7, line 6] FIG. 2B illustrates a PVA system as described herein that maintains a conversational state 20 and infers the emotional state of a conversation 24 and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. TTS subsystem 28 may be any of a number of available TTS subsystems including the Bing Speech API available through Microsoft Azure, for example.

[Col 9, lines 58-66] FIG. 4 illustrates how, given a conversational state 20 and inferred emotional state 24, response selector 26 may vary the text response sent to the TTS subsystem 28 whereby responses may be varied based on rules or lookups against pre-selected responses or encoded as responses generated by a learned model such as a deep neural network to generate conversationally and emotionally appropriate responses having the appropriately contextualized text and tone according to an example embodiment

and the situation information is identified by the PVA system based on device information (see PVA system of Fig. 3, which inputs device information into ML classifier 30, the device information comprising emotional state history 53 and user state information 36 captured using a variety of sensing devices) and the voice command:
[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

Similarly, the modules 614 of the client device 610 may provide the user state features 36, user preference features data 32, and cultural reference features data 34 to one or more module servers 620 for implementation of the cognitive/AI services 48 and machine learning services 30 described above.

[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions. 

after receiving the content (see Fig. 4, contextualized text, contextualized tone), and outputting the received content by controlling, by a processor, the audio module:
[Col 7, lines 35-51] As shown in FIG. 3, the inputs to the ML classifier 30 may include features determined from user preferences data 32 provided by the user during system setup, cultural references data 34 collected by the system during use, and user state information 36 captured using a variety of sensing devices. For example, the user state information 36 may include outputs from a microphone 38, a camera 40, wearable sensing devices 42, environmental sensors 44, and the like. The user state information 36 is provided to a signal processor 46 including cognitive/AI services 48 for generating the user features. For example, the cognitive/AI services 48 may include one or more of the cognitive services available on the Azure service platform provided by Microsoft Corporation (azure.microsoft.com/en-us/services). For example, emotion API (azure.microsoft.com/en-us/services/cognitive-services/emotion) may be used in sample embodiments herein.

[Col 7, lines 6-9] In the embodiment of FIG. 2B, observations of the user and the conversation state produce an inferred emotional state which is, in turn, used to select the emotionally appropriate conversational responses and tone of voice
 
[Col 6, line 64 - Col 7, line 6] FIG. 2B illustrates a PVA system as described herein that maintains a conversational state 20 and infers the emotional state of a conversation 24 and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. TTS subsystem 28 may be any of a number of available TTS subsystems including the Bing Speech API available through Microsoft Azure, for example.

[Col 6, line 64 - Col 7, line 3] FIG. 2B illustrates a PVA system … and uses the inferred emotional state to generate appropriate responses using response selector 26 that are sent to a text-to-speech (TTS) subsystem 28 that provides a contextualized response and tone in an example embodiment. 

Chandrasekaran does not explicitly state that the content is requested by the electronic device and transmitted from the external device 630, thus causing the external electronic device 630 to determine content based on the device information and the voice command. However, Chandrasekaran teaches that the PVA server 630 implements instructions components of machine 700/PVA module 612, and processes requests from module 614. Similarly, components of the PVA system are taught by Chandrasekaran to determine the device information based on the voice command either directly or indirectly from communications with module 614/backend module 620.  Chandrasekaran also teaches that PVA 612 and module 614 can be processed on a single computing device 610 (Col 13, lines 46-48) with backend servers 620 (see Col 14, lines 13-14) and 630 (see Col 14, lines 52-57). Moreover, Chandrasekaran teaches a request for a specified inference from the inference store 640 coupled to backend PVA server 630 (see Col 
[Col 15, lines 42-52] In particular, some aspects of the described process (such as the command and control service) may take place on a different processing system (e.g., in a computer in a cloud-hosted data center) ... Similarly, operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.
 
[Col 18, lines 5-27] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which according to some example embodiments is able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein … the machine 700 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 700 may comprise, but not be limited to, a server computer, a client computer

[Col 21, lines 30-38] The instructions 716 may be transmitted or received over the network 780 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 764) and utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Similarly, the instructions 716 may be transmitted or received using a transmission medium via the coupling 772 (e.g., a peer-to-peer coupling) to the devices 770.

[Col 21, line 60 – Col 22, line 5] Those skilled in the art will further appreciate that the personal virtual assistant described herein may be implemented in an embodiment where the personal virtual assistant system includes sensors but the AI and machine learning features are implemented on the server side via internet communication. For instance, the communications may be sent up to the cloud and the adjustment/retraining of the machine learning model might be done offline by another computer system or in a batch process. On the other hand, the emotional intelligence could live on the PVA device (for performance or privacy reasons) or in the cloud (or a combination of both).

However, Chandrasekaran does not explicitly teach all possible variations of data storage/communications to the respective components between local and remote computing devices. Kim, on the other hand, teaches an electronic device (electronic device 1) comprising: an audio module (first command receiver 121); a communication circuitry (communication unit 13); a microphone (first command receiver 121 comprises a microphone); a memory storing programming instructions (storage unit 15); and a processor (controller 14), wherein the programming instructions are executable by the processor to cause the electronic device to:
[Col 7, lines 4-13] The command receiver 12 receives a user's voice command. For example, the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal. The command receiver 12 may further include a second command receiver 122 to receive a user's manipulation command. The second command receiver 122 may be implemented as a remote control signal receiver which receives a remote control signal including key input information corresponding to a user's manipulation command from a remote controller (not shown) or as a manipulation panel which is 1 and generates key input information corresponding to a user's manipulation.

[Col 6, lines 30-38] FIG. 2 is a block diagram of an electronic device 1 according to an exemplary embodiment. The electronic device 1 may include an operation performer 11, a command receiver 12, a communication unit 13 (e.g., communicator such as a wired and/or wireless interface, port, card, dongle, etc.), and a controller 14. The electronic device 1 may further include a storage unit 15 (e.g., a storage such as RAM, ROM, flash memory, a hard disk drive, etc.). The operation performer 11 performs operations of the electronic device 1. For example, if the electronic device 1 includes a display apparatus such as a TV, the operation performer 11 may include a signal receiver 111, an image processor 112, and a display unit 113 (e.g., a display such as a liquid crystal display panel, a plasma display panel, an organic light emitting diode display, etc.). However, it is understood that the operation performer 11 corresponds to operations of the product which realizes the electronic device 1, and is not limited to the example shown in FIG. 2

receive a voice command from a user (see Fig. 1, user’s voice command) via the microphone;  
[Col 7, lines 1-4] the command receiver 12 may include a first command receiver 121 to receive a user's voice command. The first command receiver 121 may include a microphone to convert a received user's voice command into a voice signal.

requesting, in response to receiving the voice command, content (see Fig. 1, control command information) from an external electronic device (see Fig. 1, analysis server 2) wherein the content corresponds to situation information (see Fig. 1, filtered user’s voice command based on presence in voice recognition command list) and the situation information is identified by the first electronic device based on device information (voice recognition 
[Col 5, line 54 – Col 6, line 18] The analysis server 2 is connected to the network 3, analyzes a service regarding a user's voice command, i.e., a user's voice command for the electronic device 1 as its client, and transmits the analysis result to the electronic device 1. The analysis server 2 according to the present exemplary embodiment transmits, to the electronic device 1, a voice recognition command list including a voice recognition command that is among user's voice commands which have successfully been recognized a predetermined number of times or more and corresponding control command information. The control command information is used to control the electronic device 1 to operate as desired by a user under the voice recognition command … Conversely, if the user's voice command does not correspond to a voice recognition command included in the voice recognition command list, the electronic device 1 transmits the user's voice command to the analysis server 2. The analysis server 2 analyzes the user's voice command transmitted by the electronic device 1 and transmits corresponding control command information to the electronic device 1. The electronic device 1 operates according to the control command information transmitted by the analysis server 2.
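
For illustration only, the client-side flow described in the passage quoted above might be sketched as follows. All function and variable names are hypothetical and do not appear in Kim, and the sketch assumes the voice command has already been converted to text.

```python
# Illustrative sketch only; every name here is hypothetical, not taken from Kim.
from typing import Callable, Dict


def resolve_command(voice_command: str,
                    command_list: Dict[str, str],
                    analysis_server: Callable[[str], str]) -> str:
    """Return control command information for a voice command."""
    # A command present in the local voice recognition command list
    # is resolved on the device without contacting the server.
    if voice_command in command_list:
        return command_list[voice_command]
    # Otherwise the command is transmitted to the analysis server,
    # which returns the corresponding control command information.
    return analysis_server(voice_command)


def fake_analysis_server(voice_command: str) -> str:
    # Stand-in for the remote analysis step (an assumption for this sketch).
    return "analyzed:" + voice_command


local_list = {"volume up": "CMD_VOLUME_UP"}
```

In this sketch, "volume up" is answered from the local list, while any other command is forwarded to the stand-in server.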

after receiving the content (see Fig. 1, control command information is received by the electronic device), outputting the received content by controlling a processor (see Col 6, lines 14-16, the electronic device 1 operates according to the control command information transmitted by the analysis server 2).
As noted above, Kim teaches a configuration wherein situation information is transmitted from an electronic device to an external electronic device via an electronic device (see Fig. 1), and content is received through the client device 610. As taught by Kim, the architecture of the electronic device and the analysis server establishes server-client communications which are initialized based on a voice command and respond with content. As an artisan of ordinary skill in the art would appreciate, this combination of the devices of Chandrasekaran configured in the manner set forth in Kim would be no more "than the predictable use of prior-art elements according to their established functions" (See KSR).
With regard to Claims 12 and 16, the combination of Chandrasekaran and Kim teaches all the limitations of Claims 11 and 15, respectively. Furthermore, Chandrasekaran teaches the electronic device and the method, wherein the device information includes information obtained from at least one electronic appliance (at least one sensing device (e.g. smart home 
[Col 22, lines 40-44] Example 4 is an example as in Example 1 wherein the at least one sensing device further senses the user's context from at least one of the user's location, local time, schedule, and surroundings and the user's prior interactions with the personal virtual assistant system or other devices.
[Col 11, lines 9-13] It will be appreciated that the PVA in particular embodiments may include a variety of voice, text, or other communication interfaces, and may operate to collect a variety of location and context information of a user for personal customization of information and actions.  

[Col 18, lines 5-40] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which … perform any one or more of the methodologies discussed herein. … The machine 700 may comprise, but not be limited to, a server computer, a client computer, PC, a tablet computer, a laptop computer, a netbook, a personal digital assistant (PDA), an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices ... Further, while only a single machine 700 is illustrated, the term “machine” shall also be taken to include a collection of machines 700 that individually or jointly execute the instructions 716 to perform any one or more of the methodologies discussed herein.

With regard to Claims 13 and 17, the combination of Chandrasekaran and Kim teaches all the limitations of Claims 11 and 16, respectively. Furthermore, Chandrasekaran teaches the electronic device and the method, wherein the device information is configured to include at least one of operation level information (tactile input components 754), time information (biometric components 756), illumination information (illumination sensor components 760), arrangement information (position components 762) for an electronic appliance:
[Col 18, lines 5-6] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user.
 
and location information of the user (location sensor components 762):
[Col 19, line 32 – Col 20, line 38] The I/O components 750 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. … In various example embodiments, the I/O components 750 may include output components 752 and input components 754. The output components 752 may include … acoustic components (e.g., speakers), … and improve the user's mood to an optimal state as described herein. The input components 754 may include … audio input components (e.g., a microphone), and the like. In further example embodiments, the I/O components 750 may include biometric components 756, motion components 758, environmental components 760, or position components 762, among a wide array of other components. For example, the biometric components 756 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure bio-signals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), measure exercise-related metrics (e.g., distance moved, speed of movement, or time spent exercising) identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram based identification), and the like. The motion components 758 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth. 
The environmental components 760 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), … or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment. The position components 762 may include location sensor components (e.g., a Global Position System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.

With regard to Claim 14, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 11. Furthermore, Chandrasekaran teaches an electronic device (client device 610), wherein the external electronic device (PVA backend server 630) includes a server storing pieces of content (inferences in inference store 640), the server configured to determine a content theme (See Fig. 4, contextualized text and contextualized tone are determined from inferences) from among a plurality of stored content themes (PVA server 630 stores the multiple inferences in inference store 640) corresponding to the situation information (specified inference, user consent data) received from the electronic device:
[Col 14, lines 42-57] According to some implementations, the PVA server 630 determines, based on interaction of the user with the PVA 612 at one or more client devices 610 associated with an account of the user, multiple inferences about the user (e.g., about the user's mood). The PVA server 630 stores the multiple inferences in the inference store 640. The PVA server 630 stores, in the user consent data store 650, user consent data representing whether the user provided consent for the module(s) 614 to access at least a portion of the inferences in the inference store 640. The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.
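
For illustration only, the consent-gated request for a specified inference described in the passage quoted above might be sketched as follows. The class, method, and key names are hypothetical and do not appear in Chandrasekaran; the sketch assumes in-memory stores in place of the inference store 640 and the user consent data store 650.

```python
# Illustrative sketch only; class and method names are hypothetical.
from typing import Dict, Optional, Set, Tuple


class PvaServerSketch:
    def __init__(self) -> None:
        # (user, inference name) -> inferred value, standing in for the inference store.
        self.inference_store: Dict[Tuple[str, str], str] = {}
        # (user, module, inference name) triples for which consent was given.
        self.consent_store: Set[Tuple[str, str, str]] = set()

    def store_inference(self, user: str, name: str, value: str) -> None:
        self.inference_store[(user, name)] = value

    def grant_consent(self, user: str, module: str, name: str) -> None:
        self.consent_store.add((user, module, name))

    def request_inference(self, user: str, module: str, name: str) -> Optional[str]:
        # The specified inference is provided only after consent is verified.
        if (user, module, name) not in self.consent_store:
            return None
        return self.inference_store.get((user, name))
```

A request made before consent is recorded returns nothing in this sketch; the same request after consent is recorded returns the stored inference.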

determine a content (See Fig. 4, contextualized text-to-speech output) corresponding to the determined content theme, and transmit at least the determined content to the electronic device:
[Col 6, lines 47-58] In a specific example, the communications that the PVA provides could be modified based on the inferred mood of the user. The PVA would tag the communications with various characteristics to build an initial model. For example, in the case of a joke, such characteristics may include: type (e.g., knock-knock, question-answer, animals, personal anecdotes, special dates, etc.) and topic (e.g., current events, people, sports). Then, after the communication is provided, the user's reaction (e.g., smile, laugh, grimace, groan) is captured using sensors and a feedback mechanism is used to personalize future communications for that user.
[Col 14, lines 51-57]  PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user.

Chandrasekaran does not teach the server 630 storing device information. However, Chandrasekaran teaches the PVA system 612 storing device information (data store 53):
[Col 8, lines 47-59] As also depicted in FIG. 3, the PVA system may include a data store 53 that stores a history of past user interactions with the PVA and/or emotional states for use with the heuristics 50. The data store 53 is updated with interaction data, emotional state inferences, etc. over time. As described herein, the historical data is useful for establishing a baseline emotional state for each user. Also, the history of past interactions (defined at the session/conversation level, for each dialog turn, or for each user action (in a non-conversational instantiation)) can be used to normalize feature values in the ML classifier 30, among other things that will be apparent to those skilled in the art.

Moreover, Chandrasekaran teaches that the PVA system can be implemented exclusively on a single computing device or in combination with a remote computing system (see Col 12, lines 3-21):
[Col 15, lines 42-52] In particular, some aspects of the described process (such as the command and control service) may take place on a different processing system (e.g., in a computer in a cloud-hosted data center) ... Similarly, operational data may be included within respective components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.

[Col 18, lines 5-27] FIG. 7 is a block diagram illustrating components of a machine 700 which may be a PVA 612, for example, which according to some example embodiments is able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein … the machine 700 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 700 may comprise, but not be limited to, a server computer, a client computer,

[Col 21, lines 30-38] The instructions 716 may be transmitted or received over the network 780 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 764) and utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Similarly, the instructions 716 may be transmitted or received using a transmission medium via the coupling 772 (e.g., a peer-to-peer coupling) to the devices 770.

[Col 21, line 60 – Col 22, line 5] Those skilled in the art will further appreciate that the personal virtual assistant described herein may be implemented in an embodiment where … the emotional intelligence could live on the PVA device (for performance or privacy reasons) or in the cloud (or a combination of both).

Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Chandrasekaran, in view of Kim, and further in view of Zomet et al. (US Patent Pub 2016/0260135 A1).
With regard to Claim 5, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 1. Furthermore, Chandrasekaran teaches an electronic device (client device 610), wherein the second external electronic device (PVA backend server 630) includes a server storing content (inferences in inference store 640), the server configured to determine a content theme (See Fig. 4, contextualized text and contextualized tone are determined from inferences; PVA server 630 stores the multiple inferences in inference store 640) corresponding to the situation information (specified inference, user consent data) received from the first external electronic device:
[Col 14, lines 42-57] According to some implementations, the PVA server 630 determines, based on interaction of the user with the PVA 612 at one or more client devices 610 associated with an account of the user, multiple inferences about the user (e.g., about the user's mood). The PVA server 630 stores the multiple inferences in the inference store 640. The PVA server 630 stores, in the user consent data store 650, user consent data representing whether the user provided consent for the module(s) 614 to access at least a portion of the inferences in the inference store 640. The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

determine a content (See Fig. 4, contextualized text-to-speech output) corresponding to the determined content theme, and transmit at least the determined content to the electronic device:
[Col 6, lines 47-58] In a specific example, the communications that the PVA provides could be modified based on the inferred mood of the user. The PVA would tag the communications with various characteristics to build an initial model. For example, in the case of a joke, such characteristics may include: type (e.g., knock-knock, question-answer, animals, personal anecdotes, special dates, etc.) and topic (e.g., current events, people, sports). Then, after the communication is provided, the user's reaction (e.g., smile, laugh, grimace, groan) is captured using sensors and a feedback mechanism is used to personalize future communications for that user.
[Col 14, lines 51-57]  PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user. 

Chandrasekaran in view of Kim does not teach the server storing and transmitting media content. However, Zomet teaches an external device 10 storing media content from a media server 67: 
[0045] In some embodiments, the smart-device environment 30 may be in communication with one or more servers 67 that supply content to a device 10 (e.g., portable electronic device 66, TV, computer) … The devices 10 that receive the content from the servers 67 may select at least a piece of the content to display based on people and/or object data obtained via sensors or received from another device 10 within the environment 30, a score assigned by the server to each piece of content, or both.

[0101] Having discussed the smart-device environment 30, the discussion now turns to providing privacy-aware personalized content via smart devices 10. FIG. 5 is a schematic drawing of a system 130 that provides privacy-aware content 132 to an occupant via smart devices in the smart home environment 30, in accordance with an embodiment. Generally, the system 130 may enable providing a set of content 132 via the one or more servers 67 to one or more client devices 134. The one or more servers 67 may provide the set of content 132 via a network, such as the Internet 62, to the client devices 134. The set of content 132 may include numerous different types of content 132. For example, the types of content 132 in the set of content 132 may include advertisements, movie/television show recommendations, shopping recommendations, application (e.g., Google Now) alerts, and/or modified search results rankings, among others.

Kim teaches a configuration wherein data is transmitted and received between a first external electronic device and a second external electronic device via an electronic device (see Fig. 12), and wherein content is received at the electronic device from the second external electronic device for reproducing the content. Since all the claimed elements would continue to operate in the same manner, specifically the network, the electronic device, PVA module 612, PVA server 630, module 614, and the module server 620 of Chandrasekaran, it would have been obvious to one of ordinary skill in the art before the effective filing date to utilize the compatible server-client architecture taught in Kim to support the transmission and retrieval of (a) device information from the electronic device to the module server 620 and (b) situation information from the first module server 620 to the second PVA server 630 through the client device 610. As taught by Kim, the architecture of the electronic device, the analysis server, and the STT server establishes server-client communications which are initialized based on a voice command and responds with content. As an artisan of ordinary skill in the art would appreciate, this combination of the devices of Chandrasekaran configured in the 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Chandrasekaran and Kim with the components of Zomet that add the capability to store, determine, and transmit media content to the device of Chandrasekaran. As would be appreciated by one of ordinary skill in the art before the effective filing date of the claimed invention, the capability to provide personalized and situation-based media, in addition to the personalized and situation-based jokes taught by Chandrasekaran (see Col 6, lines 47-58), would “ensure a desirable amount of diversity between the content” (see [0139] of Zomet).
With regard to Claim 10, the combination of Chandrasekaran and Kim teaches all the limitations of Claim 6. Furthermore, Chandrasekaran teaches an electronic device (machine 700/PVA module 612), wherein the second external electronic device (PVA backend server 630 configured to transmit/receive situation-information from machine 700, as taught by the architecture of the electronic device and analysis server of Kim) includes a server storing content (inferences in inference store 640), the server configured to determine a content theme (See Fig. 4, contextualized text and contextualized tone are determined from inferences) from among a … (PVA server 630 stores the multiple inferences in inference store 640) corresponding to the situation information (specified inference, user consent data) received from the first external electronic device:
[Col 14, lines 42-57] According to some implementations, the PVA server 630 determines, based on interaction of the user with the PVA 612 at one or more client devices 610 associated with an account of the user, multiple inferences about the user (e.g., about the user's mood). The PVA server 630 stores the multiple inferences in the inference store 640. The PVA server 630 stores, in the user consent data store 650, user consent data representing whether the user provided consent for the module(s) 614 to access at least a portion of the inferences in the inference store 640. The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

determine a content (See Fig. 4, contextualized text-to-speech output) corresponding to the determined content theme, and transmit at least the determined content to the electronic device:
[Col 6, lines 47-58] In a specific example, the communications that the PVA provides could be modified based on the inferred mood of the user. The PVA would tag the communications with various characteristics to build an initial model. For example, in the case of a joke, such characteristics may include: type (e.g., knock-knock, question-answer, animals, personal anecdotes, special dates, etc.) and topic (e.g., current events, people, sports). Then, after the communication is provided, the user's reaction (e.g., smile, laugh, grimace, groan) is captured using sensors and a feedback mechanism is used to personalize future communications for that user.
[Col 14, lines 51-57] The PVA server 630 receives, from the module 614, a request for a specified inference from the inference store 640. The PVA server 630 verifies the user consent data associated with the specified inference and the module 614. The PVA server 630 provides the specified inference to the module 614 in response to verifying the user consent data.

[Col 5, lines 31-39] The following disclosure provides an overview of techniques and configurations to enable a PVA such as Cortana™ available from Microsoft Corporation to take inputs such as a user's tone of voice, language used, facial expression, recent interactions with other devices, context on the user's location, local time, schedule, surroundings, and the like and to process those inputs to provide outputs that have augmented speed, tone, and language for communication with the user. 

Chandrasekaran does not teach the server storing and transmitting media content. However, Zomet teaches an external device 10 storing media content from a media server 67: 
[0045] In some embodiments, the smart-device environment 30 may be in communication with one or more servers 67 that supply content to a device 10 (e.g., portable electronic device 66, TV, computer) … The devices 10 that receive the content from the servers 67 may select at least a piece of the content to display based on people and/or object data obtained via sensors or received from another device 10 within the environment 30, a score assigned by the server to each piece of content, or both.

[0101] Having discussed the smart-device environment 30, the discussion now turns to providing privacy-aware personalized content via smart devices 10. FIG. 5 is a schematic drawing of a system 130 that provides privacy-aware content 132 to an occupant via smart devices in the smart home environment 30, in accordance with an embodiment. Generally, the system 130 may enable providing a set of content 132 via the one or more servers 67 to one or more client devices 134. The one or more servers 67 may provide the set of content 132 via a network, such as the Internet 62, to the client devices 134. The set of content 132 may include numerous different types of content 132. For example, the types of content 132 in the set of content 132 may include advertisements, movie/television show recommendations, shopping recommendations, application (e.g., Google Now) alerts, and/or modified search results rankings, among others.

Kim teaches a configuration wherein data is transmitted and received between a first external electronic device and a second external electronic device via an electronic device (see Fig. 12), and wherein content is received at the electronic device from the second external electronic device for reproducing the content. Since all the claimed elements would continue to operate in the same manner, specifically the network, the electronic device, machine 700/PVA module 612, PVA server 630, module 614, and the module server 620 of Chandrasekaran, it would have been obvious to one of ordinary skill in the art before the effective filing date to utilize the compatible server-client architecture taught in Kim to support the transmission and retrieval of (a) device information from the electronic device to the module server 620 and (b) situation information from the first module server 620 to the second PVA server 630 through the client device 610. As taught by Kim, the architecture of the electronic device, the analysis server, and the STT server establishes server-client communications which are initialized based on a voice command and responds with content. As an artisan of ordinary skill in the art would appreciate, this combination of the devices of Chandrasekaran 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Chandrasekaran and Kim with the components of Zomet that add the capability to store, determine, and transmit media content to the device of Chandrasekaran. As would be appreciated by one of ordinary skill in the art before the effective filing date of the claimed invention, the capability to provide personalized and situation-based media, in addition to the personalized and situation-based jokes taught by Chandrasekaran (see Col 6, lines 47-58), would “ensure a desirable amount of diversity between the content” (see [0139] of Zomet).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Dean Webb whose telephone number is (408) 918-7531.  The examiner can normally be reached on Monday - Thursday and alternate Fridays, 7:30-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/D.M.W./ Examiner, Art Unit 2658

/Paras D Shah/ Primary Examiner, Art Unit 2659
03/02/2021