Notice of Pre-AIA  or AIA  Status
The present application is being examined under the pre-AIA  first to invent provisions. 
Response to Arguments
Applicant’s arguments with respect to claims 1-21 have been considered but are moot because the arguments do not apply to any of the references being used in the current rejection.
Applicant’s arguments, see pg. 8, filed 7/2/21, with respect to the rejection(s) of claim(s) 1-21 under 35 U.S.C. 103 have been fully considered and are persuasive.
In re pgs. 13-14 applicant argues the claim essentially requires two distinct speech recognition regimes.
In response, there is no mention of these limitations in the claims and the specification is not the measure of the invention.  Therefore, limitations contained therein cannot be read into the claims for the purpose of avoiding the prior art; see In re Sprock, 55 CCPA 743, 386 F.2d 924, 155 USPQ 687 (1968).  Although claims are read in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

In re pgs. 13-14, applicant argues: “To the contrary, the claims are quite clear that in the low power consumption mode, speech commands are processed locally, and in the high power consumption mode, speech recognition is performed remotely.”
In response, there is no mention of these limitations in the claims and the specification is not the measure of the invention.  Therefore, limitations contained therein cannot be read into the claims for the purpose of avoiding the prior art; see In re Sprock, 55 CCPA 743, 386 F.2d 924, 155 USPQ 687 (1968).  Although claims are read in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

In re pg. 14, applicant argues Santori clearly does not teach that speech recognition is performed remotely, and
In re pg. 15, applicant argues Tran also does not teach or suggest communicating “information representing the speech to a remote server operating speech recognition software” and 
In re pg. 16 applicant argues Zurek does not remedy these deficiencies.
In re pg. 16, applicant argues Hayes-Roth proposes a client-server model in which all input speech processing is performed at the server.
In response, the applicant cannot show nonobviousness by attacking the references individually where, as here, the rejections are based on a combination of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981).
Regarding what is claimed, 
Hayes discloses communicate information representing the speech to a remote server (processing speech locally or remote, “The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, “The present invention provides a human-like customizable expert agent capable of having personalized conversational interactions with human users. The customizable expert agent combines natural language conversation, animated gestures, general expertise, and subject expertise to create enjoyable and effective online experiences in a variety of contexts. Each customizable expert agent is preferably a computer-controlled improvisational character having distinct personality, moods, and other life-like qualities. The customizable expert agent can act as a sales agent or a course coach and may proactively initiate a conversation with the user at any time. The present invention further provides an integrated software system and program products, including an application shell and an authoring tool, for desktop online application authoring and enterprise hosted web deployment. The customizable expert agent is particularly useful in providing computer-based training and coaching, and capable of assisting invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies.”, abstract;
“The customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device. It has application-independent expertise and can be given application-specific expertise. It is capable of interacting with human users/learners/customers/trainees utilizing both types of expertise, much like a human expert agent”, 0004; 
“"intelligent" or "smart" digital characters created and developed to interact on various levels with human users for a variety of purposes would be particularly beneficial to electronic sites, such as web sites on the Internet and various commercial electronic media including on-location electronic kiosks and automatic teller machines (ATM), and consumer electronic devices, such as phones and personal digital assistant (PDA)”, 0006)

Claim Interpretation
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function. Such claim limitation(s) is/are: a network connection configured to; claims 6, 11: processor configured to; claim 18: speech output device configured to, memory configured to, port configured to; claim 20: port configured to.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Hayes discloses:
20, 21. A user interface device comprising: a microphone configured to receive a natural language (“Substituting alternative platforms, channels, or media to enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV, robots, etc.”, 0755; “customizable expert agent combines natural language conversation, animated gestures, general expertise, and subject expertise to create enjoyable and effective online experiences in a variety of contexts”, abstract; “The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities.”, 0041) input sufficient to determine at least one topic of interest to a user (“may spontaneously initiate a specific dialogue topic at any time. For example, at various times in the interaction”, 0042; “The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; 0184; 0505; “The dialogue may be mixed-initiative, i.e., either the customer or the expert agent may spontaneously initiate a specific dialogue topic at any time. For example, at various times in the interaction, the agent might initiate a topic by offering a comment or question”, 0042; “enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV”, 0755, 0006);
a local processor configured to determine start up and shut down commands in the natural language input, wherein the local processor enters a power saving power consumption mode in a shut down state (see below);
a communication port configured to communicate with a communication network (“phone, PDA, voice, vision, TV”, 0755, 0006);
at least one automated processor configured to:
control communication of a representation of the natural language input through the communication port to a remote automated data processing system through the communication network (“phone, PDA, voice, vision, TV”, 0755, 0006);
communicate through the communication network with a software module of the remote (“The software controlling the agent's behavior and dialogue may reside at the client or server. The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites.”, 0025; “operates over a computer network, e.g., a World Wide Web (web), utilizing client-server technologies”, 0015, 0025, 0145) automated data processing system having a modular software infrastructure according to an application programming interface (“The Run-time Environment includes guidance for launching, communicating with and tracking content in a web-based environment. This includes a common Launch and standard API (e.g., JDBC. ODBC) specification, and the AICC Data Model for web-based data elements”, 0152);
receive a response from the remote automated data processing system (“The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract; “customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device”, 0004) the conversational topic of interest (“when shopping online, customers themselves search for products and related information using an on-site directory or a general-purpose search engine, search for answers and help using on-site frequently asked questions (FAQ) listings or help page”, 0007), said response being responsive to prior inputs from the user and a gender of the user (0046, 0048; “expert agent remembers what purchases the customer has made on a previous visit”, 0048; “profile data”, 0147, 0163, 0170-0176; “user profile databases”, 0179, 0181; “produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189; “The Master Database contains User IDs and passwords as well as application independent information that STOW has gathered about learners, such as learning style and preferences, thus a repository for general user (Learner) profiles.”, 0089; “STOW User Profile Database 211”, 0163; “This will produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189); and
an audio output port configured to present a natural language response to the user, dependent on the response received from the remote automated data processing system, conveying information about the conversational topic of interest to the user (“The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input /output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; “The user interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach or to enable certain types of learning objects in a particular application”, 0121, 0500, 0755; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022).
Hayes fails to particularly call for a local processor configured to determine start up and shut down commands in the natural language input, wherein the local processor enters a power saving power consumption mode in a shut down state.
Santori teaches a local processor configured to determine start up and shut down commands in the natural language input, wherein the local processor enters a power saving power consumption mode in a shut down state (e.g., “the user may "wake-up" the CPU 3 (e.g., through a button press or a voice command) prior to making a request for any application”, 0047; “he or she may issue a voice-activated command to activate the Application which may be received by microphone 29. A module in the CPU 3 may include computer-executable instructions for converting the speech into text. The text may then be communicated (e.g., in ASCII code) to the client-side API or applet in the ND 53 for activating the requested Application”, 0057); application programming interface (“utilizing an application programming interface (API) to establish a connection”, 0011).
It would have been obvious to combine the references at the time of filing because they are in the same field of endeavor.  It is well known to use interrupts to wake up and/or shut down a processor, and the examiner takes official notice that it is well known to save power by causing processors, such as those in mobile phones, to shut down.  Adding a feature of saving energy by having a processor sleep can make batteries last longer.

Claim Rejections - 35 USC § 103

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-5, 8-13 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hayes-Roth (US 2003/0028498) in view of Zurek (US 2010/0169091) and Santori (US 2010/0306309).
Hayes-Roth (US 2003/0028498) discloses:
1. An electronic device comprising: 
at least one microphone (“customizable expert agent combines natural language conversation, animated gestures, general expertise, and subject expertise to create enjoyable and effective online experiences in a variety of contexts”, abstract; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022; “interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach”, 0121; “voice software permitting the Learner to input natural language dialogue”, 0505) for receiving speech from a user, sufficient to determine a content of the speech including at least one conversational topic (e.g., topics read on any category/product a user/student searches on; “The dialogue may be mixed-initiative, i.e., either the customer or the expert agent may spontaneously initiate a specific dialogue topic at any time. For example, at various times in the interaction, the agent might initiate a topic by offering a comment or question such as: "May I help you?"”, 0042) of interest to the user, a gender of the speaker and a mood of the user (“The customizable expert agents of the present invention, each with its distinctive personality, mood, manner of interaction, and other life-like qualities, such as normal variability, idiosyncrasies, and irregularities in behavior, also can offer humanized interactions”, 0050; “For example, when shopping online, customers themselves search for products and related information using an on-site directory or a general-purpose search engine, search for answers and help using on-site frequently asked questions (FAQ) listings or help page”, 0007; “Integration with a client system's own search engine, product database, or frequently-asked questions (FAQ) resource, along with site-specific information enabling the agent to translate a learner/customer's natural language request into an effective query.”, 0017; “The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; 0184; 0505; 0042);
at least one processor;
a system memory coupled with the processor (“STOW CAT can install and perform acceptably on a Pentium III class system with 600 MHz processor, 128 MB RAM, at least 100 MB free hard drive space, TCP/IP-capable network connection and Windows 2000/XP operating system”, 0143; “Each customizable expert agent is preferably a computer-controlled improvisational character having distinct personality, moods, and other life-like qualities”, abstract; client server system using AI);
the at least one processor being configured to start up and shut down in response to spoken commands by the user (see below);
a network connection (“The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract; “customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device”, 0004) “configured to” communicate (as opposed to actually communicating) information representing the speech to a remote server (“The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract, 0004; 0006) operating speech recognition software, and
communicate with a remote server having a modular software infrastructure according to an application programming interface (“The software controlling the agent's behavior and dialogue may reside at the client or server. The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites.”, 0025; “The Run-time Environment includes guidance for launching, communicating with and tracking content in a web-based environment. This includes a common Launch and standard API (e.g., JDBC. ODBC) specification, and the AICC Data Model for web-based data elements”, 0152) and
receive a response from the remote server (“Substituting alternative platforms, channels, or media to enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV, robots, etc.”, 0755); and
at least one speaker under control of the electronic processor “configured to” verbally present said response from the remote server (not further defined, servers are on a network and are remote; “The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites”, 0025), conveying information responsive to the conversational topic of interest to the user, the mood of the user, and gender of the user (“The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; “The user interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach or to enable certain types of learning objects in a particular application”, 0121, 0500, 0755; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022).
However, Hayes fails to particularly call for speech recognition software.
Zurek teaches speech recognition and determining gender (“a media system, which is operable via a user's voice command, may analyze the user's speech to determine personal identity information, non-limiting examples of which include gender, age, emotional state, and areas of interest. The determination of gender, age and emotional state may be based on voice analysis of the recorded commands”, 0011).

Hayes fails to particularly call for the at least one processor being configured to start up and shut down in response to spoken commands by the user 
Santori teaches the at least one processor being configured to start up and shut down in response to spoken commands by the user (e.g., “the user may " wake-up" the CPU 3 (e.g., through a button press or a voice command) prior to making a request for any application”, 0047; “he or she may issue a voice-activated command to activate the Application which may be received by microphone 29. A module in the CPU 3 may include computer-executable instructions for converting the speech into text. The text may then be communicated (e.g., in ASCII code) to the client-side API or applet in the ND 53 for activating the requested Application”, 0057); application programming interface (“utilizing an application programming interface (API) to establish a connection”, 0011).
It would have been obvious to combine the references at the time of filing because they are in the same field of endeavor.  It is well known to use interrupts to wake up and/or shut down a processor, and the examiner takes official notice that it is well known to save power by causing processors, such as those in mobile phones, to shut down.  Adding a feature of saving energy by having a processor sleep can make batteries last longer.
2. The electronic device of claim 1, wherein information responsive the topic of interest to the user and the query by the user are modified according to the mood of the user (“Each customizable expert agent is preferably a computer-controlled improvisational character having distinct personality, moods, and other life-like qualities”, abstract). Zurek teaches mood (“a media system, which is operable via a user's voice command, may analyze the user's speech to determine personal identity information, non-limiting examples of which include gender, age, emotional state, and areas of interest. The determination of gender, age and emotional state may be based on voice analysis of the recorded commands”, 0011).
3. The electronic device of claim 1, wherein said electronic device is a mobile device and said network connection comprises an interface to a wireless network (“phones and personal digital assistant (PDA)”, 0006; 0018, 0022; “The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract; “customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device”, 0004).
4. The electronic device of claim 3, wherein at least part of the wireless network is a WiFi network (“The customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device. It has application-independent expertise and can be given application-specific expertise. It is capable of interacting with human users/learners/customers/trainees utilizing both types of expertise, much like a human expert agent”, 0004; Santori: “These devices can be connected through a wireless 67 or wired 69 connection. Also, or alternatively, the CPU could be connected to a vehicle based wireless router 73, using for example a WiFi 71 transceiver. This could allow the CPU 3 to connect to remote networks in range of the local router 73”, 0042).
5. The electronic device of claim 1, wherein the gender of the user is determined at the remote server (“The software controlling the agent's behavior and dialogue may reside at the client or server. The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites.”, 0025). Zurek teaches speech recognition and determining gender (“a media system, which is operable via a user's voice command, may analyze the user's speech to determine personal identity information, non-limiting examples of which include gender, age, emotional state, and areas of interest. The determination of gender, age and emotional state may be based on voice analysis of the recorded commands”, 0011).
8. The electronic device of claim 1, further comprising a structured light emitter, and a structured light image capture device, configured to capture spatial information about the user (“the Coach may present non-verbal communications though gesture, facial expression, body language, etc”, 0505; “expert agent combines natural language conversation, animated gestures”, abstract, 0184, 0015; “The ICA is an artificial intelligence engine driving individualized and dynamic feedback with synchronized video and graphics used to simulate real-world environment and interactions”, 0012).
9. The electronic device of claim 1, wherein the response from the remote server is selectively dependent on an emotional state of the user, determined dependent on information communicated through the network connection (“Each customizable expert agent is preferably a computer-controlled improvisational character having distinct personality, moods, and other life-like qualities”, abstract, 0014, 0050; “I would be happy to help you complete your purchase”, 0045; “Customization of the agent's persona; for example, the agent's background, emotional dynamics, sense of humor”, 0023; “Store emotional state in Imp for duration of session”, 0220).
10. The electronic device of claim 1, further comprising a screen configured to display an avatar which verbally communicates the response from the remote server (“The customizable expert agents of the present invention, each with its distinctive personality, mood, manner of interaction, and other life-like qualities, such as normal variability, idiosyncrasies, and irregularities in behavior, also can offer humanized interactions”, 0005; “Customization of the agent's persona; for example, the agent's background, emotional dynamics, sense of humor, political positions, or formality”, 0023; “The ICA is an artificial intelligence engine driving individualized and dynamic feedback with synchronized video and graphics used to simulate real-world environment and interactions. This feedback is received and displayed through the Visual Basic Architecture”, 0012, 0184; “The Application interface comprises a user-friendly graphic user interface (GUI) screen or a browser window.”, 0500; “agents that interact with users in characteristically human ways by displaying personalities, expressing empathy”, 0040; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022; “interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach”, 0121; “voice software permitting the Learner to input natural language dialogue”, 0505; “Substituting alternative platforms, channels, or media to enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV, robots, etc.”, 0755).
11. The electronic device of claim 10, wherein the at least one processor is further configured to communicate information dependent on a facial expression of the user to the remote server (“The software controlling the agent's behavior and dialogue may reside at the client or server. The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites.”, 0025; “operates over a computer network, e.g., a World Wide Web (web), utilizing client-server technologies”, 0015, 0025, 0145), and to generate an avatar dependent on the response from the remote server, wherein the response from the remote server is dependent on the facial expression (“The customizable agents of the present invention, each with its distinctive personality, mood, manner of interaction, and other life-like qualities, such as normal variability, idiosyncrasies, and irregularities in behavior, also can offer humanized interactions”, 0005; “customizable expert agent combines natural language conversation, animated gestures, general expertise, and subject expertise to create enjoyable and effective online experiences in a variety of contexts”, abstract; 0015, 0184, 0505).
12. The electronic device of claim 1, wherein the response from the remote server (0025) is dependent on a user profile stored in a memory (0046, 0048; “expert agent remembers what purchases the customer has made on a previous visit”, 0048; “profile data”, 0147, 0163, 0170-0176; “user profile databases”, 0179, 0181; “produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189; “The Master Database contains User IDs and passwords as well as application independent information that STOW has gathered about learners, such as learning style and preferences, thus a repository for general user (Learner) profiles.”, 0089; “STOW User Profile Database 211”, 0163; “This will produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189).
13. The electronic device of claim 1, wherein the user profile is adaptively updated based on interaction with the user (0046, 0048; “expert agent remembers what purchases the customer has made on a previous visit”, 0048; “profile data”, 0147, 0163, 0170-0176; “user profile databases”, 0179, 0181; “produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189; “The Master Database contains User IDs and passwords as well as application independent information that STOW has gathered about learners, such as learning style and preferences, thus a repository for general user (Learner) profiles.”, 0089; “STOW User Profile Database 211”, 0163; “This will produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189).

Claim Rejections - 35 USC § 103
The text of pre-AIA  35 U.S.C. 103(a), which forms the basis for all obviousness rejections set forth in this Office action, is quoted above.

Claims 14-17, 21 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hayes-Roth (US 2003/0028498) combined with Nathan (US 2008/0221892), Reddy (US 9,514,748), Rapaport (US 2010/0205541) and Santori (US 2010/0306309).
Hayes discloses:
14. A method of conversing with a conversational agent, the method comprising: receiving speech from a user through a microphone (“The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; 0184; 0505; “The dialogue may be mixed-initiative, i.e., either the customer or the expert agent may spontaneously initiate a specific dialogue topic at any time. For example, at various times in the interaction, the agent might initiate a topic by offering a comment or question”, 0042; “enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV”, 0755, 0006);
storing data representing the speech in a memory (“customizable expert agent combines natural language conversation, animated gestures, general expertise, and subject expertise to create enjoyable and effective online experiences in a variety of contexts”, abstract; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022; “interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach”, 0121; “voice software permitting the Learner to input natural language dialogue”, 0505; “Substituting alternative platforms, channels, or media to enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV, robots, etc.”, 0755);
processing the speech from the memory in a digital signal processor (“phone, PDA, voice, vision, TV”, 0755, 0006; see below);
starting up and shutting down an automated processor in response to spoken the commands by the user (see below); 
communicating the processed speech from the electronic device to a remote server on a communication network (“The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract; “customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device”, 0004, 0006), said remote server being equipped with speech recognition software and artificial intelligence software comprising at least one of an artificial neural network, a Hidden Markov Model and a predictive statistical network, having a modular software infrastructure according to an application programming interface;
receiving from the remote server, through the communication network, a response to the processed at least a portion of speech, recognized by the speech recognition software, and generated by the artificial intelligence software; and communicating the response received from the remote server to the user (“The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input /output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; “The user interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach or to enable certain types of learning objects in a particular application”, 0121, 0500, 0755; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022).
Although it may be inherent to store/buffer voice in a memory/cache in order to process it in any way, it may be argued that Hayes fails to particularly call for storing the at least a portion of speech in a memory, using a neural network, and processing said at least a portion of speech in a digital signal processor.
Nathan more clearly teaches storing the at least a portion of speech in a memory (“Together, Data Sources 102a to 102x and the Databases 110 provide the statistical information and data required for accurate language processing and knowledge history. These Databases 110 provide the resources necessary for basic natural language processing with a persistent conversation memory”, 0069; “Intellectual Attributes 2642 may include the Autonomous Avatar's 1720a backstory, history and memory”, 0195;
“The Databases 110 may include previous conversation logs, providing a level of personal `memory` for the avatar. In this way, previous conversations and happenings may be `recalled` by the avatar”, 0092; “These answer entries may include separate data sources, previous dialogue conversations”, 0146).
Reddy teaches it is well known to process a portion of speech in a digital signal processor (Fig.2; C26, 3-13).
Rapaport teaches it is well known to use neural networks (0067, 0076, 0081; “a neural network type of weighting or statistical classifying method is used for adaptively learning through experience with the user”, 0390).
It would have been obvious to combine the references at the time of filing because Hayes collects profile data and processes speech data, and by making it clearer that voice data is/may be processed by a DSP in, e.g., a PDA and stored, the combined references can use voice data in the profile database to better understand a user.
Santori teaches the at least one processor being configured to start up and shut down in response to spoken commands by the user (e.g., “the user may "wake-up" the CPU 3 (e.g., through a button press or a voice command) prior to making a request for any application”, 0047; “he or she may issue a voice-activated command to activate the Application which may be received by microphone 29. A module in the CPU 3 may include computer-executable instructions for converting the speech into text. The text may then be communicated (e.g., in ASCII code) to the client-side API or applet in the ND 53 for activating the requested Application”, 0057); application programming interface (“utilizing an application programming interface (API) to establish a connection”, 0011).
It would have been obvious to combine the references at the time of filing because they are in the same field of endeavor.  It is well known to use interrupts to wake up and/or shut down a processor, and the examiner takes official notice that it is well known to save power by causing processors, such as those in mobile phones, to shut down.  Adding a feature of saving energy by having a processor sleep can make batteries last longer.
15. The method of claim 14, further comprising receiving a visual input through at least one camera, said visual input containing at least one of an image of the user's face and an image of the user's hand, communicating information dependent on the visual input to the server through the communication network, wherein the response is further dependent on the visual input (“the Coach may present non-verbal communications though gesture, facial expression, body language, etc”, 0505; “expert agent combines natural language conversation, animated gestures”, abstract, 0184, 0015; “The ICA is an artificial intelligence engine driving individualized and dynamic feedback with synchronized video and graphics used to simulate real-world environment and interactions”, 0012;
Nathan: “graphical generator may provide graphical representations, movements, facial features, stance, gestures or other visual stimulus appropriate to the generated language”, 0073, 0089; “In said Language Converter 221, voice recognition software may be required. Additionally, in some embodiments the Language Converter 221 may include image recognition in order to interpret body language, sign language and facial features for conversion into the native language for the Natural Language Analyzer 130”, 0079, 0104;
Rapaport: hand gesturing, 0038; “observing the user's body language may be provided in, or coupled to, the client machine (or operatively coupled to the cloud while being located at the hands, etc. for the purpose (among others) of reporting body language tells to the cloud.”, 0068; 0071, 0082, 0165).
16. The method of claim 14, further comprising displaying an avatar on a screen, configured to audibly and visually articulate the response from the server (“The ICA is an artificial intelligence engine driving individualized and dynamic feedback with synchronized video and graphics used to simulate real-world environment and interactions. This feedback is received and displayed through the Visual Basic Architecture”, 0012, 0184; “The Application interface comprises a user-friendly graphic user interface (GUI) presented within a screen or a browser window.”, 0500; “agents that interact with users in characteristically human ways by displaying personalities, expressing empathy”, 0040; “phones and personal digital assistant (PDA)”, 0006; 0018, 0022; “interface for STOW may optionally be designed to include software plug-ins, for example to provide animation or voice for the Coach”, 0121; “voice software permitting the Learner to input natural language dialogue”, 0505; “Substituting alternative platforms, channels, or media to enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV, robots, etc.”, 0755;
Nathan: “the graphical generator may output a set of mood indicators that may then be utilized by the avatar host to generate movements that correspond to the given mood of the language”, 0074; 0063; “Avatars may be as complex as a 3D rendered graphical embodiment that includes detailed facial and body expressions, or may be as simple as a faceless, non-graphical widget, capable of limited, or no function beyond the natural language processor”, 0065).
17. The method of claim 14, wherein said response from the server is further dependent on prior user inputs from the user (0046, 0048; “expert agent remembers what purchases the customer has made on a previous visit”, 0048; “profile data”, 0147, 0163, 0170-0176; “user profile databases”, 0179, 0181; “produce coaching content that is personalized to the particular situation and learning history of the Learner”, 0189; “The Master Database contains User IDs and passwords as well as application independent information that STOW has gathered about learners, such as learning style and preferences, thus a repository for general user (Learner) profiles.”, 0089; “STOW User Profile Database 211”, 0163; “This will produce coaching history of the Learner”, 0189;
Nathan: “Together, Data Sources 102a to 102x and the Databases 110 provide the statistical information and data required for accurate language processing and knowledge representation. The Databases 110 may include, but are not limited to four child databases. These databases include a dictionary, wordnet, part-of-speech tags (or speech tags), and conversation history. These Databases 110 provide the resources necessary for basic natural language processing with a persistent conversation memory”, 0069; “Intellectual Attributes 2642 may include the Autonomous Avatar's 1720a backstory, history and memory”, 0195;
“The Databases 110 may include previous conversation logs, providing a level of personal `memory` for the avatar. In this way, previous conversations and happenings may be `recalled` by the avatar”, 0092; “These answer entries may include separate data sources, previous dialogue conversations”, 0146).
Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if 

Claims 6-7 and 18-19 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Hayes-Roth (US 2003/0028498), Zurek (US 2010/0169091), Santori (US 2010/0306309), Nathan (US 2008/0221892), and Rapaport (US 2010/0205541), as specified in claim 1.
The combination of Hayes, Zurek, and Santori fails to particularly call for using GPS and hand gestures.
6. The electronic device of claim 1, further comprising a GPS receiver, wherein the at least one processor is “configured to” receive a geolocation from the GPS receiver, and to communicate the geolocation with the remote server, wherein the response from the remote server is dependent on the geolocation (“The server system may be local or remote to the client system and needs not be at the same location with computer systems hosting the electronic sites”, 0025; Nathan: “Often these stored semantics may be annotated with the time, location, identity of Language Source 103a to 103z and any additional relevant information”, 0094).
Rapaport teaches using GPS for location (“One of the many topics that a user may inferentially have in mind is that of location as reported by the user's GPS and wondering what best to do at that location and time”, abstract).
It would have been obvious to combine the references at the time of filing because they are in the same field of endeavor, and adding GPS to the well-known communication protocols that indicate location (e.g., the country code of servers and clients) can provide more accuracy in location determination.

7. The electronic device of claim 1, further comprising at least one camera configured to acquire an image of at least one of a user's face having an expression and a user's hand making a gesture, wherein the response from the remote server is further dependent on at least one of the expression and the gesture (“the Coach may present non-verbal communications though gesture, facial expression, body language, etc”, 0505; “expert agent combines natural language conversation, animated gestures”, abstract, 0184, 0015; “The ICA is an artificial intelligence engine driving individualized and dynamic feedback with synchronized video and graphics used to simulate real-world environment and interactions”, 0012;
Nathan: “graphical generator may provide graphical representations, movements, facial features, stance, gestures or other visual stimulus appropriate to the generated language”, 0073, 0089; “In said Language Converter 221, voice recognition software may be required. Additionally, in some embodiments the Language Converter 221 may include image recognition in order to interpret body language, sign language and facial features for conversion into the native language for the Natural Language Analyzer 130”, 0079, 0104).
It would have been obvious to combine the references at the time of filing because they are in the same field of endeavor, and adding hand gestures would improve the accuracy of body gestures and facial expressions in determining moods.
18. A distributed speech processing system, comprising: a microphone configured to receive at least a portion of speech from a user, and to produce an electrical signal representing the at least a portion of speech (“The expert sales agent would communicate with the customer in natural language dialogue. This dialogue may be exchanged via various interface input/output (I/O) technologies, including but not limited to text, speech/voice/audio, and graphics/images modalities”, 0041; 0184; 0505; “The dialogue may be mixed-initiative, i.e., either the customer or the expert agent may spontaneously initiate a specific dialogue topic at any time. For example, at various times in the interaction, the agent might initiate a topic by offering a comment or question”, 0042; “enable interaction between an Agent and a user, for example: phone, PDA, voice, vision, TV”, 0755, 0006); a speech output device, configured to produce a signal corresponding to an audio waveform of speech from a speech output signal (“phone, PDA, voice, vision, TV”, 0755, 0006); a memory configured to store the at least a portion of speech (“phone, PDA, voice, vision, TV”, 0755, 0006); at least one digital signal processor, configured to: 
process said at least a portion of speech, and generate the speech output signal; a communication port (“phone, PDA, voice, vision, TV”, 0755, 0006), configured to: communicate the processed at least a portion of speech through a packet switched communication network to a remote server (digital mobile network, “phone, PDA, voice, vision, TV”, 0755, 0006), and receive a response to the at least a portion of speech through the packet switched communication network from the remote server (“The present invention operates over a computer network such as an intranet or the Internet utilizing client-server technologies”, abstract; “customizable expert agent can operate over a local or global computer network, over a wireless network, or locally on a computer or a computer-enabled device”, 0004, 0006); and 
the remote server (over Internet), comprising speech recognition software and artificial intelligence software comprising at least one of an artificial neural network, a Hidden Markov Model, and a predictive statistical network, configured to receive the processed at least a portion of speech and to generate the response.
Although it may be inherent to store/buffer voice in a memory/cache in order to process it in any way, it may be argued Hayes fails to particularly call for storing the at least a portion of speech in a memory and using AI/neural networks.
Nathan more clearly teaches storing the at least a portion of speech in a memory (“Together, Data Sources 102a to 102x and the Databases 110 provide the statistical information and data required for accurate language processing and knowledge representation. The Databases 110 may include, but are not limited to four child databases. These databases include a dictionary, wordnet, part-of-speech tags (or speech tags), and conversation history. These Databases 110 provide the resources necessary for basic natural language processing with a persistent conversation memory”, 0069; “Intellectual Attributes 2642 may include the Autonomous Avatar's 1720a backstory, history and memory”, 0195;
“The Databases 110 may include previous conversation logs, providing a level of personal `memory` for the avatar. In this way, previous conversations and happenings may be `recalled` by the avatar”, 0092; “These answer entries may include separate data sources, previous dialogue conversations”, 0146).
Rapaport teaches it is well known to use neural networks (0067, 0076, 0081; “a neural network type of weighting or statistical classifying method is used for adaptively learning through experience with the user”, 0390).
It would have been obvious to combine the references at the time of filing because Hayes collects profile data and processes speech data, and by making it clearer that voice data is/may be processed by a DSP in, e.g., a PDA and stored, the combined references can use voice data in the profile database to better understand a user.  Modeling via neural networks allows for better classifying of data.
19. The distributed speech processing system according to claim 46 (read as claim 18), further comprising a camera (phone, PDA, vision, 0755, 0006) configured to receive an image of a face of the user concurrently with receipt of the at least a portion of speech, wherein information relating to the facial expression is communicated through the communication port to the remote server, and analyzed in conjunction with the processed at least a portion of speech, to together generate the response (“the Coach may present non-verbal communications though gesture, facial expression, body language, etc”, 0505; “expert agent combines natural language conversation, animated gestures”, abstract, 0184, 0015; “The ICA is an artificial intelligence engine driving individualized and dynamic feedback with video and graphics used to simulate real-world environment and interactions”, 0012;
Nathan: “graphical generator may provide graphical representations, movements, facial features, stance, gestures or other visual stimulus appropriate to the generated language”, 0073, 0089; “In said Language Converter 221, voice recognition software may be required. Additionally, in some embodiments the Language Converter 221 may include image recognition in order to interpret body language, sign language and facial features for conversion into the native language for the Natural Language Analyzer 130”, 0079, 0104).
/DAVID R VINCENT/     Primary Examiner, Art Unit 2123