DETAILED ACTION
This action is responsive to the Request for Continued Examination filed on 11/23/2021. Claims 1-3, 5-11, 13-18, and 20 are pending in the case. Claims 1, 9, and 16 are independent claims.
This action is non-final.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.
Applicant's submission filed on 11/23/2021 has been entered.
Applicant’s Response
In Applicant’s response dated 11/23/2021 (hereinafter Response), Applicant amended Claims 1 and 9; and argued against the objections and/or rejections previously set forth in the Office Action dated 08/26/2021 (hereinafter Previous Action).
Examiner notes Applicant did not amend independent claim 16; thus independent claim 16 does not recite features similar to those of independent claims 1 and 9 (specifically missing: wherein the historical modified user inputs are user inputs from prior inputs).
Response to Amendment/Arguments
Applicant’s amendments to claims 1 and 9 to further clarify the metes and bounds of the invention are acknowledged.
In response to Applicant’s statement (see Response page 6 § II Claim Interpretation) that “definitions for [virtual assistant, intent, entity, or context] can be found at least in paragraphs 17 and 26 of the original application”, a special definition is used to limit the interpretation of a term during claim interpretation (see MPEP 2111, The broadest reasonable interpretation does not mean the broadest possible interpretation. Rather, the meaning given to a claim term must be consistent with the ordinary and customary meaning of the term (unless the term has been given a special definition in the specification), and must be consistent with the use of the claim term in the specification and drawings). Examples are not a special definition and do not limit the interpretation of a term, but merely provide guidance when applying the broadest reasonable interpretation. The instant application does not provide special definitions and at best provides examples as illustrated below (emphasis added):
[0017]    Virtual assistants may include conversational, computer-generated characters to simulate a conversation for the purpose of delivering audio or textual information to a user. A large portion of virtual assistants is developed for end-users, either commercially (e.g., a home assistant) or within an enterprise (e.g., a help desk or IT support chatbot). Such virtual assistants provide end-users with the ability to make queries and receive pertinent information and/or services (e.g., receive answers to questions, troubleshoot technical issues, play music, start a video, call another user, etc.). Virtual assistants can provide consistency in the quality of services/information provided, as well as increase the number of queries an IT department or other services provider can handle within a given time frame.
[0026]   NLP classifier 130 may include, for example, a natural language processor for converting user inputs into machine-readable text. In some embodiments, NLP classifier 130 may further include a triplestore, text index, or relational database to enhance the contextual information of the machine-readable text output and increase the accuracy of the NLP techniques employed by the NLP classifier 130. For example, NLP classifier 130 may identify or determine the intent of a user query, an entity involved with the query (e.g., the subject of a request, the identity of the user, the identity of a person or place within the query, the service or product name, etc.), or a set of additional contextual information related to the query (e.g., to differentiate similar terms with different meanings). In some embodiments, a neural network may be implemented within the NLP classifier to employ the NLP techniques.

In response to Applicant's argument with respect to the rejection of claim 1 as unpatentable over ROY in view of ACERO (see Response, starting page 6 § III Rejection under § 103), Examiner respectfully disagrees.
Applicant states that [0026-0032] of the instant application as originally filed provides examples (which would not be limiting) of what it means to modify the user input. 
It is noted Applicant has amended claim 1 to explicitly recite “wherein the historical modified user inputs are user inputs from prior inputs”, which does not require the prior user inputs to have been modified before aggregation with the currently-processed input.
Applicant’s first argument is with respect to one portion of ROY which was previously cited (see Response page 7, “This portion of Roy simply discusses what the system state is and that it is possible to suspend the process and restart that same process again”). Examiner agrees that is a reasonable interpretation of this portion of ROY; however, this is not the only relevant portion of ROY cited in the rejection.
Applicant’s second argument (see Response page 7, bottom) is with respect to how the claim elements should be interpreted. However, as previously explained, Examiner is required to use the broadest reasonable interpretation which is consistent with the disclosure, and any arguments of how limitations should be interpreted, while carefully considered, cannot be persuasive unless there is an interpretation of the claim element which cannot be consistent with the disclosure (for example, not relying on a special definition if one is provided; not using the ordinary and customary meaning when there is no special definition).
Applicant’s third argument (see Response page 8) is with respect to statements made in the Advisory action when Applicant proposed the present claim amendments. Specifically, Applicant disagrees with the statements made in the advisory action because “…Roy does not consider any past user input during its aggregation process”. This is clearly incorrect. A portion of the abstract of ROY is reproduced below, emphasis added.
…A speech recognizer receives input from a user, and when a command is identified in the speech input, if the command meets conditions that require additional processing, a representation of the speech input is stored for subsequent processing. A logical command processor performs additional processing of command input by analyzing the command and its elements, determining which elements are required for successful processing the command and which elements are present and lacking. The user is prompted to supply missing information, and subsequent user input is added to the command structure until the command input is aborted or the command structure reaches sufficient completeness to enable execution of the command.
From the abstract extract above, it is clear that a first input (user speech) is obtained, processed, analyzed, and stored. A second input (in response to prompt for missing information) is obtained, processed, and added to the first input to create a modified input.  If the command is complete, it is executed. If the command is not complete, a third input is obtained, processed, and added to the modified input. This process is repeated until either (a) the command is complete and can be executed or (b) the command is aborted.
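For illustration only, the iterative process summarized above can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from ROY: the names `REQUIRED_ELEMENTS`, `CommandStructure`, and `collect_until_complete`, as well as the calendar-style element names, are hypothetical and chosen only to show how subsequent inputs are added to a stored command until it is complete or aborted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical table of the elements each command needs (not taken from ROY).
REQUIRED_ELEMENTS: Dict[str, List[str]] = {
    "make_appointment": ["date", "time", "subject"],
}


@dataclass
class CommandStructure:
    """Stored representation of a command and the elements gathered so far."""
    command: str
    elements: Dict[str, str] = field(default_factory=dict)

    def missing(self) -> List[str]:
        # Elements that are required for this command but not yet supplied.
        return [e for e in REQUIRED_ELEMENTS[self.command] if e not in self.elements]


def collect_until_complete(
    structure: CommandStructure,
    prompt_user: Callable[[str], Optional[str]],
) -> Optional[CommandStructure]:
    """Prompt for missing elements until the command is complete or aborted.

    prompt_user returns the user's reply, or None to signal an abort.
    """
    while True:
        missing = structure.missing()
        if not missing:
            return structure              # sufficiently complete: ready to execute
        reply = prompt_user(missing[0])
        if reply is None:
            return None                   # abort condition
        structure.elements[missing[0]] = reply  # add the new input to the stored command


# Usage: the first input supplied only the subject; two later inputs fill the rest.
answers = iter(["tomorrow", "3 pm"])
first_input = CommandStructure("make_appointment", {"subject": "budget review"})
done = collect_until_complete(first_input, lambda name: next(answers, None))
```

In this sketch the stored command plays the role of the first input, and each reply plays the role of a subsequent input that is added to it before the completeness test is repeated.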
The intended result is (col 4 line 47) to enable the processing of complex commands which require multiple parameters for successful processing.
Further, the invention of ROY is intended to address a deficiency in the art, namely that (col 7 line 39) it is not possible to have multiple commands and data (such as dictation) present in the same input stream. In current systems, if an input stream contains two commands, once the first command is identified, the system does not get to the second command. In ROY, by contrast, when (col 7 line 46) one or more multiple commands require additional processing by the logical command processor, then each command and data element are processed accordingly by the logical command processor.
Thus, in the situation where a first input stream (voice utterance) contains more than one command, the first command may be aggregated with additional user input; then the second command may be aggregated with additional user input. The commands are executed. This process repeats until all commands in the first input stream have been processed. 
Applicant makes no other arguments with respect to any other claim other than to rely on the argument against the rejection of claim 1.
Applicant is again reminded that claim interpretation requires the broadest reasonable interpretation of the claim which is consistent with the written description.
The solution that is broadly described in the instant application is (emphasis added) [0020] Embodiments of the present disclosure may introduce an additional component into the process flow to consolidate and/or enhance user inputs prior to an API call. For example, the additional component may be capable of classifying the input as a statement, question, or answer. Upon classification, the input may be cached and consolidated with additional inputs/utterances to create an aggregated input. Aggregated inputs may include additional information, for example, a user intent, entities involved with the input/query, other contextual information, etc. [0021] The aggregated input may be passed to the virtual assistant for a single API call.
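For illustration only, the classify, cache, and consolidate flow quoted from [0020]-[0021] can be sketched as follows. This is a minimal sketch and not the Applicant's implementation: `classify` is a toy punctuation-based stand-in for whatever classifier is actually used, and the names `InputAggregator`, `add`, and `flush` are hypothetical.

```python
from typing import Dict, List


def classify(text: str, awaiting_answer: bool = False) -> str:
    """Toy stand-in classifier: label an utterance as question, answer, or statement."""
    if text.rstrip().endswith("?"):
        return "question"
    return "answer" if awaiting_answer else "statement"


class InputAggregator:
    """Caches classified inputs and consolidates them into one aggregated input."""

    def __init__(self) -> None:
        self.cache: List[Dict[str, str]] = []

    def add(self, text: str, awaiting_answer: bool = False) -> None:
        # Classify the input, then cache it for later consolidation.
        self.cache.append({"text": text, "kind": classify(text, awaiting_answer)})

    def flush(self) -> Dict[str, object]:
        # Build a single payload that could be sent in one call, then clear the cache.
        payload = {
            "text": " ".join(item["text"] for item in self.cache),
            "kinds": [item["kind"] for item in self.cache],
        }
        self.cache = []
        return payload


# Usage: three utterances are consolidated into one aggregated input.
agg = InputAggregator()
agg.add("My laptop will not connect to the VPN.")
agg.add("Can you help?")
agg.add("It is a company laptop.", awaiting_answer=True)
payload = agg.flush()
```

The point of the sketch is only that several inputs are held and merged so that a single downstream call carries all of them.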
ROY is directed to this same broad solution, albeit using different language (see abstract, emphasis added): A speech recognizer receives input from a user, and when a command {e.g. the user intent} is identified in the speech input, if the command meets conditions that require additional processing, a representation of the speech input is stored for subsequent processing. A logical command processor performs additional processing of command input by analyzing the command and its elements, determining which elements are required for successful processing the command and which elements are present {e.g. additional information in initial input} and lacking {e.g. additional information still required from the user}. The user is prompted to supply missing information, and subsequent user input is added to the command structure until the command input is aborted or the command structure reaches sufficient completeness to enable execution of the command {which is then executed by performing a function call with the command and all required parameters}.
Having carefully considered all of Applicant’s arguments of merit, Claims 1-3, 5-11, 13-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over ROY in view of ACERO as explained below.
Claim Interpretation
Claims 9-15 are directed to a computer program product comprising a computer readable storage medium having program instructions embodied therewith which is described in the disclosure as originally filed at [0087] which states A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The instant application does not provide any specific definition which would limit the interpretation of the following terms: virtual assistant, intent, entity, or context.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 5-11, 13-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over ROY et al. (Patent No.: US 8,219,407 B1, previously cited) in view of ACERO et al. (Pub. No.: US 2004/0148170 A1, previously cited).
Regarding claim 1, ROY teaches the method for enhancing virtual assistant interactions (intended use), the method comprising (relying primarily on FIGs 5 and 6 (which are further adaptations of FIGs 1, 2, and 4 (see col 23 line 39)); broadly (col 2 lines 12-25, emphasis added) determining that a recognized command by a speech recognizer requires additional processing; storing a representation of the output of the speech recognizer in a command structure; iteratively determining if the command is sufficiently complete and ready for processing, and if so executing the command in a respective application or process and exiting said iteratively determining step; if the command is insufficiently complete or not ready for processing, prompting a user for further input; receiving, processing and storing in the command structure prompted user command-related input; and determining an abort condition, and if the abort condition exists, exiting the iterative determining, else continuing said iteratively determining step):
receiving a user input (initial input at FIG 5 (S105); note that additional user input is received at FIG 6 (S664) and (S213) when command is incomplete; note (col 16 lines 10-15) Complex command input is the input of multiple commands and data in a single string of speech input. Examples of complex command input are two commands and a data element in a single string of speech input. Other examples are multiple commands only and a single command and a data element together);
classifying, the user input (interpreting “classifying” as determining how the user input should be interpreted; assuming the first user input is not a request for termination (S106) process with speech recognizer (S107) to determine whether speech is a command (S109), or input data for an application or process (S119); further interpreting “command” as an “intent” of the user, e.g. what the user wishes the system to accomplish using the voice system; additionally, once system has determined it is trying to collect information in order to execute a command; generally, input is processed with speech recognizer in at least FIG 5 (S107), FIG 6 (S665) and (S214); as noted above, the input could include one or more commands which need to be individually processed; input could be the response (data) for a prompt; input could be termination request; and so on)
determining, based on the classification, a set of contextual information for the user input (after resolving any ambiguity with the command (S113, S114, S115) and when the command requires additional processing (S121) store representation of input so far (S123); perform any additional processing needed (S124); starting at (S600) in FIG 6; this includes obtaining further input from the user (S662, S664); processing the input (S667); repeating until complete command or abort; as noted above there may be multiple commands which need to be individually processed; thus “determining” what information is needed in order to complete the command);
modifying the user input, based on the classification and the set of contextual information
aggregating the modified user input with a plurality of historical modified user inputs, wherein the historical modified user inputs are user inputs from prior inputs (as noted above, user input is collected (aggregated) until sufficient information has been received to process a command of the initial input; for example a first command may need two different pieces of information which the user provides in two separate subsequent inputs. The first command will be aggregated with the first subsequent input (after it has been processed) and then a test is performed to see if additional input is needed);
generating an aggregate modified user input, based on the classification, the set of contextual information, and the aggregated plurality of historical modified user inputs (as noted above, user input is collected (aggregated) until sufficient information has been received to process the command); and
passing the aggregated modified user input to the virtual assistant (once all ambiguity has been resolved; when the command does not need additional processing (S121), execute the command (S116); when the command does need additional processing, after collecting all needed information (S206), execute command (S207) executing a command once all necessary input has been received. Particularly note one of the problems that ROY is specifically addressing (col 7 lines 35-43, emphasis added) Current state of the art speech recognition systems analyze speech input for a command first, and if a command is found the input is processed based on that command. Such systems are limited to the input of one command in a single speech input, and it is not possible to have multiple commands and data (such as dictation) present in the same input stream. (col 7 line 47, emphasis added) If one or more multiple commands require additional processing by the logical command processor, then each command and data element are processed accordingly by the logical command processor; interpreting “virtual assistant” as the process which is executing the complete command).
Note example use cases include (a) (col 8 lines 1-20) controlling various device functions such as operating system, automobile and control systems, embedded in VCR or DVD recorder; or other devices; and (b) (col 11 line 60 to col 12 line 17); (col 15 line 1 to col 16 line 8) making a calendar appointment in a calendar program by collecting the necessary data elements before creating the appointment. 
While ROY clearly analyzes the user input, regardless of when it is provided, in order to determine what the user intends with the input (e.g. to terminate/abort, to execute a command, to provide additional contextual information for a command, to resolve any ambiguity), and ROY describes using several different speech recognizers, as needed (see e.g. (col 25 lines 40-57)) in order to increase the overall accuracy of the speech recognition process, including different biasing algorithms, HMM ((col 18 lines 15-35) statistically-biased model), context-free-grammars, etc.; ROY does not expressly disclose the processing of speech input using a speech recognizer comprises using a “neural network”.
ACERO is similarly directed to using statistical classifiers in order to perform task identification on natural language inputs, e.g. in order to ascertain if an input is a search query or a natural-language input (see abstract). ACERO describes using a number of different statistical classifiers 204 and a selector 221 (see e.g. FIG 4) to determine the task (or classification) identification which will be passed to an application for execution (see broadly [0047]). Of particular note, ACERO states that [0085] the selector 221 which ultimately selects the task or class ID could be other components as well, such as a neural network or a component other than the voting component 222; where the selector is [0086-0087] trained using training data and (optionally) with confidence measures and biasing [0122]. Thus, ACERO clearly teaches classifying, using a neural network, the user input in order to determine what the user input was so that it may be properly executed.
As ACERO teaches it was known to use a neural network for classifying natural language input as part of a statistical classifier; and ROY uses statistical classification for determining how to interpret natural language input, it would have been obvious for one having ordinary skill in the art at the time the invention was effectively filed to have used a neural network (such as is taught in ACERO) in order to classify the user input (as is required in ROY) with a reasonable expectation of success, merely by substituting (or using) the known neural network implementation of a statistical classifier in the “speech recognizer” in ROY which is used to determine the user’s intent (e.g. command, target, any additional data input, abort request) of the speech input.
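For illustration only, a selector that combines the outputs of several classifiers by voting can be sketched as follows. This is a minimal sketch of confidence-weighted voting in general and is not taken from ACERO; the function name `select_task`, the task labels, and the confidence values are hypothetical. A trained neural network could stand in for this function as the selector.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def select_task(votes: List[Tuple[str, float]]) -> str:
    """Pick the task ID with the highest total confidence across classifiers.

    Each vote is (task_id, confidence) as reported by one classifier.
    """
    if not votes:
        raise ValueError("at least one classifier vote is required")
    totals: Dict[str, float] = defaultdict(float)
    for task_id, confidence in votes:
        totals[task_id] += confidence
    # The task with the largest summed confidence wins.
    return max(totals, key=totals.get)


# Usage: two classifiers favor "search_query", one favors "command".
chosen = select_task([("search_query", 0.6), ("command", 0.9), ("search_query", 0.5)])
```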
Regarding dependent claim 2, incorporating the rejection of claim 1, ROY in view of ACERO, combined at least for the reasons discussed above, further teaches wherein the neural network is a classifier neural network (as taught in ACERO; discussed in the rejection of claim 1) and wherein the user input is classified as a question, a statement, or an answer (e.g. a query, a command to be executed, additional information provided in response to a system request; request to abort).
Regarding dependent claim 3, incorporating the rejection of claim 1, ROY further teaches wherein the set of contextual information includes an intent (e.g. the command), an entity (e.g. target of command), and a context (e.g. any other required command elements) (see e.g. ROY (col 31 line 20) At S201a the system determines if the representation of the speech input contains a command or data input, the context of the input (for example command or dictation and the target for the command or data input), and if it contains a command, the completeness, including required command elements and which elements are present and missing; see for example the use case (col 11 line 60 to col 12 line 17); (col 15 line 1 to col 16 line 8) for making a calendar appointment)
Regarding dependent claim 4 – canceled.
Regarding dependent claim 5, incorporating the rejection of claim 1, ROY further teaches the method further comprising:
receiving, from the virtual assistant, a response to the {modified} user input (FIG 5 (S117) command executed successfully; see also FIG 6 (S208) command executed successfully); and
notifying the user of the response (FIG 5 (S118) notify user; see also FIG 6 (S126) notify user).
Regarding dependent claim 6, incorporating the rejection of claim 5, ROY further teaches wherein modifying the user input is predicated upon the set of contextual information meeting a context threshold (when additional processing of input is required, FIG 6 (S201a) determining completeness of a command (S206) is command sufficiently complete for execution, does it include a command (intent), target (entity), and other necessary parameters; note this is analogous to “context threshold” as described in the instant application as originally filed [0030] (e.g., at least 1 intent, 1 entity, and 1 context) on a received input).
Regarding dependent claim 7, incorporating the rejection of claim 5, ROY further teaches wherein the classified user input is cached and aggregated with a plurality of classified user inputs, (interpreting “cached” as stored in a memory location; see FIG 5 (S123) store representation of speech input to memory location; FIG 6 (S216) parse representation of applicable speech input into memory location (S216) and (S667)) and wherein passing the modified user input is predicated upon the modified user input meeting a context threshold (when additional processing of input is required, FIG 6 (S201a) determining completeness of a command (S206) is command sufficiently complete for execution, does it include a command (intent), target (entity), and other necessary parameters; note this is analogous to “context threshold” as described in the instant application as originally filed [0030] (e.g., at least 1 intent, 1 entity, and 1 context) on a received input).
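For illustration only, the “context threshold” example quoted from [0030] (at least 1 intent, 1 entity, and 1 context) can be expressed as a simple completeness test. This is a minimal sketch; the function name `meets_context_threshold`, the dictionary layout, and the sample values are hypothetical and appear in neither the instant application nor ROY.

```python
from typing import Dict, List


def meets_context_threshold(info: Dict[str, List[str]], minimum: int = 1) -> bool:
    """Return True when each category holds at least `minimum` items."""
    return all(len(info.get(key, [])) >= minimum for key in ("intent", "entity", "context"))


# Usage: a command with a target and one extra parameter passes; one without context does not.
complete = meets_context_threshold(
    {"intent": ["make_appointment"], "entity": ["calendar"], "context": ["tomorrow 3 pm"]}
)
incomplete = meets_context_threshold({"intent": ["make_appointment"], "entity": ["calendar"]})
```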
Regarding dependent claim 8, incorporating the rejection of claim 5, ROY in view of ACERO, combined at least for the reasons discussed above, further teaches wherein the neural network is trained prior to receiving the user input (ACERO [0086-0087] selector 221 is trained) and wherein the training includes adjusting a weight or a bias of the neural network (ACERO [0087] Selector 221 can receive the confidence measure both during training, and during run time, in order to improve the accuracy with which it identifies the task or class corresponding to feature vector 212; note also discussion of biasing based on training data [0122]).
Regarding claims 9-11, 13-15, ROY in view of ACERO, combined at least for the reasons discussed above, similarly teaches the computer program product for enhancing virtual assistant interactions, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a device (ROY (col 2 line 49) environment having speech recognizer software process and logical command processor software process; (col 5 line 20) System functions as an interface enabling humans to exercise command and control over computer software applications and to input complex commands and information into multiple software applications by speech; inherently software executed by a computer in order to effect control must be stored in some memory; example use cases may be found in (col 8 lines 1-20) and include controlling various device functions such as operating system, automobile and control systems; embedded in VCR or DVD recorder; other devices; see use case (col 11 line 60 to col 12 line 17); (col 15 line 1 to
Regarding claims 16-18, 20, ROY in view of ACERO, combined at least for the reasons discussed above, similarly teaches a system (e.g. a computer; embedded control system for a vehicle, embedded control system for other devices; see use cases (col 8 lines 1-20)) for enhancing virtual assistant interactions, comprising: a memory with program instructions included thereon; and a processor in communication with the memory (inherent components of a computer or other embedded control system), wherein the program instructions cause the processor to: execute the methods of claims 1-3, 5; thus rejected under similar rationale, noting that claim 16 does not require “wherein the historical modified user inputs are user inputs from prior inputs” which is recited in claim 1.

It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998).


CONCLUSION
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US-10929485-B1 is directed to improving access to and interactions with bots. In an example, a first bot, hosted on a computing system, may identify an action to be performed for a user associated with a computing device. The action may be identified based on a user interaction with the first bot, where the user interaction may be provided from the computing device. The first bot may select a second bot based on the action…
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY M LEVY whose telephone number is 571-270-3771.  The examiner can normally be reached on Mon-Fri 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RENEE CHAVEZ, can be reached at 571-270-1104. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Amy M Levy/Primary Examiner, Art Unit 2179