DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this 
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
The “units” in claim 1 (“unit” is a generic placeholder, the portions of the names of the “units” preceding the word “unit” are functional words, and the words following “unit” are function descriptions that do not recite sufficient structure to perform the functions)
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 6, 8, 10, 11, 13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Horioka et al. (US 2007/0198272), hereafter Horioka, in view of Aleksic et al. (US 2017/0270929), hereafter Aleksic.

As per Claim 1, Horioka suggests A conversation support apparatus comprising: an utterance reception unit configured to receive an utterance in an on-going conversation; an… position estimation unit configured to, for each of a plurality of nodes in a conversation tree…, and estimate a node that is most…; and a display unit configured to add a visual characteristic representation to the estimated node and display the conversation tree on a screen (Figures 1, 5, 6; paragraphs 28-30, 55-76, 79, 105, 120-141; [all paragraphs and Figures are cited for each limitation with “key” paragraphs and Figures pertaining to each limitation identified below, i.e. all other paragraphs and Figures not specifically referenced for any particular limitation are eligible to provide context and additional support]
“A conversation support apparatus”: Paragraph 28 describes where a “dialog state” refers to the content of an input from the user to the system, and to the point or stage that a user is at in a series of stages from the beginning to the end of a session.
“comprising: an utterance reception unit configured to receive an utterance in an on-going conversation;”: Paragraph 55 describes where a voice response unit receives a phone call from a user, recognizes speech spoken by the user, and responds to user inquiries, and forwards a call from the user to a human operator if it is impossible to provide an automatic response to the user.  Paragraph 56 describes a dialog history log which is a collection of data about the results of speech recognition performed on user's speech spoken during the period from the start of the service to the time of forwarding to the operator, and other information such as response sentences provided by the system.  Paragraphs 57-76 describe an example dialog history log which at least suggests where speech is received as part of a conversation/dialog, and paragraph 56 further describes where the user’s speech can be spoken during the period from the start of the service to the time of forwarding to the operator [i.e. during an on-going conversation between the user and the system]

“an… position estimation unit configured to, for each of a plurality of nodes in a conversation tree…, and estimate a node that is most…; and a display unit configured to add a visual characteristic representation to the estimated node and display the conversation tree on a screen”: Paragraphs 56-76 describe a dialog history log that includes, among other things, speech recognition results and response sentence/response sentence ID.  Paragraph 79 describes where a dialog information analyzing unit uses the dialog history log and the dialog state determination model to estimate a dialog state at the time of forwarding to an operator.  Paragraphs 120-137 describe a process for calculating a current dialog state S[t] by calculating probabilities for multiple states based on ResID [response sentence ID], where the highest probability state is determined to be S[t] [i.e. the current dialog state as per paragraph 121].  Paragraph 137 further describes that input parameters for F [i.e. the function for estimating the current dialog state as per paragraph 121] are ResIDs [response sentence IDs] and also describes where input parameters can be “results of recognition of user’s speech” [i.e. information about user utterances].  Figure 5 and paragraphs 138-140 describe where the current dialog state identified by the dialog information 
These portions suggest “an… position estimation unit configured to, for each of a plurality of nodes in a conversation tree in which at least one of a label and a topic is provided to each of the plurality of nodes,…, and estimate a node that is most…;” [the combination of the dialog information analyzing unit and the dialog state determination model and the dialog state diagram definition file can collectively be interpreted as a “position estimation unit” that is configured to determine/”estimate” whether each of a 
Horioka suggests an… position estimation unit configured to, for each of a plurality of nodes in a conversation tree…, and estimate a node that is most…;  Horioka does not, but Aleksic suggests an utterance position estimation unit configured to, for each of a plurality of nodes in a conversation tree in which at least one of a label and a topic is provided to each of the plurality of nodes, collate the at least one of the label and the topic provided to the node and the received utterance, and estimate a node that is most related to the received utterance; (paragraphs 9-10, 37, 45; Figures 1-2;

In Aleksic, paragraph 9 describes that determining the particular dialog state that corresponds to the voice input can include generating a transcription of the voice input and determining a match between one or more n-grams that occur in the transcription of the voice input and one or more n-grams in the set of n-grams that are associated with the particular dialog state.  Paragraph 10 describes that determining the match can include determining a semantic relationship between the one or more n-grams that occur in the transcription of the voice input and the one or more n-grams in the set of n-grams that are associated with the particular dialog state.  Figures 1-2 and paragraph 45 describe where each of a plurality of dialog states is associated with one or more n-grams [words, phrases, numbers, etc. that frequently occur in voice inputs that have been determined to correspond to a given dialog state].  Figures 1-2 also describe an example where a user input is “pepperoni and mushroom” and where state 3 includes multiple words including “pepperoni” and “mushroom” and where the other states do not include “pepperoni” and “mushroom” [suggesting that applying the technique of paragraphs 9-10 to the example in Figures 1-2 would lead the system to determine that state 3 is “most related” to the voice input “pepperoni and mushroom” because state 3 has words that match the input whereas the other states do not have words that match the input]
Aleksic thus suggests where determining the current dialog state based on recognition results of user speech in Horioka is, instead, done by comparing/matching/”collating”, for each of the plurality of states/”nodes” [performing the comparing once for each of a plurality of the states in the dialog state diagram in Horioka], the words of the recognition results/transcription of a user’s-speech/voice-input/“the received utterance” [received by Horioka’s voice response unit] to a set of words that are associated with the respective/corresponding dialog state and that frequently occur in voice inputs that have been determined to correspond to the respective/corresponding dialog state [which can be interpreted as a “topic” of a respective/corresponding dialog state, see Applicant’s Specification, paragraph 25], and determining/”estimating” the dialog state whose corresponding set of words most closely match [and is thus “most related to”] the words in the transcription/recognition results of “the received utterance” to be the current dialog state, thereby determining/”estimating” the current dialog state “position” in the dialog state diagram corresponding to “the received utterance” [“an utterance position estimation unit configured to, for each of a plurality of nodes in a conversation tree in which at least one of a label and a topic is provided to each of the plurality of nodes, collate the at least 
Applicant’s Specification [paragraph 25] describes where “topics… each include feature words having a high possibility of appearing in a conversation in the state of the corresponding node” and therefore Aleksic’s description in paragraph 45 of words, phrases, numbers, etc. that frequently occur in voice inputs that have been determined to correspond to a given dialog state can be interpreted as “topic” as claimed [since words that frequently occur in voice inputs corresponding to a state logically have a high possibility of appearing in a conversation in that state].
In Applicant’s Specification, paragraphs 37-38 and Figure 3 describe where “collating” compares/matches one or more extracted feature words [extracted from text forming the utterance] to topic words for a node [“sofa” and “ring” for an “accept order” node in the example of paragraph 38 are the same words for an “accept order” state depicted in Figure 3], and thus the transcription word-to-word set comparison in Aleksic [discussed above] can be interpreted as the claimed “collating”)
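For illustration only, the word-set comparison attributed to Aleksic above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of either reference: the function name `estimate_state`, the use of single words rather than general n-grams, the overlap-count score, and the words listed for states 1 and 2 are all hypothetical. Only the “pepperoni and mushroom” input and the presence of “pepperoni” and “mushroom” in state 3 come from the discussion of Figures 1-2.

```python
from typing import Dict, Set


def estimate_state(transcription: str, state_words: Dict[str, Set[str]]) -> str:
    """Return the state whose word set overlaps most with the transcription.

    Assumption: single words stand in for n-grams, and the score is a plain
    overlap count. Ties resolve to the first state in dictionary order.
    """
    words = set(transcription.lower().split())
    # Perform the comparison once for each state ("node").
    scores = {state: len(words & vocab) for state, vocab in state_words.items()}
    return max(scores, key=scores.get)


# Hypothetical word sets; only state 3's "pepperoni" and "mushroom" are taken
# from the example discussed above.
state_words = {
    "state 1": {"pickup", "delivery"},
    "state 2": {"small", "medium", "large"},
    "state 3": {"pepperoni", "mushroom", "olives"},
}

best = estimate_state("pepperoni and mushroom", state_words)
```

Under these assumptions, `best` is `"state 3"`, because it is the only state whose word set shares any word with the input.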
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of dialog state identification based on an utterance with another because the prior art teaches the claimed invention except for the substitution of dialog state identification based on an utterance which does not identify a dialog state corresponding to an utterance by collating a topic of a dialog state with the utterance with dialog state identification based on an utterance which does.  Aleksic teaches that dialog state identification based on an utterance which identifies a dialog state corresponding to an utterance by collating a 
	
As per Claim 3, Horioka suggests wherein the conversation is a chat, and the utterance reception unit receives text representing an utterance in the chat (Figures 1, 5, 6; paragraphs 28-30, 55-76, 79, 105, 120-141;
The combination [thus far] is as discussed in the rejection of claim 1, including where, as discussed in the rejection of claim 1, the “voice response unit” in paragraph 55 can be interpreted as an “utterance reception unit” that, among other things, receives speech utterances from the user as part of an “on-going” dialog between the user and the system [where the operator continues the dialog/conversation between the user and the system when forwarding to the operator occurs], and that also recognizes the utterances.
Paragraphs 62-76 describe an example dialog/”conversation” where the user is “chatting” [exchanging communications] with the system.  Paragraphs 55-56 describe where the voice response unit recognizes speech spoken by the user and results of 
Horioka thus suggests “wherein the conversation is a chat, and the utterance reception unit receives text representing an utterance in the chat” [the on-going “conversation” which is continued by the operator when forwarding to the operator occurs is a dialog/back-and-forth-“chat” “conversation” and the voice response unit/”utterance reception unit” receives text speech recognition results from speech recognition processing performed on a speech utterance spoken by the user as part of the chat/dialog, and where the text speech recognition results represent the word content of the speech utterance])

As per Claim 5, Horioka suggests wherein the display unit adds a visual characteristic representation to the estimated node by making a color of the estimated node different from a color of other nodes (Figures 1, 5, 6; paragraphs 28-30, 55-76, 79, 105, 120-141;
As discussed in the rejection of claim 1, Horioka suggests “a display unit configured to add a visual characteristic representation to the estimated node and display the conversation tree on a screen” [the dialog information display unit displays the dialog state diagram and also highlights the determined current dialog state, thereby adding, for example, a black background with white text “representation” of a “visual characteristic” to the state/”node” that is determined to be the current dialog state].


As per Claims 6, 8, 10, 11, 13, 15, they are directed to method and medium equivalents of claims 1, 3, and 5 and so are rejected under similar rationale.

Claims 2, 7, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Horioka, in view of Aleksic, as applied to Claims 1, 6, and 11, above, and further in view of Odinak et al. (US 2016/0227038), hereafter Odinak.

As per Claim 2, Horioka, in view of Aleksic suggests wherein the utterance reception unit receives an utterance… on-going conversation… the utterance position estimation unit estimates a node that is most related to the received utterance, and the display unit adds a visual characteristic representation to the estimated node… and displays the conversation tree on the screen (see rejection of claim 1)
Horioka, in view of Aleksic, does not, but Odinak suggests wherein the utterance reception unit receives an utterance for each of a plurality of on-going conversations, for each of the plurality of conversations, the utterance position estimation unit estimates a node that is most related to the received utterance, and the display unit adds a visual characteristic representation to the estimated node for each of the plurality of conversations and displays the conversation tree on the screen (Figures 2, 7A; paragraphs 22, 46, 48, 49, 50, 73;
The combination [thus far] is as discussed in the rejection of claim 1.
In Odinak, paragraph 46 describes where multiple users can call into an automated call center.  Paragraph 48 describes where an automated call center provides support and problem resolution and can be used in areas of commerce, and where an automated call center can, in one embodiment, be a single point.  Paragraph 49 describes a user call sequence which includes a greeting providing options, and where a user can engage with an automated prompt [e.g. an automated voice response system] or a live person.  Paragraph 50 describes multiple simultaneous calls handled by one or more agents executing agent applications on agent consoles.  Paragraph 73 describes where up to four sessions can be presented to an agent simultaneously and where an agent can view the contents of all sessions on a single screen.  Figure 7A depicts multiple windows which are identical in structure but different in conversation substance.  Paragraph 22 describes where visual presentation allows the agent to track more than one session with a customer, and where an agent may handle multiple calls simultaneously.
These portions suggest where multiple users can simultaneously call into an automated call center and where each user’s call is processed by the automated call center in a similar way [e.g. according to Figure 2 and paragraph 49], and where an agent can be provided, in a single display, one instance of a display interface for each of a plurality of simultaneously-calling users [i.e. so that the agent can handle multiple calls simultaneously]
for each of a plurality of on-going conversations, for each of the plurality of conversations, the utterance position estimation unit estimates a node that is most related to the received utterance, and the display unit adds a visual characteristic representation to the estimated node for each of the plurality of conversations and displays the conversation tree on the screen”.)



Allowable Subject Matter
Claims 4, 9, and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
As per Claim 4 (and similarly claims 9 and 14), the prior art of record does not teach or suggest the combination of all limitations in claims 1 and 4 together, including (i.e. in combination with the remaining limitations in claims 1 and 4) wherein the utterance position estimation unit specifies a state indicated by the utterance and a feature word included in the utterance based on the received utterance, and collates the specified state of the utterance and the specified feature word with the label and the topic of each of the plurality of nodes to estimate a node that is most related to the received utterance.
In addition to what was discussed in the rejection of claim 1, Aleksic describes comparing (matching or determining a strong correlation) “context data” of a request and context data of a dialog state to determine that the request pertains to the dialog state that corresponds to the matching set of context data, and where context data can be analogized to a fingerprint that uniquely identifies the dialog state (paragraph 76) and analyzing both the voice input and the context data to determine a dialog state for the voice input (paragraph 80).  In Aleksic, context associated with a voice input can include data that characterizes a display of a user interface at a computing device at which the voice input was received at a time that the voice input was received (paragraph 12) and “Some types of context data may indicate a condition or state of the user device 108 at or near a time that the voice input 108 was detected by the device 108. As described further below, examples of context data include user account information, anonymized user profile information (e.g., gender, age, browsing history data, data indicating previous queries submitted on the device 108), location information, and a screen signature (i.e., data that indicates content displayed by the device 108 at or near a time when the voice input 110 was detected by the device 108). In some implementations, the application identifier, dialog identifier, and dialog state identifier may be considered as special types of context data” (paragraph 38) and “dialog state history data may be used alone (i.e., without other context data) to determine the dialog state associated with a transcription request”, (paragraph 51) and where a request can include a dialog state identifier and “other context data” (paragraph 52).  
Paragraph 59 describes comparing context data included in the request with respective context data associated with each of the dialog states to determine a respective context similarity score for each of the dialog states.
Aleksic thus suggests an embodiment of the comparison where both the words of the transcription and a dialog state identifier in the request are compared to words corresponding to a dialog state and a dialog state identifier corresponding to the dialog state, respectively, to determine a dialog state corresponding to the voice input.
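As a rough illustration of the per-state context comparison discussed above, the following Python sketch scores each dialog state against the context data of a request. It is a sketch under stated assumptions only: Aleksic's actual similarity measure is not reproduced here, so the fraction-of-matching-fields score, the function name `context_similarity`, and every field name and value are hypothetical.

```python
from typing import Dict


def context_similarity(request_ctx: Dict[str, str], state_ctx: Dict[str, str]) -> float:
    """Fraction of context fields whose values agree (assumed measure)."""
    keys = set(request_ctx) | set(state_ctx)
    if not keys:
        return 0.0
    matches = sum(
        1
        for key in keys
        if key in request_ctx and key in state_ctx and request_ctx[key] == state_ctx[key]
    )
    return matches / len(keys)


# Hypothetical context data for a request and for two dialog states.
request_ctx = {"screen_signature": "toppings_page", "dialog_state_id": "3"}
states_ctx = {
    "state 2": {"screen_signature": "size_page", "dialog_state_id": "2"},
    "state 3": {"screen_signature": "toppings_page", "dialog_state_id": "3"},
}

# One similarity score per dialog state, then pick the highest-scoring state.
scores = {state: context_similarity(request_ctx, ctx) for state, ctx in states_ctx.items()}
best_state = max(scores, key=scores.get)
```

With these made-up values, state 3 matches on both fields and state 2 on neither, so state 3 is selected.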
utterance position estimation unit specifies a state indicated by the received utterance based on the received utterance (the dialog state identifier in paragraph 52, in particular, appears to be sent with the request as a separate part of the request and thus appears to be specified before the system receives and processes the request [see e.g. paragraph 63] and is not based on the utterance [i.e. the audio of the request]) and where the utterance position estimation unit collates the specified state of the utterance (i.e. the state that is indicated by the received utterance and which is specified based on the received utterance) with the label and the topic of each of the plurality of nodes.
Aleksic also teaches, in paragraph 52, where a speech recognizer may provide an indication of a dialog state identifier that corresponds to a given request to the user device along with the transcription result (at least suggesting that a speech recognizer can determine a dialog state identifier).  In this example, the speech recognizer provides a transcription result and a dialog state identifier associated with a first request, and then a subsequent/second request includes the dialog state identifier associated with the first request, which can be used by the speech recognizer to determine a dialog state for the second/subsequent request.  Therefore, Aleksic suggests where a speech recognizer [part of the voice response unit in Horioka] can determine a dialog state identifier, but in Aleksic that dialog state identifier is not indicated by the received utterance and specified based on the received utterance which is determined to be most related to the estimated dialog state (the current dialog state in the rejection of claim 1).
	In the following reference Bui et al., paragraph 55 describes a dialog state tracker that analyzes utterances that are represented by computer readable text to 
	In this reference, it appears that the determined dialog state is analogous to the determined dialog state corresponding to a voice input (i.e. determined based on the comparison of n-grams and/or context data), and not to the dialog state identifier that is used in the comparison.  It is, therefore, not clear that one of ordinary skill in the art would use the dialog state determination in Bui to determine, using the utterance position estimation unit, the dialog state identifier that is collated/compared to dialog states’ dialog state identifiers.
2017/0228366 “In managing dialog session 130, session manager 122 provides inputs of dialog session 130 to dialog state tracker 118 and receives outputs comprising dialog states of dialog session 130 from dialog state tracker 118. The inputs include the utterances that are represented by computer readable text (e.g., utterances 132), which dialog state tracker 118 analyzes to determine dialog states (e.g., later stored as history of states 134 and current state 136) of the utterances”, paragraph 55; 
2017/0364323 teaches where a dialog processing server sends a response message including a dialog state identifier and text of a user’s utterance (paragraph 68) where the dialog state name is “Search_spot (Kyoto)” (paragraph 68) which appears to be based on the user’s utterance “Tell me sightseeing spots in Kyoto” (paragraph 66).  The sequence identifier for the “Narrow down to the Arashimaya area” utterance is determined to be the same as that for the “Tell me sightseeing spots in Kyoto” utterance (since the utterances are in the same utterance group).  Figure 12 and paragraphs 97-101 further describe other dialog states determined for utterances like “Show me hotels”.  In this reference, similar to Bui, it appears that the dialog state name is more analogous to the dialog state that is determined for an utterance in Aleksic, and not the dialog state identifier in the request which is compared/collated.
The following reference is similar to the previous reference.
2017/0337036 “The dialogue receiver 202 receives the user's utterance U2 and performs speech recognition, and converts the user's utterance into text. The target determiner 204 refers to the dialogue information stored in the dialogue information storage 203, and determines a dialogue state as a target of the user's utterance U2. The dialogue state to become a target is a dialogue state with the display status flag "1", and herein, the dialogue state with the dialogue status identifier "1" is determined as a target dialogue state”, paragraph 72;
The following reference describes, for tuning a speech recognition process,  maintaining a database of utterances, where information associated with the utterances 
7069513 col. 9, lines 16-45;
The following reference teaches comparing information in an out-of-band signal including a serialized form of a recognition result and an indication of a present dialog state to an expected recognition result and an expected dialog state, and logging a mismatch if any information does not match their counterpart.
2006/0224392 “Upon answering the call, speech application 308 plays a prompt, "Welcome, where would you like to fly?" In addition to playing the prompt, an out-of-band signal indicative of the prompt and the present dialog state is sent to the testing application 312. Testing application 312 interprets the out-of-band signal and compares the present dialog state with an expected dialog state. Additionally, call answer latency as well as QoS measures can be logged based on the information sent from speech application 308”, paragraph 63; “In addition to the prompt, an out-of-band signal is sent with a serialized form of the recognition result, an indication of the prompt that is played and an indication of the present dialog state. The out-of-band signal can be sent using any application communication interface, for example using SIP INFO or other mechanisms such as NET Remoting. The information in the out-of-band signal is compared to an expected recognition result, in this case "Seattle" and "Boston, an expected prompt and an expected dialog state. If any of the information sent does not match their expected counterparts, testing application 312 can log the mismatch along 
The following reference teaches “estimating a dialog state from an utterance” but was published in 2018 and therefore does not qualify as prior art.
Kim, A., Song, H., & Park, S. (2018). “A two-step neural dialog state tracker for task-oriented dialog processing”. Computational Intelligence and Neuroscience, 2018, NA. Retrieved from https://dialog.proquest.com/professional/docview/2225569324?accountid=131444
The prior art teaches “A spoken dialog system stores a history of dialog states in a memory, outputs a system response in a current dialog state, inputs a user utterance, performs speech recognition of the user utterance, to obtain one or a plurality of recognition candidates of the user utterance and likelihoods thereof with respect to the user utterance, calculates a degree of state conformance of each of the current and the preceding dialog states stored in the memory with respect to the user utterance, selects one of the current and the preceding dialog states and one of the recognition candidates based on a combination of the degree of state conformance of each dialog state and the likelihood of each recognition candidate, and performs transition from the current dialog state to a new dialog state based on dialog state selected and recognition candidate selected” (Abstract).  This reference teaches calculating a degree of state conformance of each of the current and the preceding dialog states stored in the memory with respect to the user utterance (paragraph 19).  Paragraphs 81-83 (and Equations 1 and 2) appear to describe where degree of state conformance is based on response time.  
2008/0201135 “These problems are ascribed to the estimation of a dialog state from only an input time in reference 1 and to the estimation of a dialog state from only input contents in reference 2. In order to accept a correction input by the user, it is necessary to perform input interpretation by comprehensively handling both the estimation of input contents and the estimation of a dialog state on which the input acts”, paragraph 12; 
The following reference teaches analyzing a newly input utterance from the user and updating a dialog state of the user based on a newly input utterance.
2018/0075847 “The task state database 150 may include task states, e.g. in form of task lineages, of different users of the web-based conversational agent 140. The web-based conversational agent 140 may keep tracking the dialog state or task state of a user, by analyzing a newly input utterance from the user and updating the dialog state of the user based on the newly input utterance, e.g. by extending a dialog lineage of the user with newly estimated tasks requested by the user based on the newly input utterance”, paragraph 44;
The following reference teaches comparing slot fills and queued events with a dialog state column to determine a best match among eligible states.
2006/0212515 “filtering process for an eligible state set 750 generated in accordance with FIG. 7A is schematically illustrated in FIG. 7B. The eligible state set 
The prior art teaches determining whether a user’s utterance correlates to a language model corresponding to a particular dialog state.
7162421 “If the user's utterance correlates to a language model corresponding to a particular dialog state, the natural language understanding 335 translates the user's utterance in order to determine whether the sequence of words correlates to a possible meaning for the current state of the speech recognition system”
	The prior art teaches identifying tags in a spoken language input and where a dialog state belief tracking system identifies entities, attributes, and relationships that match at least some of the tags and creates a state graph based on matched entities and attributes.
2016/0163311 “The spoken language system is operable to receive a spoken language input, identify tags within the spoken language input, and communicate with the dialogue state belief tracking system. The dialogue state belief tracking system is operable to communicate with the spoken language system and to search a knowledge 
	The prior art teaches utterance states detected by an utterance state detector, and determining a conversation state among a plurality of users based on the utterance states detected by the utterance state detector.
2007/0150274 “a determination unit that determines a conversation state among a plurality of users of the transmission devices, on a basis of the utterance states detected by the utterance state detector of the at least one of the reception devices”, claim 2;
2008/0201133 teaches matching between an input user text utterance and IVR dialog state category set descriptions.  This reference does not appear to be matching the input user text utterance to the topic/label of the IVR dialog state.  This reference appears to use the matching to determine the category pertaining to the user utterance.  Paragraph 23 appears to describe matching the utterance with tasks like “account balance”.  This reference appears to teach determining correspondence between input utterance words and words for IVR dialog states/prompts (paragraph 18).  This reference does not specifically describe relating an utterance with a state (it describes tasks, which can be part of a state, but does not specifically state that the utterance is determined to correspond to the state).  Paragraph 18 in particular describes where the 3 tasks are all part of the same state, and so the categorization does not appear to estimate that the user utterance is related to a particular state (as opposed to a particular task that is available given a plurality of states).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
The prior art teaches a call center system where users and representatives conduct simultaneous voice and joint browsing sessions.
6295551 “Call center system where users and representatives conduct simultaneous voice and joint browsing sessions”, Title;
The prior art teaches permitting agents to interact with multiple callers concurrently.
9172805 “increase overall efficiency of call centers by permitting agents 120 to interact with multiple callers 102, 104 concurrently”
	The prior art describes displaying a dialog log combining information from two different dialogs.
2009/0222507 Figure 8 paragraphs 66-68.
The prior art teaches collating an input phrase with an example phrase.
6161083 “Because the probability P(Distort.vertline.E,I) is the probability of a set of word transforming operators Distort being used to collate an input phrase I with an 
The prior art teaches displaying conversation progress (i.e. displaying messages communicated by conversation participants in sequence, not a conversation tree).
	2002/0049805 “FIG. 22 illustrates a chat room that members with different nationality participate in. The names of the members such as "tom" are displayed in a member field 3122 and their conversation progresses in a main field 3120. A field 3124 for the member to enter an utterance and a submit button 3126 to send the utterance are provided at the bottom. In addition, a "other languages" button 3128 is provided”, paragraph 135; Figure 22;
The prior art teaches displaying a tree structure that displays a structure of semantic data categories.
	5918222 Figure 114B; “By interpreting the contents of the message 2119, the values of the attributes "title" and "writer" of the presented data information can be estimated, so that these values (2603 in FIG. 144A) are corrected, and the corrected values are presented. By operating a button (2604 in FIG. 144A), so that a window in which the structure of the semantic data categories is displayed by a tree structure in FIG. 114B) is displayed. In selecting a semantic category of data to be added, a relationship with other categories can be easily grasped”
2008/0256063, in Figure 11, depicts an “event decision tree generated by a decision tree generator” describing possibilities for conversations (i.e. a “conversation tree”).  Paragraphs 74-75 describe where a display may display information for the user based on such a binary decision tree, and where Figure 12 is an example of the display.  Figure 12 seems to indicate, however, that the event decision tree is not displayed.
	2008/0167914 describes providing assistance information (e.g. suggestions for what a receptionist should do when a customer says a particular thing, see paragraph 53).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249.  The examiner can normally be reached on M-F 9:00AM -5:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached on (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  

EY 1/16/2021
/ERIC YEN/Primary Examiner, Art Unit 2658