DETAILED ACTION
Claims 1-7, 10, and 15-19 have been examined.  Claims 11-14 were canceled in a preliminary amendment dated 5/10/2019.  Claim 9 was canceled in amendment dated 12/23/2021.  Claim 8 was canceled in amendment dated 5/2/2022.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-7, 10, 15, 16, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Prange et al. “A Multimodal Dialogue System for Medical Decision Support in Virtual Reality” (hereinafter Prange), in view of Ross et al. (US 2002/0133355, hereinafter Ross), further in view of Sonntag et al. “Design and Implementation of a Semantic Dialogue System for Radiologists” (hereinafter Sonntag).

As per claim 1, Prange teaches the invention as claimed, including a computing system configured to conduct conversations with a user regarding review and analysis of medical data, the computing system comprising: 
a computer processor configured to execute software instructions (see at least page 23, left column, paragraphs 1, 2, page 24, Figure 1); 
an audio input device configured to provide an audio input to the computer processor (see at least page 23, left column, paragraphs 1, 2, page 24, Figure 1);
one or more tangible non-transitory computer readable medium (see at least page 23, left column, paragraphs 1, 2, page 24, Figure 1) storing: 
a medical knowledge database storing medical information regarding a plurality of clinical data elements (i.e., patient data, diagnoses, procedures, laboratory results, medications, see at least page 24, section 2.3, pages 24-26, section 3); 
a messaging backbone configured to receive messages from message producers and provide messages to message consumers (i.e., proxy server manages and relays the cross-platform communication between the different devices, see at least page 23, right column, paragraph 1, page 24, Figure 1); 
a plurality of compute engines including at least software code configured for execution by the computer processor (see at least page 23, left column, paragraphs 1, 2), the compute engines including at least: 
a conversation manager configured to dynamically update a state of a conversation between the user and the computing system (i.e., context modelled as discourse memory, see at least page 25, right column, paragraph 2, page 26, left column, paragraph 1); 
a conversation engine configured to determine an intent of action of the user (i.e., dialogue system including speech recognition using a grammar, determining of the user’s intention, see at least pages 24-26, section 3); 
a dispatcher configured to determine and initiate conversation actions based on the intent of action, context of the viewing activity by the user including medical images and data previously provided to the user as a result of a conversation action, information in the medical knowledge database, and the state of the conversation (i.e., dialogue model is based on finite-state machines, mapping of user intentions to matching multimodal system reactions, medical images are displayed on the screen and can be annotated by the radiologist, display patient records, zooming in on patient data, rule-based anaphora resolution, see at least pages 23-24, section 2.1, page 24-25, section 3),
wherein the conversation actions include providing speech output to the user (i.e., realization of multimodal output (speech output) is coordinated by SiAM-dp’s presentation planning component, see at least pages 24-26, section 3).
Prange does not explicitly teach an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input. 
Ross teaches an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input (i.e., speech recognition products that convert speech into text string, see at least [0002]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to include an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input as similarly taught by Ross because it is well known that speech may be converted to text to be analyzed and speech recognition engines known in the art convert speech into text (see at least [0002], [0063] of Ross).
Prange does not explicitly teach the medical knowledge database including correlations between respective medical symptoms and medical diagnoses, determining conversation actions based on the textual input, or automatically selecting and displaying two or more medical images for comparison by the user.
Sonntag teaches medical knowledge database including correlations between respective medical symptoms and medical diagnoses (i.e., all semantic descriptions are stored in a knowledge base and efficiently linked to publications that are relevant in the context of particular symptoms of the first diagnosis); 
determine conversation actions based on textual input from audio input (i.e., user input is speech, radiologist request to annotate Hodgkin-Lymphoma and the system annotates the image with RDF annotations, see at least page 11, Figure 3, pages 16-18, section 4.2), the conversation actions include automatically selecting and displaying two or more medical images for comparison by the user (i.e., opens the first hit and his images that correspond to the search, the system rearranges the SIEs for the two patients for a comparison, see at least pages 16-18, section 4.2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange such that the medical knowledge database includes correlations between respective medical symptoms and medical diagnoses as similarly taught by Sonntag because Prange teaches storing patient data including diagnoses, and it would have been obvious that symptoms are also stored as Prange’s medical dialogue system facilitates support for deciding which therapy is most suitable, which can be based on symptom information.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to determine conversation actions based on the textual input and automatically selecting and displaying two or more medical images for comparison by the user as similarly taught by Sonntag because the dialogue system of Prange displays medical images to assist in medical decision support (see at least page 23, left column, abstract, section 1, page 24, left column, paragraph 1 of Prange) and performing conversation actions based on textual input from audio and selecting and displaying two or more medical images for comparison helps medical image interpreters perform diagnostic analysis of images (see at least page 18, paragraph 1 of Sonntag).

As per claim 3, Prange does not explicitly teach wherein the dispatcher is configured to determine a time delay for implementing the determined conversation action.
Ross teaches a dispatcher configured to determine a time delay for implementing a determined conversation action (i.e., provide audible rendering of the responses in a delivery mode subject to selection by the user, the delivery mode is a delayed delivery mode, see at least [0015], [0058], [0070], [0072], [0074], [0075]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange such that the dispatcher is configured to determine a time delay for implementing the determined conversation action as similarly taught by Ross such that the response can be deferred based on user activity or user request (see at least [0070], [0072], [0074], [0075] of Ross).

As per claim 4, Prange does not explicitly teach wherein the time delay is based on user preferences for receiving audio feedback.
Ross teaches wherein the time delay is based on user preferences for receiving audio feedback (i.e., provide audible rendering of the responses in a delivery mode subject to selection by the user, the delivery mode is a delayed delivery mode, see at least [0015], [0058], [0070], [0071], [0074], [0075]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange such that the time delay is based on user preferences for receiving audio feedback as similarly taught by Ross such that the response can be deferred based on user request (see at least [0070], [0072], [0074], [0075] of Ross).

As per claim 5, Prange does not explicitly teach wherein the time delay ends in response to a predetermined user activity or viewing context.
Ross teaches wherein the time delay ends in response to a predetermined user activity or viewing context (i.e., user can explicitly request the turn manager to resume notifications, see at least [0015], [0058], [0070], [0071], [0074], [0075]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange such that the time delay ends in response to a predetermined user activity or viewing context as similarly taught by Ross such that the user can explicitly inquire whether there is any queued up speech output in the speak queue (see at least [0070], [0072], [0074], [0075] of Ross).

As per claim 6, Prange teaches wherein the audio feedback includes a medical suggestion relevant to the viewing context (i.e., recommendation for therapy, see at least page 24, section 2.3, page 25, section 3.1).

As per claim 7, Prange teaches wherein the computing system is a PACS (see at least page 23, section 2).

As per claim 10, Prange teaches wherein at least one of the conversation actions indicates a possible syndrome (i.e., navigation inside patient records, therapy prediction, see at least page 25).

As per claim 15, Prange teaches the invention as claimed, including a computing system comprising: 
a picture archiving and communication system (PACS) configured to view medical images (see at least page 23, section 2); 
a medical knowledge base providing information regarding a patient, an exam, and general medical information (i.e., patient data, diagnoses, procedures, laboratory results, medications, see at least page 24, section 2.3, pages 24-26, section 3);
a conversation manager configured to dynamically update a state of a conversation between the user and the computing system (i.e., context modelled as discourse memory, see at least page 25, right column, paragraph 2, page 26, left column, paragraph 1);
a conversation engine configured to determine an intent of action of the user (i.e., dialogue system including speech recognition using a grammar, determining of the user’s intention, see at least pages 24-26, section 3); 
a dispatcher configured to determine image management actions based on the intent of action, context of the viewing activity by the user including medical images and data previously provided to the user as a result of an image management action, information in the medical knowledge database, and the state of the conversation (i.e., dialogue model is based on finite-state machines, mapping of user intentions to matching multimodal system reactions, medical images are displayed on the screen and can be annotated by the radiologist, display patient records, zooming in on patient data, rule-based anaphora resolution, see at least pages 23-24, section 2.1, page 24-25, section 3), and initiate image management actions at a time determined by user mouse speech actions, past user behavior, and the medical knowledge, wherein the image management actions include providing speech output to the user (i.e., realization of multimodal output (speech output) is coordinated by SiAM-dp’s presentation planning component, context resolved from previous questions, input consisting of a speech input and a corresponding pointing gesture, see at least pages 24-26, section 3). 
Prange does not explicitly teach an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input.
Ross teaches an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input (i.e., speech recognition products that convert speech into text string, see at least [0002]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to include an utterance ingestion engine configured to convert portions of the audio input to corresponding textual input as similarly taught by Ross because it is well known that speech may be converted to text to be analyzed and speech recognition engines known in the art convert speech into text (see at least [0002], [0063] of Ross).
Prange does not explicitly teach determine image management actions based on the textual input, automatically selecting and displaying two or more medical images for comparison by the user.
Sonntag teaches determine image management actions based on textual input from audio input (i.e., user input is speech, radiologist request to annotate Hodgkin-Lymphoma and the system annotates the image with RDF annotations, see at least page 11, Figure 3, pages 16-18, section 4.2), the image management action includes automatically selecting and displaying two or more medical images for comparison by the user (i.e., opens the first hit and his images that correspond to the search, the system rearranges the SIEs for the two patients for a comparison, see at least pages 16-18, section 4.2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to determine image management actions based on the textual input and automatically selecting and displaying two or more medical images for comparison by the user as similarly taught by Sonntag because the dialogue system of Prange displays medical images to assist in medical decision support (see at least page 23, left column, abstract, section 1, page 24, left column, paragraph 1 of Prange) and performing image management actions based on textual input from audio and selecting and displaying two or more medical images for comparison helps medical image interpreters perform diagnostic analysis of images (see at least page 18, paragraph 1 of Sonntag).

As per claim 16, Prange does not explicitly teach wherein the time is determined based on understanding of a semantic interpretation of the language spoken by the user of the PACS.
Ross teaches time is determined based on understanding of the semantic interpretation of the language spoken by the user (i.e., user can explicitly request the turn manager to defer notification with phrases “No interruptions” or “Not Now,” later user can explicitly request the turn manager to resume notification with phrases like “Do you have anything for me?” or “Go ahead.”, see at least [0075]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange such that the time is determined based on understanding of the semantic interpretation of the language spoken by the user of the PACS as similarly taught by Ross such that the response can be delivered or deferred based on user’s request (see at least [0070], [0072], [0074], [0075] of Ross).

As per claim 18, Prange does not explicitly teach wherein the computer processor is configured to maintain a prioritized listing of possible conversation actions, and display portions of the prioritized listing in response to user input.
Ross teaches maintain a prioritized listing of possible conversation actions (i.e., maintain a prioritized list of responses, see at least [0058]), and
display portions of the prioritized listing in response to user input (see at least [0064], [0065], [0070]-[0083]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to maintain a prioritized listing of possible conversation actions, and display portions of the prioritized listing in response to user input as similarly taught by Ross in order to prioritize responses (see at least [0064], [0065], [0076]-[0083] of Ross).

As per claim 19, this is the system claim of claim 18.  Therefore, claim 19 is rejected using the same reasons as claim 18.

Claims 2 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Prange, in view of Ross, further in view of Sonntag, further in view of Pasupalak et al. (US 2017/0228367, hereinafter Pasupalak).

As per claim 2, Prange does not explicitly teach a machine learning engine configured to analyze the determined intent of action of the user, the audio feedback provided to the user, and actions performed by the user subsequent to receiving the speech output; and determine, based on the analysis, updates to an intent of action determination model used by the conversation engine in determining intent of action of the user.
Pasupalak teaches a machine learning engine configured to analyze the determined intent of action of the user, the audio feedback provided to the user, and actions performed by the user subsequent to receiving the speech output (i.e., learning manager for updating, training, any of the modules used by the conversational agent, determine user intention is incorrect based on user response to output, see at least [0249]-[0253]);
determine, based on the analysis, updates to an intent of action determination model used by the conversation engine in determining intent of action of the user (i.e., learning manager for updating, training, any of the modules used by the conversational agent, see at least [0249]-[0253]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to include a machine learning engine configured to analyze the determined intent of action of the user, the audio feedback provided to the user, and actions performed by the user subsequent to receiving the speech output; and determine, based on the analysis, updates to an intent of action determination model used by the conversation engine in determining intent of action of the user as similarly taught by Pasupalak such that behavior patterns and preferences of a user can be learned over time to improve the operation of a conversational agent (see at least [0074], [0249]-[0253] of Pasupalak).

As per claim 17, Prange does not explicitly teach a deep learning component configured to analyze user behavior and update logic used by the conversation action engine correspondingly.
Pasupalak teaches a deep learning component configured to analyze user behavior and update logic used by the conversation action engine correspondingly (i.e., learning manager for updating, training, any of the modules used by the conversational agent, determine user intention is incorrect based on user response to output, see at least [0249]-[0253]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Prange to include a deep learning component configured to analyze user behavior and update logic used by the conversation action engine correspondingly as similarly taught by Pasupalak such that behavior patterns and preferences of a user can be learned over time to improve the operation of a conversational agent (see at least [0074], [0249]-[0253] of Pasupalak).

Response to Arguments
Rejection of claims under §103: 
As per claim 1, Applicant’s arguments directed to Brown have been fully considered, but are moot in light of the new grounds of rejection. 

As per claim 18, Examiner disagrees with Applicant’s statement that it recites subject matter that is not taught in the cited references. Prior art Ross teaches the features recited in claim 18, as detailed in the rejection of claim 18 above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jue Louie whose telephone number is 571-270-1655.  The examiner can normally be reached on M-F 9:30 am - 5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen can be reached on 571-272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Jue Louie/
Primary Examiner
Art Unit 2121