DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 07/07/2022 has been entered.
Remarks
Claims 1-6, 9-16, 18-21, 23, and 26-28 are pending in the instant application.
Claims 1, 10, 11, 19, and 27 have been amended. Newly submitted claim 28 has been entered. No new matter was found.
Claims 1-6, 9-16, 18-21, 23, and 26-28 presently stand rejected.
 

Response to Arguments
Applicant’s arguments in combination with amendments, see remarks and claims, filed 07/07/2022, with respect to the rejections of claims 1-3, 5-7, 9-12, 15-21, 23, and 26 under 35 USC 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of US Pat Pub No 20190090969A1 granted to Jarc et al. (previously presented), in view of US Pat Pub No 20190122406 granted to Wang, in further view of US Pat Pub No 20190355149 granted to Avendi et al. for claims 1-3, 5-6, 9-12, 15-16, 18-21, 23, 26, and 28. 
Dependent claims 4 and 13-14 are rejected in view of the references above and in yet further view of US Patent No. 9788907B1 issued to Alvi et al. (previously presented).
With regard to claim 27, beginning on page 14, the applicant argues that Jarc does not refer to estimating a remaining time of a surgical procedure. This argument has been fully considered and is persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of US Pat Pub No 20190090969A1 granted to Jarc et al. (previously presented), in view of US Pat Pub No 20190122406 granted to Wang, and in yet further view of US Pat Pub No. 20150057646 granted to Aljuri et al.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 5-6, 9-12, 15-16, 18-21, 23, 26, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication Number 20190090969A1 granted to Jarc et al. (hereinafter “Jarc” – Previously presented, also published as WO2017083768A1. A copy of this reference is included in the IDS filed by the Applicant on 08/13/2019), in view of US Pat Pub No. 20190122406 granted to Wang (hereinafter “Wang”), and in yet further view of US Pat Pub No. 20190355149 granted to Avendi et al. (hereinafter “Avendi”). 
Regarding claim 1, Jarc discloses a system for surgery (para 0010, para 0043 “the system…for performing surgical procedure” Fig. 1, 5A), comprising: a controller (Fig. 4, para 0043 “processor”) including logic that when executed by the controller causes the system to perform operations (para 0043, 0076, 0083 “Engines may be hardware, software, or firmware communicatively coupled to one or more processors in order to carry out the operations described herein”), including: receiving first images of a surgical procedure (para 0047 “capture images of a surgical site and output the captured images to a computer processor”); analyzing the first images with the controller executing a machine learning model to identify a surgical step in the surgical procedure (para 0008 “determine a current stage”, para 0048 “image processing”, para 0087 “determine the stage or segment of a surgical procedure”, para 0110 “a parallel-running simulation may be autonomously analyzed to determine the current surgical stage using segmenter 1004,” and para 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein an architecture of the machine learning model includes: a convolutional neural network (CNN) to identify surgical steps, including the surgical step, in the first images and output feature vectors indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images (para 0101 “…feature extractor 1202”; it is noted that “indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images” is considered to be the same as the “output feature vector”. The claim has not positively recited providing/displaying the “probability”; rather, the claim only requires the feature vectors to be indicative of the probability); and a recurrent neural network (RNN) linked to receive the feature vectors as an input (para 0101 “feature extractor 1202 applies filtering or other selection criteria to a sequence of assessed tasks and their corresponding parameters to create a feature vector as the input to RNN 1204.”), and output refined feature vectors with refined probabilities for identifying the surgical steps (para 0180 “includes a confidence measurement engine to determine a confidence score representing a probability of correct segmentation determination.”); and outputting second images related to the surgical step for display in response to identifying the surgical step (para 0072, 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein the second images include at least one of a diagram of human anatomy relevant to the surgical step, a preoperative image relevant to the surgical step, an intraoperative image relevant to the surgical step, or an annotated image of one of the first images (para 0140 “loading of pre-op images for the current/upcoming step from current patient or related patients”, fig. 5C).

Jarc discloses using recurrent neural networks (RNN) and convolutional neural network (CNN), but fails to disclose a recurrent neural network (RNN) linked to the CNN to receive the feature vectors from the CNN as an input; wherein the RNN is trained to identify temporal patterns in the feature vectors received from the CNN across multiple ones of the first images and output refined feature vectors with refined probabilities.

Wang teaches a similar system and method for presentation generation for medical images using deep learning technology in combination with convolutional neural network technology and recurrent neural network technology (para 0029). The system and method include feeding captured medical images (para 0023) into a convolutional neural network unit, which is configured to extract image features of the medical images, transform them into image feature vectors, and output them to a first vector space (para 0026). A recurrent neural network unit 103 is configured to determine and output semantic feature vectors corresponding to the image feature vectors according to the correspondence between image feature vectors contained in the pre-established first vector space and the matching semantic feature vectors contained in the second vector space (para 0027). This allows 2D medical images to be translated into corresponding natural language through the presentation generating system to help the doctor further diagnose diseases, which achieves simpler and easier reading and analysis of medical images, improving reading efficiency while improving reading quality and drastically reducing the probability of misdiagnoses (para 0029). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc with the teachings of Wang to translate 2D medical images into corresponding natural language through the presentation generating system to provide the predictable result of improving reading quality and drastically reducing the probability of misdiagnoses.
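For illustration only, the arrangement discussed above (per-image feature vectors of step probabilities passed to a recurrent stage that refines them across successive images) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the Applicant, Jarc, or Wang: the function names `refine_probabilities` and `identify_steps` and the parameter `alpha` are hypothetical, the per-image probabilities are supplied directly rather than produced by a CNN, and a simple exponential-smoothing recurrence is swapped in for a trained RNN.

```python
from typing import List, Sequence


def refine_probabilities(
    frame_probs: Sequence[Sequence[float]], alpha: float = 0.5
) -> List[List[float]]:
    """Refine per-image step probabilities using a simple recurrence.

    Assumption: this is exponential smoothing, a stand-in for a trained RNN.
    Each refined vector depends on the current image and on the running state
    carried over from the earlier images.
    """
    refined: List[List[float]] = []
    state: List[float] = []
    for probs in frame_probs:
        if not state:
            # First image: no history yet, so start from its probabilities.
            state = list(probs)
        else:
            # Blend the carried-over state with the current image's vector.
            state = [alpha * s + (1.0 - alpha) * p for s, p in zip(state, probs)]
        total = sum(state)
        if total > 0:
            # Renormalize so the refined vector still sums to one.
            state = [s / total for s in state]
        refined.append(list(state))
    return refined


def identify_steps(refined: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the most probable step for each image."""
    return [max(range(len(p)), key=p.__getitem__) for p in refined]


# Hypothetical per-image probabilities for two surgical steps.
raw = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]
steps = identify_steps(refine_probabilities(raw, alpha=0.5))
```

With `alpha = 0.5`, the hypothetical inputs above yield step 0 for all three images, whereas selecting the most probable step per image in isolation would label the middle image as step 1.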

Jarc as modified by Wang renders the limitations above obvious but fails to explicitly disclose wherein the first images include a sequence of video frames, and wherein adjacent frames included in the sequence of video frames are temporally segmented from one another by one second or longer.

Avendi teaches a similar system and method for navigating anatomical objects in medical images, in which a deep learning network may include one or more deep convolutional neural networks (CNNs), one or more recurrent neural networks, or any other suitable neural network configurations (para 0062). The recurrent convolutional neural network 90 can process real-time images 46 over a span of time 88 and can identify the major landmarks present in identified scenes from the images 46, which can be output to the navigation system 94 in the forms of text, speech, numbers, etc. Using a recurrent convolutional neural network 90 can ensure that a history of previously processed frames (such as the data set of images 84) is stored so that the temporal correlation of videos/images can be extracted for more accurate detection and tracking (para 0060; it is noted that the “video” is understood to include frames that are segmented or separated by more than one second). This allows the system and method to automatically detect and identify the scenes from the anatomical region …; automatically map each of the plurality of real-time two-dimensional images …via the deep learning network; and provide directions to the user (para 0006). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc as modified by Wang with the teachings of Avendi to provide real-time monitoring by correlating videos/images to provide the predictable result of automatically detecting and identifying the scenes from the anatomical region …; automatically mapping each of the plurality of real-time two-dimensional images …via the deep learning network; and providing directions to the user.
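For illustration only, the frame-spacing limitation addressed above (adjacent frames temporally separated by one second or longer) can be sketched as a selection of frame indices from a video of known frame rate. This is a minimal sketch under stated assumptions and is not taken from the instant application or from Avendi: the function name `sample_frame_indices` and the parameters `num_frames`, `fps`, and `min_gap_s` are hypothetical.

```python
import math
from typing import List


def sample_frame_indices(
    num_frames: int, fps: float, min_gap_s: float = 1.0
) -> List[int]:
    """Pick frame indices so adjacent picks are at least min_gap_s apart.

    Assumption: the video has a constant frame rate of fps frames per second.
    """
    if fps <= 0 or min_gap_s <= 0:
        raise ValueError("fps and min_gap_s must be positive")
    # Smallest whole number of frames spanning at least min_gap_s seconds.
    step = max(1, math.ceil(fps * min_gap_s))
    return list(range(0, num_frames, step))


# Hypothetical example: 100 frames captured at 30 frames per second.
indices = sample_frame_indices(100, fps=30.0)
timestamps = [i / 30.0 for i in indices]
```

Rounding the step up, rather than down, keeps the gap at or above the minimum when the frame rate is not a whole number (for example, 29.97 frames per second gives a step of 30 frames, slightly more than one second).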

Regarding claim 2, Jarc as modified by Wang and Avendi (hereinafter “modified Jarc”) renders the system of claim 1 obvious as recited hereinabove, Jarc further comprising a plurality of arms coupled to the controller and configured to hold surgical instruments (para 0049 “surgical procedures using one or more mechanical support arms 510”); and a tactile user interface coupled to the controller (Fig. 1-5a, para 0065 “graphic user interface overlaid onto the displayed video”), wherein the controller further includes logic that when executed by the controller causes the system to perform operations, including: in response to receiving user input from the tactile user interface, manipulating the plurality of arms (para 0006 “The TSS generally includes a surgeon input interface that accepts surgical control input for effecting an electromechanical surgical system to carry out a surgical procedure”).


Regarding claim 3, modified Jarc renders the system of claim 1 obvious as recited hereinabove, Jarc comprising: a microphone coupled to the controller to send voice commands from a user to the controller; and a speaker coupled to the controller to output audio (Para 0065 “the graphic user interface can include a QWERTY keyboard, a pointing device such as a mouse and an interactive screen display, a touch-screen display, or other means for data or text entry or voice annotation/or speech to text conversion via a microphone and processor.”).  

Regarding claim 5, modified Jarc renders the system of claim 1 obvious as recited hereinabove, Jarc further comprising annotating the one of the first images to form the annotated image by at least one of highlighting a piece of anatomy, highlighting the location of a surgical step, or highlighting where a surgical instrument should be placed (para 0065-0066 “highlights or annotates certain patient anatomy shown in the displayed video using an input device of surgeon's console 52”).

Regarding claim 6, modified Jarc renders the system of claim 1 obvious as recited hereinabove, Jarc further comprising wherein the preoperative image includes a magnetic resonance image, computerized tomography scan, or an X-ray (para 0108).

Regarding claim 9, modified Jarc renders the system of claim 1 obvious as recited hereinabove, Jarc further comprising wherein the controller further includes logic that when executed by the controller causes the system to perform operations, including: estimating a remaining duration of the surgical procedure, in response to identifying the surgical step (para 0116, 0138).

Regarding claim 10, modified Jarc renders the system of claim 1 obvious as recited hereinabove, Jarc further comprising an image sensor to capture the first images (para 0042), and wherein the image sensor is disposed in an endoscope and the endoscope is coupled to the controller (Fig. 5A, para 0042 “endoscope that includes a camera to view a surgical site within a patient's body”).  

Regarding claim 11, Jarc discloses a method for operating a surgical system (para 0010, para 0043 method for performing the operation of “the system…for performing surgical procedure” Fig. 1, 5A), comprising: receiving first images of a surgical procedure (para 0047 “capture images of a surgical site and output the captured images to a computer processor”); identifying, in the first images, a surgical step in the surgical procedure using a controller executing a machine learning model (para 0008 “determine a current stage”, para 0048 “image processing”, para 0087 “determine the stage or segment of a surgical procedure”, para 0110 “a parallel-running simulation may be autonomously analyzed to determine the current surgical stage using segmenter 1004,” and para 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein an architecture of the machine learning model includes: a convolutional neural network (CNN) to identify surgical steps, including the surgical step, in the first images and output feature vectors indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images (para 0101 “…feature extractor 1202”; it is noted that “indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images” is considered to be the same as the “output feature vector”. The claim has not positively recited providing/displaying the “probability”; rather, the claim only requires the feature vectors to be indicative of the probability); a recurrent neural network (RNN) linked to receive the feature vectors as an input (para 0101 “feature extractor 1202 applies filtering or other selection criteria to a sequence of assessed tasks and their corresponding parameters to create a feature vector as the input to RNN 1204.”), and output refined feature vectors with refined probabilities for identifying the surgical steps (para 0180 “includes a confidence measurement engine to determine a confidence score representing a probability of correct segmentation determination.”), and in response to determining the surgical step, outputting second images related to the surgical step (para 0072, 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”) for display, wherein the second images include at least one of a diagram of human anatomy relevant to the surgical step, a preoperative image relevant to the surgical step, an intraoperative image relevant to the surgical step, or an annotated image of one of the first images (para 0140 “loading of pre-op images for the current/upcoming step from current patient or related patients”, fig. 5C, para 0043 “display to the surgeon 18 through the surgeon's console 16”).

Jarc discloses using recurrent neural networks (RNN) and convolutional neural network (CNN), but fails to explicitly disclose a recurrent neural network (RNN) linked to the CNN to receive the feature vectors from the CNN as an input; wherein the RNN is trained to identify temporal patterns in the feature vectors received from the CNN across multiple ones of the first images and output refined feature vectors with refined probabilities.

Wang teaches a similar system and method for presentation generation for medical images using deep learning technology in combination with convolutional neural network technology and recurrent neural network technology (para 0029). The system and method include feeding captured medical images (para 0023) into a convolutional neural network unit, which is configured to extract image features of the medical images, transform them into image feature vectors, and output them to a first vector space (para 0026). A recurrent neural network unit 103 is configured to determine and output semantic feature vectors corresponding to the image feature vectors according to the correspondence between image feature vectors contained in the pre-established first vector space and the matching semantic feature vectors contained in the second vector space (para 0027). This allows 2D medical images to be translated into corresponding natural language through the presentation generating system to help the doctor further diagnose diseases, which achieves simpler and easier reading and analysis of medical images, improving reading efficiency while improving reading quality and drastically reducing the probability of misdiagnoses (para 0029). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc with the teachings of Wang to translate 2D medical images into corresponding natural language through the presentation generating system to provide the predictable result of improving reading quality and drastically reducing the probability of misdiagnoses.

Jarc as modified by Wang renders the limitations above obvious but fails to explicitly disclose wherein the first images include a sequence of video frames, and wherein adjacent frames included in the sequence of video frames are temporally segmented from one another by one second or longer.

Avendi teaches a similar system and method for navigating anatomical objects in medical images, in which a deep learning network may include one or more deep convolutional neural networks (CNNs), one or more recurrent neural networks, or any other suitable neural network configurations (para 0062). The recurrent convolutional neural network 90 can process real-time images 46 over a span of time 88 and can identify the major landmarks present in identified scenes from the images 46, which can be output to the navigation system 94 in the forms of text, speech, numbers, etc. Using a recurrent convolutional neural network 90 can ensure that a history of previously processed frames (such as the data set of images 84) is stored so that the temporal correlation of videos/images can be extracted for more accurate detection and tracking (para 0060; it is noted that the “video” is understood to include frames that are segmented or separated by more than one second). This allows the system and method to automatically detect and identify the scenes from the anatomical region …; automatically map each of the plurality of real-time two-dimensional images …via the deep learning network; and provide directions to the user (para 0006). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc as modified by Wang with the teachings of Avendi to provide real-time monitoring by correlating videos/images to provide the predictable result of automatically detecting and identifying the scenes from the anatomical region …; automatically mapping each of the plurality of real-time two-dimensional images …via the deep learning network; and providing directions to the user.

Regarding claim 12, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc further comprising estimating a remaining duration of the surgical procedure using the machine learning architecture, in response to identifying the surgical step (para 0116, 0138). 

Regarding claim 15, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc further comprising using the controller to annotate the one of the first images to form the annotated image by at least one of highlighting a piece of anatomy, highlighting the location of a surgical step, or highlighting where a surgical instrument should be placed (para 0065-0066 “highlights or annotates certain patient anatomy shown in the displayed video using an input device of surgeon's console 52”).   

Regarding claim 16, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc discloses wherein the preoperative image includes a magnetic resonance image, computerized tomography scan, or an X-ray (para 0108).    

Regarding claim 18, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc discloses wherein the recurrent neural network comprises a long short-term memory (LSTM) network (para 0101).

Regarding claim 19, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc discloses wherein the first images are captured with an image sensor disposed in an endoscope and the endoscope is coupled to the controller (Fig. 5A, para 0042 “endoscope that includes a camera to view a surgical site within a patient's body”).  

Regarding claim 20, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc further comprising: capturing voice commands with a microphone coupled to the controller; and in response to capturing the voice commands, displaying the preoperative image (Para 0065 “the graphic user interface can include a QWERTY keyboard, a pointing device such as a mouse and an interactive screen display, a touch-screen display, or other means for data or text entry or voice annotation/or speech to text conversion via a microphone and processor.”).   

Regarding claim 21, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc further comprising, in response to determining the surgical step, starting a timer (para 0116 “a time duration of various surgical procedure segments may be assessed”; please note that the claim only requires a timer to start in order to track the duration of the step/procedure. The examiner understands that assessing the “time duration of various surgical procedure segments” reads on the claimed limitation as currently recited).

Regarding claim 23, modified Jarc renders the method of claim 12 obvious as recited hereinabove, Jarc further comprising: automatically informing, by the controller, an operating room scheduler when an estimation of the remaining duration by the machine learning architecture changes (para 0138 “technique advisor 1614 includes a prediction function that predicts an upcoming surgical stage based on duration of steps, order of steps, performance level of steps—all based on current segment and surgeon performance, as well as optionally on historical data corresponding to that surgeon. This information can be used to forecast future events, such as the end of the procedure for optimal OR turnover, an upcoming instrument exchange”). 

Regarding claim 26, modified Jarc renders the method of claim 11 obvious as recited hereinabove, Jarc further discloses: removing, from consideration of the machine learning model, one or more images included in the first images of the surgical procedure captured by the image sensor to reduce required processing power (para 0101; it is understood that not all images captured would need to be processed to determine the surgical step).

Regarding claim 28, Jarc discloses at least one machine-accessible storage medium that provides instructions that, when executed by a machine, will cause the machine to perform operations (para 0043, 0076, 0083 “Engines may be hardware, software, or firmware communicatively coupled to one or more processors in order to carry out the operations described herein”), comprising: receiving first images of a surgical procedure (para 0047 “capture images of a surgical site and output the captured images to a computer processor”); identifying, in the first images, a surgical step in the surgical procedure using a machine learning model (para 0008 “determine a current stage”, para 0048 “image processing”, para 0087 “determine the stage or segment of a surgical procedure”, para 0110 “a parallel-running simulation may be autonomously analyzed to determine the current surgical stage using segmenter 1004,” and para 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein an architecture of the machine learning model includes: a convolutional neural network (CNN) to identify surgical steps, including the surgical step, in the first images and output feature vectors indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images (para 0101 “…feature extractor 1202”; it is noted that “indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images” is considered to be the same as the “output feature vector”. The claim has not positively recited providing/displaying the “probability”; rather, the claim only requires the feature vectors to be indicative of the probability); a recurrent neural network (RNN) linked to receive the feature vectors as an input (para 0101 “feature extractor 1202 applies filtering or other selection criteria to a sequence of assessed tasks and their corresponding parameters to create a feature vector as the input to RNN 1204.”), and output refined feature vectors with refined probabilities for identifying the surgical steps (para 0180 “includes a confidence measurement engine to determine a confidence score representing a probability of correct segmentation determination.”); and outputting second images related to the surgical step for display in response to identifying the surgical step (para 0072, 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein the second images include at least one of a diagram of human anatomy relevant to the surgical step, a preoperative image relevant to the surgical step, an intraoperative image relevant to the surgical step, or an annotated image of one of the first images (para 0140 “loading of pre-op images for the current/upcoming step from current patient or related patients”, fig. 5C).

Jarc discloses using recurrent neural networks (RNN) and convolutional neural network (CNN), but fails to explicitly disclose a recurrent neural network (RNN) linked to the CNN to receive the feature vectors from the CNN as an input; wherein the RNN is trained to identify temporal patterns in the feature vectors received from the CNN across multiple ones of the first images and output refined feature vectors with refined probabilities.

Wang teaches a similar system and method for presentation generation for medical images using deep learning technology in combination with convolutional neural network technology and recurrent neural network technology (para 0029). The system and method include feeding captured medical images (para 0023) into a convolutional neural network unit, which is configured to extract image features of the medical images, transform them into image feature vectors, and output them to a first vector space (para 0026). A recurrent neural network unit 103 is configured to determine and output semantic feature vectors corresponding to the image feature vectors according to the correspondence between image feature vectors contained in the pre-established first vector space and the matching semantic feature vectors contained in the second vector space (para 0027). This allows 2D medical images to be translated into corresponding natural language through the presentation generating system to help the doctor further diagnose diseases, which achieves simpler and easier reading and analysis of medical images, improving reading efficiency while improving reading quality and drastically reducing the probability of misdiagnoses (para 0029). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc with the teachings of Wang to translate 2D medical images into corresponding natural language through the presentation generating system to provide the predictable result of improving reading quality and drastically reducing the probability of misdiagnoses.

Jarc as modified by Wang renders the limitations above obvious but fails to explicitly disclose wherein the first images include a sequence of video frames, and wherein adjacent frames included in the sequence of video frames are temporally segmented from one another by one second or longer.

Avendi teaches a similar system and method for navigating anatomical objects in medical images, in which a deep learning network may include one or more deep convolutional neural networks (CNNs), one or more recurrent neural networks, or any other suitable neural network configurations (para 0062). The recurrent convolutional neural network 90 can process real-time images 46 over a span of time 88 and can identify the major landmarks present in identified scenes from the images 46, which can be output to the navigation system 94 in the forms of text, speech, numbers, etc. Using a recurrent convolutional neural network 90 can ensure that a history of previously processed frames (such as the data set of images 84) is stored so that the temporal correlation of videos/images can be extracted for more accurate detection and tracking (para 0060; it is noted that the “video” is understood to include frames that are segmented or separated by more than one second). This allows the system and method to automatically detect and identify the scenes from the anatomical region …; automatically map each of the plurality of real-time two-dimensional images …via the deep learning network; and provide directions to the user (para 0006). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Jarc as modified by Wang with the teachings of Avendi to provide real-time monitoring by correlating videos/images to provide the predictable result of automatically detecting and identifying the scenes from the anatomical region …; automatically mapping each of the plurality of real-time two-dimensional images …via the deep learning network; and providing directions to the user.

Claims 4 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over modified Jarc as applied to claims 1-3, 5-6, 9-12, 15-16, 18-21, 23, 26, and 28 above, and further in view of US Patent Number 9788907B1 issued to Alvi et al. (hereinafter “Alvi”).
Regarding claim 4, modified Jarc renders the system of claim 3 obvious as recited hereinabove, but fails to disclose wherein the controller further includes logic that when executed by the controller causes the system to perform operations, including: in response to identifying the surgical step, outputting audio commands to a user of the system from the speaker.  Alvi teaches wherein the controller further includes logic that when executed by the controller causes the system to perform operations, including: in response to identifying the surgical step, outputting audio commands to a user of the system from the speaker (Col. 29, lines 20-26 “the electronic data may include an audio signal (e.g., of spoken words) that is to be presented during a visual display of the part of the video data”). It would have been obvious to one of ordinary skill in the art at the time to modify the disclosure of modified Jarc with the teachings of Alvi to include audio commands to a user to provide the predictable result of allowing the user to follow instructions by hearing them, without having to read or detect marks on the screen.

Regarding claim 13, modified Jarc renders the system of claim 11 obvious as recited hereinabove, but fails to disclose further comprising outputting audio commands from a speaker coupled to the controller, in response to determining the surgical step.  Alvi teaches outputting audio commands from a speaker coupled to the controller, in response to determining the surgical step (Col. 29, lines 20-26 “the electronic data may include an audio signal (e.g., of spoken words) that is to be presented during a visual display of the part of the video data”). It would have been obvious to one of ordinary skill in the art at the time to modify the disclosure of modified Jarc with the teachings of Alvi to include audio commands to a user to provide the predictable result of allowing the user to follow instructions by hearing them, without having to read or detect marks on the screen.

Regarding claim 14, modified Jarc renders the system of claim 13 obvious as recited hereinabove. Jarc discloses outputting the duration of the surgical procedure (para 0116) but fails to disclose outputting the duration of the surgical procedure from the speaker.
Alvi teaches outputting the duration of the surgical procedure from the speaker (Col. 29, lines 20-26 “the electronic data may include an audio signal (e.g., of spoken words) that is to be presented during a visual display of the part of the video data”). It would have been obvious to one of ordinary skill in the art at the time to modify the disclosure of modified Jarc with the teachings of Alvi to include audio commands to a user to provide the predictable result of allowing the user to follow instructions by hearing them, without having to read or detect marks on the screen.


Claim 27 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication Number 20190090969A1 granted to Jarc et al. (hereinafter “Jarc” – Previously published as WO2017083768A1. A copy of this reference is included in the IDS filed by the Applicant on 08/13/2019 – previously presented) in view of US Pat Pub No. 20190122406 granted to Wang (hereinafter “Wang”) in yet further view of US Pat Pub No. 20150057646 granted to Aljuri et al. (hereinafter “Aljuri”).
Regarding claim 27, Jarc discloses at least one machine-accessible storage medium that provides instructions that, when executed by a machine, will cause the machine to perform operations (para 0043, 0076, 0083 “Engines may be hardware, software, or firmware communicatively coupled to one or more processors in order to carry out the operations described herein”), comprising: receiving first images of a surgical procedure; identifying, in the first images (para 0046 and 0049 “image capture device”, para 0047 “capture images of a surgical site and output the captured images to a computer processor”), a surgical step in the surgical procedure using a machine learning model (para 0008 “determine a current stage”, para 0048 “image processing”, para 0087 “determine the stage or segment of a surgical procedure”, para 0110 “a parallel-running simulation may be autonomously analyzed to determine the current surgical stage using segmenter 1004,” and para 0137 “Technique advisor 1614 may recommend a particular surgical technique to be used in a given scenario, in response to the current or upcoming surgical stage”), wherein an architecture of the machine learning model includes:  in the first images and output feature vectors indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images (para 0101 “…feature extractor 1202”; it is noted that “indicating a probability that a given surgical step included in the surgical steps is occurring in a given one of the first images” is considered to be the same as the “output feature vector”. The claim has not positively recited providing/displaying the “probability”; rather, the claim only requires the feature vectors to be indicative of the probability); a recurrent neural network (RNN) linked to receive the feature vectors as an input (para 0101 “feature extractor 1202 applies filtering or other selection criteria to a sequence of assessed tasks and their corresponding parameters to create a feature vector as the input to RNN 1204.”), output refined feature vectors with refined probabilities for identifying the surgical steps (para 0180 “includes a confidence measurement engine to determine a confidence score representing a probability of correct segmentation determination.”), provide a real time update of when a completion of the surgical procedure is expected to occur during the surgical procedure (it is noted that the claim, as currently recited, only recites providing a real time update of when the procedure is expected to be completed. The claim has not required any details or requirements regarding what it considers to be the “real-time update”. Under its broadest reasonable interpretation, any indication or guidance provided by the device regarding the progress of the surgery would read on the claimed limitation as currently recited. Here, Jarc teaches providing real-time segmentation which is applied intraoperatively to generate a time duration of various surgical procedure segments.).

Jarc discloses using recurrent neural networks (RNN) and convolutional neural network (CNN), but fails to explicitly disclose a recurrent neural network (RNN) linked to the CNN to receive the feature vectors from the CNN as an input; wherein the RNN is trained to identify temporal patterns in the feature vectors received from the CNN across multiple ones of the first images and output refined feature vectors with refined probabilities.
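
For illustration only, the recited arrangement (per-image feature vectors of step probabilities passed to a recurrent stage that refines them across multiple images) can be pictured with the toy Python sketch below. It is not taken from Jarc, Wang, or the instant application. The per-frame probability vectors are hypothetical values standing in for CNN outputs, and the fixed exponential-smoothing update in the hypothetical function `refine` stands in for a trained RNN; no learning takes place in this sketch.

```python
def refine(per_frame_probs, carry=0.5):
    """Blend each frame's probabilities with a running state.

    per_frame_probs: one probability vector per frame (stand-in for CNN output).
    carry: weight given to the state carried over from earlier frames.
    """
    state = None
    refined = []
    for probs in per_frame_probs:
        if state is None:
            state = list(probs)
        else:
            # Recurrent update: the new state depends on the previous state.
            state = [carry * s + (1 - carry) * p for s, p in zip(state, probs)]
        total = sum(state)
        refined.append([s / total for s in state])  # renormalize to sum to 1
    return refined


# Hypothetical per-frame probabilities for two surgical steps (step 0, step 1).
# The middle frame is an isolated outlier favoring step 1.
frames = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]]

refined = refine(frames)
# Most probable step per frame after refinement.
steps = [max(range(len(r)), key=r.__getitem__) for r in refined]
```

Taken frame by frame, the raw vectors would label the middle frame as step 1; after the recurrent update, all three frames are labeled step 0, which shows how temporal context across multiple images can change the per-image probabilities.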

Wang teaches a similar system and method for presentation generation for medical images using deep learning technology in combination with convolutional neural network technology and recurrent neural network technology (para 0029). The system and method include feeding captured medical images (para 0023) into a convolutional neural network unit which is configured to extract image features of the medical images, transform them into image feature vectors, and output them to a first vector space (para 0026). A recurrent neural network unit 103 is configured to determine and output semantic feature vectors corresponding to the image feature vectors according to the correspondence between image feature vectors contained in the pre-established first vector space and the matching semantic feature vectors contained in the second vector space (para 0027). This allows 2D medical images to be translated into corresponding natural language through the presentation generating system to help the doctor further diagnose diseases, which achieves simpler and easier reading and analysis of medical images, improving reading efficiency while improving reading quality and drastically reducing the probability of misdiagnoses (para 0029). It would have been obvious to one of ordinary skill in the art at the time to modify the disclosure of Jarc with the teachings of Wang to translate 2D medical images into corresponding natural language through the presentation generating system to provide the predictable result of improving reading quality and drastically reducing the probability of misdiagnoses.
Jarc as modified by Wang renders the limitations above obvious as recited hereinabove but fails to explicitly disclose estimating the remaining duration of the surgical procedure. Aljuri shows that it is known to provide the time remaining in the treatment on a patient interface output after determining the appropriate treatment and treatment profile. This would allow the user to determine the time of treatment and the time remaining in the treatment. Therefore, it would have been obvious to one of ordinary skill in the art at the time to modify the disclosure of Jarc and Wang to display the time remaining in the treatment to provide the predictable result of allowing the user to determine the time of treatment and the time remaining in the treatment.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US Pat No. 9767557 issued to Gulsun et al.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SANA SAHAND whose telephone number is (571)272-6842. The examiner can normally be reached M-Th 8:30 am -5:30 pm; F 9 am-3 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer McDonald, can be reached at (571) 270-3061. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SANA SAHAND/Examiner, Art Unit 3792