Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim(s) 16, 17, and 21 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 16 recites “the surgical video database,” which lacks antecedent basis.  Examiner interprets the claim as though Applicant intended to recite “[[the]] a surgical video database” instead.
Claim 17 depends upon claim 16 and thereby inherits the same indefiniteness deficiencies.
Claim 21 recites “a surgical video database,” yet those of ordinary skill in the art would not know whether this is the same surgical video database previously recited in claim 20 or a different one.  Examiner interprets the claim as though Applicant intended to recite “[[a]] the surgical video database” instead.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 2, 9, and 10 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Zisimopoulos et al., US 2018/0357514 A1 (hereinafter referred to as “Zisimopoulos”).

Regarding claim 1, Zisimopoulos discloses a method comprising (see Zisimopoulos Figs. 1 and 15, and paras. 0004, 0005, and 0077, where the method is performed by processors and computer readable storage media): generating, with a surgical simulator, simulated surgical videos each representative of a simulation of a surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where a surgical simulator is used to generate simulated video streams of a cataract surgery); associating simulated ground truth data from the simulation with the simulated surgical videos, wherein the simulated ground truth data corresponds to context information of at least one of a simulated surgical instrument, a simulated anatomical region, a simulated surgical task, or a simulated action (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where the simulated video streams are translated into corresponding semantic instrument segmentations); and annotating, with the surgical simulator, features of the simulated surgical videos based, at least in part, on the simulated ground truth data for training a machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where pixels are the features of the simulated video streams that are annotated as either belonging to a surgical instrument or as belonging to the background).

Regarding claim 2, Zisimopoulos discloses further comprising: pre-training the machine learning model with the features annotated from the simulated surgical videos, wherein the pre-training configures the machine learning model to probabilistically identify the features from unlabeled simulated videos corresponding to the surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0074-0076, 0078, and 0079, where training is performed on the simulated videos “using Stochastic Gradient Descent” – a probabilistic algorithm – before validation and testing using additional simulated videos, thereby producing the following probabilistic statistics: “pixel accuracy, mean class accuracy, mean Intersection over Union (mean IU) and frequency weighted IU (fwIU)”).
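
For reference only, the four segmentation statistics quoted above can be sketched as follows. This is a minimal illustration that assumes the definitions commonly used for semantic segmentation (per-class counts taken from a confusion matrix); it is not code from Zisimopoulos, and the function names and the two-class (background/instrument) toy masks are hypothetical.

```python
import numpy as np

def confusion_matrix(truth, pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    truth = np.asarray(truth).ravel()
    pred = np.asarray(pred).ravel()
    index = truth * num_classes + pred
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_statistics(truth, pred, num_classes=2):
    """Return pixel accuracy, mean class accuracy, mean IU, and frequency weighted IU."""
    cm = confusion_matrix(truth, pred, num_classes).astype(float)
    true_pos = np.diag(cm)              # correctly labeled pixels per class
    truth_total = cm.sum(axis=1)        # ground-truth pixels per class
    pred_total = cm.sum(axis=0)         # predicted pixels per class
    union = truth_total + pred_total - true_pos
    present = truth_total > 0           # ignore classes absent from the ground truth
    iou = true_pos[present] / union[present]
    return {
        "pixel_accuracy": true_pos.sum() / cm.sum(),
        "mean_class_accuracy": (true_pos[present] / truth_total[present]).mean(),
        "mean_iu": iou.mean(),
        "fw_iu": (truth_total[present] * iou).sum() / cm.sum(),
    }

# Hypothetical 2x4 masks: 0 = background, 1 = surgical instrument.
truth_mask = [[0, 0, 1, 1], [0, 0, 1, 1]]
pred_mask = [[0, 0, 1, 0], [0, 0, 1, 1]]
stats = segmentation_statistics(truth_mask, pred_mask)
```

In this toy case one instrument pixel is mislabeled as background, so the pixel accuracy is 7/8 and the instrument-class IU is 3/4.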

Regarding claim 9, Zisimopoulos discloses wherein the features correspond to at least one of segments of a surgical instrument, segments of an anatomical region, motion of the surgical instrument, surgical steps of a surgical procedure, surgical technique utilization, surgical event occurrence, or separation distance between the surgical instrument and the anatomical region (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0074, where pixels are the features of the simulated video streams that are annotated as either belonging to a segment of a surgical instrument or as belonging to a segment of the background).

Regarding claim 10, Zisimopoulos discloses wherein the context information for each of the simulated surgical videos includes at least one of three-dimensional spatial boundaries of the simulated surgical instrument, three-dimensional spatial boundaries of the simulated anatomical region, motion of the simulated surgical instrument, separation distance between the simulated surgical instrument and the simulated anatomical region, orientation of the simulated surgical instrument, temporal boundaries of one or more surgical steps of the simulated surgical task from the surgical scenario, temporal boundaries of a simulated surgical complication, or spatial boundaries of the simulated surgical complication (see Zisimopoulos paras. 0026, 0027, 0036, and 0048, where objects are modeled in three-dimensions, and the simulated video may include “depth” and/or “motion” data).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim(s) 3-5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos as applied to claim 2 above, and in further view of Black et al., US 10,679,046 B1 (hereinafter referred to as “Black”).

Regarding claim 3, Zisimopoulos discloses further comprising: receiving, from a surgical video data store, real surgical videos representative of the surgical scenario in a real environment, wherein the training configures the machine learning model to probabilistically identify the features from unlabeled real videos corresponding to the surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store).
It would have been obvious to one of ordinary skill in the art at the time of filing to use the databases of Black to store the images and/or videos of Zisimopoulos, because it is predictable that improving the organization of the images and/or video with a database would also improve the ease by which users can search for and find the images and/or videos via database commands.
Furthermore, Black discloses wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).  TABLE 1 further shows that Black’s training refinement also outperforms synthetic-only training.  Accordingly, it is predictable that Black’s training refinement would improve the accuracy of Zisimopoulos’s segmentations.
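
For illustration only, the pre-train-then-fine-tune ordering relied upon above can be sketched with a toy model. The sketch assumes a logistic-regression classifier trained by stochastic gradient descent in place of Black’s CNN 115 or Zisimopoulos’s segmentation network, and the data generator, function names, and hyperparameters are hypothetical; it shows only that the second training stage starts from the weights produced by the first.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_train(weights, features, labels, lr=0.1, epochs=20, seed=0):
    """Logistic-regression SGD; returns an updated copy and leaves `weights` untouched."""
    rng = np.random.default_rng(seed)
    w = weights.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(labels)):
            pred = sigmoid(features[i] @ w)
            w -= lr * (pred - labels[i]) * features[i]
    return w

def make_data(n, shift, seed):
    """Hypothetical toy data: the label is 1 when x0 + x1 exceeds `shift`."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    y = (x[:, 0] + x[:, 1] > shift).astype(float)
    return np.hstack([x, np.ones((n, 1))]), y  # append a bias column

# Plentiful stand-in for simulated data, scarce stand-in for real data.
synthetic_x, synthetic_y = make_data(500, shift=0.0, seed=1)
real_x, real_y = make_data(40, shift=0.5, seed=2)

# Stage 1: pre-train from scratch on the synthetic set.
pretrained = sgd_train(np.zeros(3), synthetic_x, synthetic_y)
pretrained_snapshot = pretrained.copy()

# Stage 2: fine-tune on the real set, starting from the pre-trained weights.
fine_tuned = sgd_train(pretrained, real_x, real_y, lr=0.05, epochs=10)
```

The smaller learning rate and fewer epochs in the second stage are illustrative choices, not values taken from either reference.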

Regarding claim 4, Zisimopoulos discloses further comprising: receiving, from the surgical video data store, the unlabeled real videos representative of the surgical scenario in the real environment; identifying the features of the unlabeled real videos with the machine learning model; and annotating the unlabeled real videos by labeling the features of the unlabeled real videos identified by the machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Furthermore, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store); and the machine learning model trained with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).

Regarding claim 5, Zisimopoulos discloses further comprising: displaying annotated surgical videos corresponding to the unlabeled real videos with the features annotated by the machine learning model superimposed on a corresponding one of the unlabeled real videos (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations with only the segmented portion displayed in the image).

Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos as applied to claim 2 above, and in further view of Shrivastava, Ashish, et al. "Learning from Simulated and Unsupervised Images through Adversarial Training." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017 (hereinafter referred to as “Shrivastava”).

Regarding claim 6, Zisimopoulos discloses wherein the unlabeled real images are representative of the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
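Zisimopoulos does not explicitly disclose further comprising: providing, to a refiner neural network, simulated images from the simulated surgical videos before the pre-training; and refining the simulated surgical videos with the refiner neural network, wherein the refiner neural network adjusts the simulated images until a discriminator neural network determines the simulated images are comparable to unlabeled real images within a first threshold, wherein the features annotated from the simulated surgical videos are included after the refining.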
However, Shrivastava discloses further comprising: providing, to a refiner neural network, simulated images from the simulated surgical videos before the pre-training; and refining the simulated surgical videos with the refiner neural network, wherein the refiner neural network adjusts the simulated images until a discriminator neural network determines the simulated images are comparable to unlabeled real images within a first threshold, wherein the features annotated from the simulated surgical videos are included after the refining (see Shrivastava Figs. 1-6, and Algorithm 1, and pgs. 2244-2247 “2. S+U Learning with SimGAN,” where simulated images are refined using a refiner network until a discriminator network, applied to both refined simulated images and real images, passes a threshold “Visual Turing Test”).
It would have been obvious to one of ordinary skill in the art at the time of filing to use Shrivastava’s GAN to refine the simulated videos of Zisimopoulos, because it is predictable that doing so would improve the quality of the simulated video by making the simulated video more realistic, and Shrivastava states “[w]e show a significant improvement over using synthetic images . . .” (see Shrivastava Abstract). 

Claim(s) 7 and 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Shrivastava as applied to claim 6 above, and in further view of Black.

Regarding claim 7, Zisimopoulos discloses further comprising: receiving, from a surgical video data store, real surgical videos representative of the surgical scenario in a real environment, wherein the training configures the machine learning model to probabilistically identify the features from unlabeled real videos corresponding to the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store).
It would have been obvious to one of ordinary skill in the art at the time of filing to use the databases of Black to store the images and/or videos of Zisimopoulos, because it is predictable that improving the organization of the images and/or video with a database would also improve the ease by which users can search for and find the images and/or videos via database commands.
Furthermore, Black discloses wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos – as modified by Shrivastava – in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).  TABLE 1 further shows that Black’s training refinement also outperforms synthetic-only training.  Accordingly, it is predictable that Black’s training refinement would improve the accuracy of Zisimopoulos’s segmentations.

Regarding claim 8, Zisimopoulos discloses further comprising: receiving the unlabeled real videos from the surgical data store; identifying the features of the unlabeled real videos with the machine learning model; and annotating the unlabeled surgical videos with the machine learning model by labeling the features of the unlabeled real videos identified by the machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Furthermore, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store) and the machine learning model trained with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).

Claim(s) 11, 12, 18, and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Gillan, Stewart N., and George M. Saleh. “Ophthalmic surgical simulation: a new era.” JAMA ophthalmology 131.12 (2013): 1623-1624 (hereinafter referred to as “Gillan”).

Regarding claim 11, Zisimopoulos discloses a surgical simulator for simulating a surgical scenario, the surgical simulator comprising (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where a surgical simulator is used to generate simulated video streams of a cataract surgery): a controller including one or more processors coupled to memory, the display system, and the user interface, wherein the memory stores instructions that when executed by the one or more processors cause the surgical simulator to perform operations including (see Zisimopoulos Figs. 1 and 15, and paras. 0004, 0005, and 0077, where the method is performed by processors and computer readable storage media): generating the simulated surgical videos, each representative of a simulation of the surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where a surgical simulator is used to generate simulated video streams of a cataract surgery); associating simulated ground truth data from the simulation with the simulated surgical videos, wherein the simulated ground truth data corresponds to context information of at least one of a simulated surgical instrument, a simulated anatomical region, a simulated surgical task, or the simulated action (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where the simulated video streams are translated into corresponding semantic instrument segmentations); and annotating features of the simulated surgical videos based, at least in part, on the simulated ground truth data for training a machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where pixels are the features of the simulated video streams that are annotated as either belonging to a surgical instrument or as belonging to the background).
Zisimopoulos does not explicitly disclose a display system adapted to show simulated surgical videos to a user of the surgical simulator; and a user interface adapted to correlate a physical action of the user with a simulated action of the surgical simulator.
However, Gillan discloses a display system adapted to show simulated surgical videos to a user of the surgical simulator; and a user interface adapted to correlate a physical action of the user with a simulated action of the surgical simulator (see Gillan pgs. 1623-1624, where “This comprises a mannequin head with a virtual eye, an operating microscope through which the VR surgery is seen, and a touch-screen monitor on which a supervisor can watch the surgeon perform, all connected to a customized personal computer. As in a real-life operating situation, there are 2 foot pedals: one to control the microscope and the other to control the phacoemulsification/vitrectomy/infusion and aspiration modes. Instruments contain colored heads from which optical tracking systems convert movements to electrical signals and are relayed to the simulator after being inserted to the artificial eye”).
It would have been obvious to one of ordinary skill in the art at the time of filing to simply substitute the surgery simulator of Zisimopoulos with the surgery simulator of Gillan, because both surgery simulators simulate the human eye – therefore, it is predictable that both (see Gillan pg. 1623), and surgeons would predictably desire the added convenience of purchasing a commercially available surgery simulator.

Regarding claim 12, Zisimopoulos discloses wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations comprising: pre-training the machine learning model with the features annotated from the simulated surgical videos, wherein the pre-training configures the machine learning model to probabilistically identify the features from unlabeled simulated videos corresponding to the surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0074-0076, 0078, and 0079, where training is performed on the simulated videos “using Stochastic Gradient Descent” – a probabilistic algorithm – before validation and testing using additional simulated videos, thereby producing the following probabilistic statistics: “pixel accuracy, mean class accuracy, mean Intersection over Union (mean IU) and frequency weighted IU (fwIU)”).

Regarding claim 18, Zisimopoulos discloses wherein the features correspond to at least one of segments of a surgical instrument, segments of an anatomical region, motion of the surgical instrument, surgical steps of a surgical procedure, surgical technique utilization, surgical event occurrence, or separation distance between the surgical instrument and the anatomical region (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0074, where pixels are the features of the simulated video streams that are annotated as either belonging to a segment of a surgical instrument or as belonging to a segment of the background).

Regarding claim 19, Zisimopoulos discloses wherein the context information for each of the simulated surgical videos includes at least one of three-dimensional spatial boundaries of the simulated surgical instrument, three-dimensional spatial boundaries of the simulated anatomical region, motion of the simulated surgical instrument, separation distance between the simulated surgical instrument and the simulated anatomical region, orientation of the simulated surgical instrument, temporal boundaries of one or more surgical steps of the simulated surgical task from the surgical scenario, temporal boundaries of a simulated surgical complication, or spatial boundaries of the simulated surgical complication (see Zisimopoulos paras. 0026, 0027, 0036, and 0048, where objects are modeled in three-dimensions, and the simulated video may include “depth” and/or “motion” data).

Claim(s) 13 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Gillan as applied to claim 12 above, and in further view of Black.

Regarding claim 13, Zisimopoulos discloses further comprising: a surgical video data store coupled to the controller, wherein the surgical data store includes real surgical videos representative of the surgical scenario in a real environment, and wherein the controller includes additional instructions that when executed by the one or more processors cause the (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store).
It would have been obvious to one of ordinary skill in the art at the time of filing to use the databases of Black to store the images and/or videos of Zisimopoulos, because it is predictable that improving the organization of the images and/or video with a database would also improve the ease by which users can search for and find the images and/or videos via database commands.
Furthermore, Black discloses wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
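It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).  TABLE 1 further shows that Black’s training refinement also outperforms synthetic-only training.  Accordingly, it is predictable that Black’s training refinement would improve the accuracy of Zisimopoulos’s segmentations.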

Regarding claim 14, Zisimopoulos discloses wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations including: receiving, from the surgical video data store, the unlabeled real videos each representative of the surgical scenario in the real environment; identifying the features of the unlabeled real videos with the machine learning model; and annotating the unlabeled real videos with the machine learning model by labeling the features of the unlabeled real videos identified by the machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Furthermore, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store); and the machine learning model trained with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).

Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Gillan as applied to claim 11 above, and in further view of Shrivastava.

Regarding claim 15, Zisimopoulos does not explicitly disclose wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations including: providing, to a refiner neural network, simulated images from the simulated surgical videos; and refining, via the refiner neural network, the simulated images to form refined surgical videos based, at least in part, on the simulated surgical videos, wherein the features annotated from the simulated surgical videos are included in the refined surgical videos, and wherein the refiner network is trained based on an adversarial loss from a discriminator neural network that compares the simulated images to unlabeled real images representative of the surgical scenario in the real environment.
However, Shrivastava discloses wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations including: providing, to a refiner neural network, simulated images from the simulated surgical videos; and refining, via the refiner neural network, the simulated images to form refined surgical videos based, at least in part, on the simulated surgical videos, wherein the features annotated from the simulated surgical videos are included in the refined surgical videos, and wherein the refiner network is trained based on an adversarial loss from a discriminator neural network that compares the simulated images to unlabeled real images representative of the surgical scenario in the real environment (see Shrivastava Figs. 1-6, and Algorithm 1, and pgs. 2244-2247 “2. S+U Learning with SimGAN,” where simulated images are refined using a refiner network until a discriminator network, applied to both refined simulated images and real images, minimizes the loss of Equation (2) and passes a threshold “Visual Turing Test”).
It would have been obvious to one of ordinary skill in the art at the time of filing to use Shrivastava’s GAN to refine the simulated videos of Zisimopoulos, because it is predictable that doing so would improve the quality of the simulated video by making the simulated video more realistic, and Shrivastava states “[w]e show a significant improvement over using synthetic images . . .” (see Shrivastava Abstract). 

Claim(s) 16 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Gillan and Shrivastava as applied to claim 15 above, and in further view of Black.

Regarding claim 16, Zisimopoulos discloses wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations including: pre-training the machine learning model with the features annotated from the refined surgical videos, wherein the pre-training configures the machine learning model to probabilistically identify the features from unlabeled simulated videos corresponding to the surgical scenario (see Zisimopoulos Figs. 4-6, and paras. 0074-0076, 0078, and 0079, where training is performed on the simulated videos “using Stochastic Gradient Descent” – a probabilistic algorithm – before validation and testing using additional simulated videos, thereby producing the following probabilistic statistics: “pixel accuracy, mean class accuracy, mean Intersection over Union (mean IU) and frequency weighted IU (fwIU)”); receiving, from the surgical video data store, real surgical videos representative of the surgical scenario in the real environment, wherein the training configures the machine learning model to probabilistically identify the features from unlabeled real videos corresponding to the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store).
It would have been obvious to one of ordinary skill in the art at the time of filing to use the databases of Black to store the images and/or videos of Zisimopoulos, because it is predictable that improving the organization of the images and/or video with a database would also improve the ease by which users can search for and find the images and/or videos via database commands.
Furthermore, Black discloses wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
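It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).  TABLE 1 further shows that Black’s training refinement also outperforms synthetic-only training.  Accordingly, it is predictable that Black’s training refinement would improve the accuracy of Zisimopoulos’s segmentations.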

Regarding claim 17, Zisimopoulos discloses wherein the controller includes additional instructions that when executed by the one or more processors cause the surgical simulator to perform further operations including: receiving, from the surgical video data store, the unlabeled real videos; identifying the features of the unlabeled real videos with the machine learning model; and annotating the unlabeled surgical videos with the machine learning model by labeling the features of the unlabeled real videos identified by the machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Furthermore, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store); and the machine learning model trained with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).

Claim(s) 20, 21, and 23 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Black.

Regarding claim 20, Zisimopoulos discloses a non-transitory computer-readable storage medium having stored thereon instructions which, when executed by one or more processing units, cause the one or more processing units to perform operations comprising (see Zisimopoulos Figs. 1 and 15, and paras. 0004, 0005, and 0077, where the method is performed by processors and computer readable storage media): obtaining simulated surgical videos from a surgical video data store, wherein each of the simulated surgical videos is representative of a simulation of a surgical scenario generated by a surgical simulator (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, 0036, 0037, 0048, and 0072-0075, where a surgical simulator is used to generate simulated video streams of a cataract surgery, and data is stored in a data store), wherein each of the simulated surgical videos includes simulated ground truth data corresponding to context information of at least one of a simulated surgical instrument, a simulated anatomical region, or a simulated surgical task (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where the simulated video streams are translated into corresponding semantic instrument segmentations); annotating features of the simulated surgical videos based, at least in part, on the simulated ground truth data for training a machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0026, 0027, and 0072-0075, where pixels are the features of the simulated video streams that are annotated as either belonging to a surgical instrument or as belonging to the background); pre-training the machine learning model (see Zisimopoulos Figs. 4-6, and paras. 0074-0076, 0078, and 0079, where training is performed on the simulated videos “using Stochastic Gradient Descent” – a probabilistic algorithm – before validation and testing using additional simulated videos, thereby producing the following probabilistic statistics: “pixel accuracy, mean class accuracy, mean Intersection over Union (mean IU) and frequency weighted IU (fwIU)”); and saving the machine learning model and associated parameters configured by the pre-training to a data store included in the surgical simulator (see Zisimopoulos Figs. 1 and 15, and paras. 0004, 0005, 0036, 0037, 0048, and 0077, where data is stored to a data store).
Zisimopoulos does not explicitly disclose a database.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store).
It would have been obvious to one of ordinary skill in the art at the time of filing to use the databases of Black to store the images and/or videos of Zisimopoulos, because it is predictable that improving the organization of the images and/or video with a database would also improve the ease by which users can search for and find the images and/or videos via database commands.
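For illustration only, the four statistics quoted from Zisimopoulos in the rejection of claim 20 can be computed from a confusion matrix as in the following minimal sketch. The sketch assumes the conventional definitions of these metrics from the semantic segmentation literature; the function name segmentation_metrics, the flat label lists, and the toy data are hypothetical and are not taken from Zisimopoulos.

```python
def segmentation_metrics(truth, pred, num_classes):
    """Compute pixel accuracy, mean class accuracy, mean IU, and fwIU.

    truth and pred are equal-length flat lists of integer class labels,
    one per pixel (hypothetical input format chosen for this sketch).
    """
    if not truth or len(truth) != len(pred):
        raise ValueError("truth and pred must be non-empty and equal length")

    # conf[i][j] counts pixels whose true class is i and predicted class is j.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        conf[t][p] += 1

    total = len(truth)
    true_counts = [sum(row) for row in conf]
    pred_counts = [sum(conf[i][j] for i in range(num_classes))
                   for j in range(num_classes)]
    correct = [conf[i][i] for i in range(num_classes)]

    # Average only over classes that actually occur in the ground truth.
    present = [i for i in range(num_classes) if true_counts[i] > 0]
    iou = {i: correct[i] / (true_counts[i] + pred_counts[i] - correct[i])
           for i in present}

    return {
        "pixel_accuracy": sum(correct) / total,
        "mean_class_accuracy":
            sum(correct[i] / true_counts[i] for i in present) / len(present),
        "mean_iu": sum(iou.values()) / len(present),
        "fw_iu": sum(true_counts[i] * iou[i] for i in present) / total,
    }


# Toy example: class 0 is background, class 1 is instrument.
metrics = segmentation_metrics(
    truth=[0, 0, 0, 0, 1, 1, 1, 1],
    pred=[0, 0, 0, 1, 1, 1, 1, 0],
    num_classes=2,
)
print(metrics)
```

In this toy example six of eight pixels are labeled correctly, so the pixel accuracy and mean class accuracy are both 0.75, while each class has an intersection over union of 3/5, so the mean IU and fwIU are both 0.6.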

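Regarding claim 21, Zisimopoulos discloses wherein the instructions, which when executed by the one or more processing units, cause the one or more processing units to perform further operations comprising: receiving, from a surgical video data store, real surgical videos representative of the surgical scenario in a real environment, wherein the training configures the machine learning model to probabilistically identify the features from unlabeled real videos corresponding to the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).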
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store); and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).  TABLE 1 further shows that Black’s training refinement also outperforms synthetic only training as well.  Accordingly, it is predictable that Black’s training refinement would improve the accuracy of Zisimopoulos’s segmentations.

Regarding claim 23, Zisimopoulos discloses wherein the instructions, which when executed by the one or more processing units, cause the one or more processing units to perform further operations comprising: receiving, from the surgical video data store, real surgical videos representative of the surgical scenario in a real environment, wherein the training configures the machine learning model to probabilistically identify the features from unlabeled real videos corresponding to the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose a database; and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos.
However, Black discloses a database (see Black Figs. 1A-1C, and col. 4, ll. 34-37, and col. 16, ll. 33-44, where image data is stored in a database in a data store); and wherein the real surgical videos include the features annotated for the machine learning model; and after the pre-training, training the machine learning model with the real surgical videos (see Black TABLE 1 and col. 5, ll. 53 through col. 6, ll. 30, where “. . . the CNN 115 pre-trained with synthetic images and then fine-tuned with real images . . .”).
It would have been obvious to one of ordinary skill in the art at the time of filing to refine the machine learning model of Zisimopoulos in the manner taught and suggested by Black, because Black states that “[w]hen the CNN 115 is pre-trained with synthetic images and then fine-tuned with real images . . ., the model predicts very accurate segmentations and outperforms the "Real" version of the CNN 115 by a large margin” (see Black col. 6, ll. 23-27).

Claim(s) 22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zisimopoulos in view of Black as applied to claim 20 above, and in further view of Shrivastava.

Regarding claim 22, Zisimopoulos discloses wherein the unlabeled real images are representative of the surgical scenario in the real environment (see Zisimopoulos Figs. 4-6, and paras. 0036, 0037, 0048, 0072, 0075, 0082, and 0083, where the model is applied to a real data set to produce surgical instrument segmentations).
Zisimopoulos does not explicitly disclose wherein the instructions, which when executed by the one or more processing units, cause the one or more processing units to perform further operations comprising: providing, to a refiner neural network, simulated images from the simulated surgical videos before the pre-training; and refining the simulated surgical videos with the refiner neural network, wherein the refiner neural network adjusts the simulated images until a discriminator neural network determines that the simulated images are comparable to unlabeled real images within a first threshold, wherein the features annotated from the simulated surgical videos are included after the refining.
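However, Shrivastava discloses wherein the instructions, which when executed by the one or more processing units, cause the one or more processing units to perform further operations comprising: providing, to a refiner neural network, simulated images from the simulated surgical videos before the pre-training; and refining the simulated surgical videos with the refiner neural network, wherein the refiner neural network adjusts the simulated images until a discriminator neural network determines that the simulated images are comparable to unlabeled real images within a first threshold, wherein the features annotated from the simulated surgical videos are included after the refining (see Shrivastava Figs. 1-6, and Algorithm 1, and pgs. 2244-2247 “2. S+U Learning with SimGAN,” where simulated images are refined using a refiner network until a discriminator network, applied to both refined simulated images and real images, passes a threshold “Visual Turing Test”).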
It would have been obvious to one of ordinary skill in the art at the time of filing to use Shrivastava’s GAN to refine the simulated videos of Zisimopoulos, because it is predictable that doing so would improve the quality of the simulated video by making the simulated video more realistic, and Shrivastava states “[w]e show a significant improvement over using synthetic images . . .” (see Shrivastava Abstract). 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW M MOYER whose telephone number is (571)272-9523.  The examiner can normally be reached on Monday-Friday 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW M MOYER/
Primary Examiner, Art Unit 2663