DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
2.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
3.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

4.	Claims 1-5, 7, 9-14, 16-21 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Yoshino et al. (US 2020/0229688) in view of McDowall et al. (US 2016/0089013).
Regarding claim 1, Yoshino teaches a method comprising: obtaining, from an endoscope (e.g., abstract) and between first and second output video frames of a video having a frame rate (e.g., controlling a focus position of an objective optical system, acquiring images sequentially captured by an image sensor, and combining the images in N frames, e.g., first and second output video frames, image sensor captures the images at the frame rate, N frame rate, figs. 1, 3 and 5, frame position, N frame rates, paragraph 0042), a preliminary image of a scene during a surgical procedure (e.g., images in frames F1, F2, F3 and F4 are respectively captured at focus positions; a robot is moved to perform surgery on a patient; the user operates the operation section of the control device to manipulate the scope via a robot and photograph (image) a surgical region; the user operates the robot while seeing the images displayed on a display device, figs. 1-19, paragraphs 0031, 0150), the preliminary image comprising a plurality of pixels (e.g., images/output images comprise a plurality of pixels, paragraph 0135), wherein the first and second output video frames are consecutive video frames in the real-time video (e.g., sequential outputs of images, paragraphs 0038, 0082, 0084), determining, for each pixel of the plurality of pixels, a depth within the scene (e.g., the best focused image of the images in N frames is selected in each local region, e.g., each pixel, of the image, and used to form the depth of field extended image, paragraph 0038), determining first image capture parameters for the scene based on a scene illumination setting and a first set of pixels within a first range of depths in the scene (e.g., the best focused image of the images in N frames is selected in each local region, e.g., each pixel, of the image, and used to form the depth of field extended image; when the illumination control section controls the illumination light, the brightness of the respective images in N frames to be combined into the depth of field extended image in one frame may vary (depth range); the quantity of light emission when the image IB1 is captured may be calculated from the depth of field extended image combining the images IA1 and IA2, and the quantity of light emission when the image IB2 is captured may be calculated from the depth of field extended image combining the images IA2 and IB1, paragraphs 0038, 0071, 0148), capturing, between the first and second output video frames, a first image using the first image capture parameters (e.g., acquiring images sequentially captured by an image sensor, and combining the images in N frames, illumination light control section changes a quantity of light emission of the illumination light (parameters) in a period between a period when the images in the first set of N frames are captured, abstract, paragraph 0007), determining an illumination correction for a second set of pixels at a second range of depths within the scene (e.g., when the illumination control section controls the illumination light, the brightness of the respective images in N frames to be combined into the depth of field extended image in one frame may vary (depth range); when the light adjustment is performed between the images IA2 and IB1, the images having different brightness are combined into the second depth of field extended image; in this case, the correction process for correcting the image brightness by the image processing is performed to equalize the brightness of the images IA2 and IB1; then, the resultant images are combined into the second depth of field extended image; the light adjustment correction section calculates an illumination correction coefficient, paragraphs 0071, 0072, 0144), capturing, between the first and second output video frames, a second image using the illumination correction (e.g., illumination correction coefficient C for controlling the quantity of light emission E1 is calculated based on at least one of the images IA1 and IA2, paragraph 0146), generating a composite image based on the first and second images (e.g., combining images IA1 and IA2 in two frames (N=2) into a depth of field extended image EIA in one frame (composite image), N frames to be combined into the depth of field extended image in one frame (composite image), paragraphs 0034, 0071), and outputting the composite image as the second output video frame (e.g., fig. 4, display section, also sequentially outputting the digital images, paragraphs 0034, 0082-0084).
	Yoshino is silent with regard to a real-time video.
However, McDowall clearly teaches a real-time video (e.g., an image with the superimposed highlighted fluorescence image is provided in real time to a surgeon performing a surgical operation, paragraphs 0011, 0048).
In view of the above, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to include a real-time video as taught by McDowall into the system of Yoshino, for the purpose of providing processes that are performed for each frame so that the surgeon sees a real-time video image of tissue while performing the operation, as suggested by the reference.
Regarding claim 2, the combination of Yoshino and McDowall teaches the method of claim 1, further comprising: determining a second illumination correction for a third set of pixels at a third range of depths within the scene (e.g., images in N frames that have undergone a correction process to make image brightness constant, into the depth of field extended image; N frames is selected in each local region, e.g., each pixel, of the image; when the illumination control section controls the illumination light, the brightness of the respective images in N frames to be combined into the depth of field extended image in one frame may vary (depth range), abstract, paragraphs 0038, 0071 of Yoshino), capturing, between the first and second output video frames, a third image using the second illumination correction (e.g., combines the images in N frames that have been controlled to receive a constant quantity of light emission of illumination light or the images in N frames that have undergone a correction process to make image brightness constant, into the depth of field extended image, abstract, paragraph 0071 of Yoshino), and wherein generating a composite image is further based on the third image (e.g., N frames to be combined into the depth of field extended image in one frame (composite image), sequentially outputs the digital images/output images, paragraph 0071 of Yoshino).
	Regarding claim 3, the combination of Yoshino and McDowall teaches the method of claim 1, wherein obtaining the preliminary image comprises obtaining stereoscopic preliminary images (e.g., an image IA1 captured in the first light emission period TLA1 and an image IA2 captured in the second light emission period TLA2 are combined into a depth of field extended image EIA, an additional image may be captured at another different focus position (stereoscopic image), paragraph 0048 of Yoshino, also paragraphs 0006-0007 of ‘013), determining the depth within the scene comprises determining, for each pixel of each stereoscopic preliminary image, the depth within the scene (e.g., an image IA1 captured in the first light emission period TLA1 and an image IA2 captured in the second light emission period TLA2 are combined into a depth of field extended image EIA; furthermore, an additional image may be captured at another different focus position and the images including the additional image in three or more frames may be combined into the depth of field extended image, paragraph 0048 of Yoshino), capturing the first image comprises capturing stereoscopic first images (e.g., an image IA1 captured and an image IA2 captured in the second light emission period TLA2 are combined into a depth of field extended image EIA; furthermore, an additional image may be captured at another different focus position and the images including the additional image in three or more frames may be combined into the depth of field extended image; the light adjustment detection section calculates a light adjustment detection value (compensate for scene lighting) from the pixel value of the image output, paragraphs 0048, 0143 of Yoshino), capturing the second image comprises capturing stereoscopic second images (e.g., an additional image may be captured at another different focus position (stereoscopic second images), paragraph 0048 of Yoshino, also paragraphs 0006-0007 of ‘013), generating the composite image comprises generating composite stereoscopic images from the first and second stereoscopic images (e.g., an image IA1 captured and an image IA2 captured in the second light emission period TLA2 are combined into a depth of field extended image EIA; furthermore, an additional image may be captured at another different focus position and the images including the additional image in three or more frames may be combined into the depth of field extended image, paragraph 0048 of Yoshino, also figs. 2-3, paragraphs 0050-0051 of ‘013), and outputting the composite image comprises outputting the composite stereoscopic images (e.g., figs. 2-3 of ‘013).
	Regarding claim 4, the combination of Yoshino and McDowall teaches the method of claim 1, wherein the illumination correction comprises one or more of (i) adjusting a light source, (ii) adjusting an image sensor integration time, or (iii) adjusting an image sensor gain (e.g., the illumination control section controls the illumination light to adjust the brightness of the depth of field extended image, light adjustment control, paragraphs 0040, 0142 of Yoshino).
	Regarding claim 5, the combination of Yoshino and McDowall teaches the method of claim 1, further comprising projecting structured light onto the scene (e.g., combines the images in N frames that have been controlled to receive a constant quantity of light emission of illumination light, abstract, paragraph 0033 of Yoshino, also paragraphs 0073-0074 of ‘013), and wherein determining, for each pixel of the plurality of pixels, the depth within the scene is based on detecting the structured light in the preliminary image (e.g., the best focused image of the images in N frames is selected in each local region, e.g., each pixel, of the image, and used to form the depth of field extended image, paragraph 0038 of Yoshino).
	Regarding claim 7, the combination of Yoshino and McDowall teaches the method of claim 1, further comprising determining the first and second ranges of depths based on a total number of pixels in the preliminary image (e.g., the depth of field extended image is an image whose depth of field is artificially extended based on a plurality of images having different focus positions; for example, a best focused image of the images in N frames is selected in each local region (e.g., each pixel) of the image, and used to form the depth of field extended image, paragraph 0038 of Yoshino) and a number of images to be captured between the first and second output video frames (e.g., combines the images IA1 and IA2 in two frames (N=2) into a depth of field extended image EIA in one frame, paragraph 0034 of Yoshino).
	Regarding claims 9-10 and 17, the limitations claimed are substantially similar to those of claim 1 above, and have been addressed in connection with claim 1. As for the additional limitation of a communication interface, please see fig. 4, interface of Yoshino.
	Regarding claims 11-14, 16, 18-21 and 23, the limitations claimed are substantially similar to those of claims 2-5 and 7 above, and have been addressed in connection with claims 2-5 and 7.
5.	Claims 6, 15 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Yoshino et al. (US 2020/0229688) in view of McDowall et al. (US 2016/0089013) and further in view of Douglas et al. (US 2015/0065803).
	Regarding claim 6, the combination of Yoshino and McDowall teaches the method of claim 1, including determining, for each pixel of the plurality of pixels, the depth within the scene (e.g., a best focused image of the images in N frames is selected in each local region, e.g., each pixel, of the image, and used to form the depth of field extended image, paragraph 0038 of ‘688).
The combination is silent with regard to using a machine learning model to estimate, for each pixel of the plurality of pixels, the depth within the scene.
However, Douglas, in the same field of endeavor, teaches using a machine learning model to estimate, for each pixel of the plurality of pixels, the depth within the scene (e.g., calculating pixel values of regions from the image; features may include region area (number of pixels); features are fed into a pre-trained supervised machine learning classification model to predict whether the given connected region represents a region of increasing depth in the image, paragraphs 0075, 0076, 0315, 0316).
In view of the above, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to include a machine learning model to estimate, for each pixel of the plurality of pixels, the depth within the scene as taught by Douglas into the system of Yoshino, for the purpose of analyzing the video recording to detect and track a region of increasing depth in order to provide proper guidance during the procedure, as suggested by the reference.
Regarding claims 15 and 22, the limitations claimed are substantially similar to those of claim 6 above, and have been addressed in connection with claim 6.
6.	Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Yoshino et al. (US 2020/0229688) in view of McDowall et al. (US 2016/0089013) and further in view of Yaguchi (US 2017/0039709).
Regarding claim 8, the combination of Yoshino and McDowall teaches the method of claim 1, but fails to teach wherein the preliminary image and the first image are the same image.
However, Yaguchi, in the same field of endeavor, teaches the above claimed feature, wherein the preliminary image and the first image are the same image (e.g., the forward reference (preliminary) image is the first image, paragraphs 0332, 0035).
In view of the above, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to include wherein the preliminary image and the first image are the same image as taught by the reference, in order to reduce the number of images included in the output summary image sequence.
Contact Information
7.	Any inquiry concerning this communication or earlier communications from the examiner
should be directed to Behrooz Senfi whose telephone number is 571-272-7339. The examiner can
normally be reached on M-F 10:00-6:00.
	Examiner interviews are available via telephone, in-person, and video conferencing using a
USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use
the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
	If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor,
Kelley Christopher, can be reached on 571-272-7331. The fax phone number for the organization where
this application or proceeding is assigned is 571-273-8300.
	Information regarding the status of an application may be obtained from the Patent Application
Information Retrieval (PAIR) system. Status information for published applications may be obtained
from either Private PAIR or Public PAIR. Status information for unpublished applications is available
through Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact
the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information system, call
800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BEHROOZ M SENFI/Primary Examiner, Art Unit 2482