DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 5-11, 13-15, 17, 19-21, 24-30, 32-33, 35, and 37-38 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S PG-PUB NO. 20200175326 A1 (PRO NO. 62/775287)) in view of Yoo (U.S PG-PUB NO. 20180060648 A1), and further in view of Nobuhiro (JP 2018112890 A), in view of Kozitsky et al (U.S PG-PUB NO. 20170262723 A1).
-Regarding claim 1, Shen discloses an object detection method comprising (Abstract; FIG. 2; FIG. 4): setting a first window region (FIGS. 2-4, steps 204-206) and a second window region (FIGS. 2-4 steps 202) corresponding to partial regions of different sizes in an input image, wherein the second window region is larger than the first window region ([0034]-[0036], “crop of the image”; FIGS.2-4, steps 204-206); downsampling the second window region to generate a resized second window region (FIGS. 2-4, step 208; [0037]); detecting a first object candidate from the first window region (FIGS. 2-4, step 210; [0038]-[0039], “high resolution crop … to a detector”) and a second object candidate from the resized second window region (FIGS. 2-4, step 210; [0038]-[0039], “the low resolution image to a detector”); and detecting an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract: FIG. 2, step 212; FIG. 4; [0016]; [0038]-[0039]; [0046]; [0069]; [0078]), the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate, wherein the detecting of the first object candidate and the second object candidate comprises (Abstract; FIG. 2; FIG. 4): in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detecting the first object candidate by applying a first image extracted from the first window region to a first neural network ([0012]).
Shen is silent to teach a second window region corresponding to partial regions of different sizes in an input image.
In the same field of endeavor, Yoo teaches a second window region (Yoo: FIG. 3B 335; FIG. 4B 410) corresponding to partial regions of different sizes in an input image (Yoo: FIG. 3A 310, 320; FIG. 330).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Shen with the teaching of Yoo by using partial regions of different sizes in an input image in order to reduce resource consumption and an amount of calculation.
Although Shen in view of Yoo teaches detecting the object included in the input image from one or both of the first object candidate and the second object candidate and removing duplicates by applying non-maximum suppression (NMS) to the combined outputs (Shen: FIGS 2-4 step 212; [0044]), and NMS is well known to remove or suppress overlapping regions for object detection (Liu et al arXiv 1901.03796v1), Shen in view of Yoo is silent to teach detecting of the object including removing an overlapping region from the first object candidate and the second object candidate.
However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo with the teaching of Nobuhiro by detecting of the object including removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
Shen in view of Yoo, and further in view of Nobuhiro is silent to teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detecting the first object candidate by applying a first image extracted from the first window region to a first neural network.
However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”, “identify the reason for failure”), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG.5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by detecting the first object candidate by applying a first image extracted from the first window region to a first neural network in a case in which the second object candidate is not detected in a second image extracted from the resized second window region in order to quickly detect the objects, reduce cost, and save computation power while maintaining the required accuracy.
-Regarding claim 2, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the method of claim 1.
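For illustration only, the conditional detection discussed above for claim 1 can be sketched as follows. This is a minimal sketch of the claim language as characterized in this action, not code taken from Shen, Yoo, Nobuhiro, or Kozitsky. The names `detect_with_fallback`, `crop`, `downsample`, `first_net`, and `second_net` are hypothetical, images are assumed to be plain nested lists, and nearest-neighbour subsampling is assumed only for brevity. The early return when the second detector finds a candidate corresponds to the further limitation addressed under claim 15 and is included only to make the branch explicit.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]          # (x1, y1, x2, y2) in pixels
Image = List[List[int]]                  # row-major grayscale image (assumption)
Detector = Callable[[Image], List[Box]]  # hypothetical detector interface


def crop(image: Image, window: Box) -> Image:
    """Extract the partial region of the image covered by the window."""
    x1, y1, x2, y2 = window
    return [row[x1:x2] for row in image[y1:y2]]


def downsample(image: Image, factor: int) -> Image:
    """Keep every factor-th pixel in both directions (nearest neighbour)."""
    return [row[::factor] for row in image[::factor]]


def detect_with_fallback(image: Image, first_window: Box, second_window: Box,
                         factor: int, first_net: Detector,
                         second_net: Detector) -> List[Box]:
    # Second (larger) window: crop, downsample, then run the second detector.
    second_image = downsample(crop(image, second_window), factor)
    sx, sy = second_window[0], second_window[1]
    second = [(sx + a * factor, sy + b * factor, sx + c * factor, sy + d * factor)
              for a, b, c, d in second_net(second_image)]
    if second:
        # A second object candidate was found; the first network is not run.
        return second
    # No second object candidate: apply the full-resolution first image
    # (smaller window) to the first detector.
    fx, fy = first_window[0], first_window[1]
    first_image = crop(image, first_window)
    return [(fx + a, fy + b, fx + c, fy + d)
            for a, b, c, d in first_net(first_image)]
```

Boxes returned by either detector are mapped back to input-image coordinates so that the two candidate sets are directly comparable.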
Shen discloses wherein the setting of the first window region and the second window region comprises: setting the first window region (FIGS. 2-4 steps 204-206) and the second window region based on an attention point of a user in the input image ([0036], “priority FOV”).
Shen is silent to teach setting the second window region based on an attention point of a user in the input image.

In the same field of endeavor, Yoo teaches setting the second window region (Yoo: FIG. 3B 335; FIG. 4B 410) based on an attention point of a user in the input image (Yoo: [0085], “eye region 335”; “global region 410”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by setting the first window region and the second window region based on an attention point of a user in the input image in order to reduce resource consumption and an amount of calculation.
-Regarding claim 5, the modification further discloses wherein the downsampling of the second window region comprises: adjusting a second size of the second window region to be equal to a first size of the first window region by downsampling an image corresponding to the second window region (Shen: [0037], “downsampled image may be set according to the size of the cropped region”; [0051]).
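As an illustration of the size adjustment discussed for claim 5, the sketch below resizes a larger window to the pixel dimensions of a smaller one. It is a minimal sketch under stated assumptions and is not drawn from Shen: the function name `resize_nearest` is hypothetical, the image is assumed to be a nested list, and nearest-neighbour sampling stands in for whatever downsampling filter an actual system would use.

```python
from typing import List

Image = List[List[int]]  # row-major grayscale image (assumption)


def resize_nearest(image: Image, out_w: int, out_h: int) -> Image:
    """Resize an image to out_w x out_h by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A 4x4 second window reduced to the 2x2 size of a first window.
second_window_image = [[r * 4 + c for c in range(4)] for r in range(4)]
resized = resize_nearest(second_window_image, 2, 2)
```

After the call, the resized second window has the same dimensions as the first window, so both can be passed to detectors that expect the same input size.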
-Regarding claim 6, the modification further discloses wherein the first size of the first window region and the second size of the second window region are determined based on one or more of a type of the object to be detected from the input image, a field of view (FoV), and camera distance information corresponding to the input image (Shen: FIG. 2 steps 204-206; [0017]; [0034]-[0035]).
-Regarding claim 7, the modification further discloses wherein the first window region is configured to recognize an object having a size less than a preset size (Shen: FIGS 2-4 206; [0018], “a center third of the image”), and a first image extracted from the first window region has a same resolution as the input image (Shen: [0036], “same resolution as the raw image”).
-Regarding claim 8, the modification further discloses wherein the second window region is configured to recognize an object having a size greater than a preset size (Shen: FIGS. 2-4 steps 202; [0018], “a center third of the image”), and a second image extracted from the resized second window region has a resolution lower than a resolution of the input image (Shen: [0037], “a low resolution version … downsampled field of vision”).
-Regarding claim 9, the modification further discloses wherein the detecting of the object comprises: detecting the object included in the input image from one or both of the first object candidate and the second object candidate using non-maximum suppression (NMS) (Shen: FIGS 2-4 step 212; [0044], “applying non-maximum suppression (NMS) to the combined outputs”).
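For reference, a generic greedy form of non-maximum suppression over the combined candidates can be sketched as below. This is a textbook-style sketch, not the specific procedure of Shen [0044] or of the application; the function names, the `(box, score)` candidate format, and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Candidate = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(candidates: List[Candidate], iou_threshold: float = 0.5) -> List[Candidate]:
    """Keep the highest-scoring box among each group of overlapping boxes."""
    kept: List[Candidate] = []
    for box, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


# Candidates from the first window and from the resized second window,
# both already expressed in input-image coordinates (assumption).
first_candidates = [((10, 10, 50, 50), 0.9)]
second_candidates = [((12, 12, 52, 52), 0.8), ((100, 100, 140, 140), 0.7)]
detections = nms(first_candidates + second_candidates)
```

In this example the two heavily overlapping boxes collapse to the higher-scoring one, while the distant box is retained.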
-Regarding claim 10, the modification further discloses wherein the detecting of the first object candidate and the second object candidate comprises: detecting the first object candidate based on whether the second object candidate is detected from the resized second window region (Shen: [0046], “estimate or determination … car … needed to properly match, merge, and/or de-duplicate said car from the cropped image with the same car detected in the full image”).
-Regarding claim 11, the modification further discloses wherein the detecting of the object comprises: adjusting a location of the first window region based on a location of the second object candidate (Shen: [0018], “a road”, “vanishing line”, “turning road”; [0024]-[0025], “adjusting … based on vehicle heading”, “image-only based rules, attention-based networks, …”; [0034]); detecting the first object candidate from the adjusted location of the first window region (Shen: [0018]; FIGS. 2-4 step 210); and detecting the object included in the input image from the second object candidate and the first object candidate detected from the adjusted location (Shen: [0018]; FIGS. 2-4 steps 210-212).
-Regarding claim 13, the modification further discloses wherein the detecting of the first object candidate and the second object candidate comprises: detecting the second object candidate by applying, to a second neural network (Shen: FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”), a second image extracted from the resized second window region (Shen: FIG. 4, steps 210-212; [0045], “second bounding box”); and detecting the first object candidate by applying, to the first neural network (Shen: FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”), the first image extracted from the first window region (Shen: FIG. 4, steps 210-212; [0045], “first bounding box”).
-Regarding claim 14, the modification further discloses wherein the detecting of the first object candidate and the second object candidate comprises (Shen: [0039], “The images … can be fed into …, two parallel instances of the same detector, different detectors”): determining whether the second object candidate is detected from the second image extracted from the resized second window region by applying the second image to the second neural network (Shen: FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”); and determining whether to apply a first image extracted from the first window region to the first neural network to detect the first object candidate based on whether the second object candidate is detected from the second image (Shen: [0046], “match … same car detected in the full image”; FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”).
-Regarding claim 15, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the method of claim 14.
Shen in view of Yoo, and further in view of Nobuhiro is silent to teach that in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network.
However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is detected from the second image (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”; [0043], “test is satisfied”; [0052]), not applying the first image to the first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG.5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by not applying the first image to the first neural network in a case in which the second object candidate is detected from the second image.
-Regarding claim 17, the modification further discloses wherein the detecting of the first object candidate comprises (Shen: FIGS. 2-4; Abstract): estimating a location in the second image at which the object is estimated to be located (Shen: FIGS. 2-4, step 212; [0039], “parallel”, “labeled boxes … surrounding object detected”, “object location”); adjusting a location of the first window region using the estimated location (Shen: [0040]; [0041]-[0042], “scaling the detected object”, “include: … location”; [0043], “aligned”); extracting the first image from the adjusted location of the first window region (Shen: FIGS. 2-4, step 212); and detecting the first object candidate by applying, to the first neural network (Shen: Abstract; FIGS. 2-4, step 210;[0038]-[0039]), the first image extracted from the adjusted location (Shen: FIGS. 2-4, step 212).
-Regarding claim 19, the modification further discloses a non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the object detection method (Shen: [0071]).
-Regarding claim 20, Shen discloses an object detection apparatus comprising (FIG.1 system 100): a communication interface configured to obtain an input image (FIG. 1; [0020]-[0022]; [0030]); and a processor configured to ([0019]-[0020]; [0032]; [0038]-[0039]): set a first window region (FIGS. 2-4 steps 204-206) and a second window region (FIGS. 2-4 steps 202) larger than the first window region ([0034]-[0036], “crop of the image”; FIGS.2-4, steps 204-206) that correspond to partial regions of different sizes in the input image; downsample the second window region to generate a resized second window region (FIGS. 2-4 step 208; [0037]); detect a first object candidate from the first window region (FIGS. 2-4, step 210; [0038]-[0039], “high resolution crop … to a detector”) and a second object candidate from the resized second window region (FIGS. 2-4, step 210; [0038]-[0039], “the low resolution image to a detector”); and detect an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract: FIG. 2, step 212; FIG. 4; [0016]; [0038]-[0039]; [0046]; [0069]; [0078]), wherein the processor is configured to ([0019]-[0020]; [0032]; [0038]-[0039]): for the detecting of the object, remove an overlapping region from the first object candidate and the second object candidate; and in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detect the first object candidate by applying a first image extracted from the first window region to a first neural network ([0012]).
Shen is silent to teach a second window region corresponding to partial regions of different sizes in an input image.
In the same field of endeavor, Yoo teaches an object detection apparatus comprising (Yoo: FIG. 8): a communication interface configured to obtain an input image (Yoo: FIG. 8, network 870 device 850 camera 830); and a processor (Yoo: FIG. 8 processor 810) configured to set a second window region (Yoo: FIG. 3B 335; FIG. 4B 410) corresponding to partial regions of different sizes in an input image (Yoo: FIG. 3A 310, 320; FIG. 330).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Shen with the teaching of Yoo by using partial regions of different sizes in an input image in order to reduce resource consumption and an amount of calculation.
Although Shen in view of Yoo teaches detecting the object included in the input image from one or both of the first object candidate and the second object candidate and removing duplicates by applying non-maximum suppression (NMS) to the combined outputs (Shen: FIGS 2-4 step 212; [0044]), and NMS is well known to remove or suppress overlapping regions for object detection (Liu et al arXiv 1901.03796v1), Shen in view of Yoo is silent to teach detecting of the object including removing an overlapping region from the first object candidate and the second object candidate.
However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo with the teaching of Nobuhiro by detecting of the object including removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
Shen in view of Yoo, and further in view of Nobuhiro is silent to teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detecting the first object candidate by applying a first image extracted from the first window region to a first neural network.
However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”, “identify the reason for failure”), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG.5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by detecting the first object candidate by applying a first image extracted from the first window region to a first neural network in a case in which the second object candidate is not detected in a second image extracted from the resized second window region in order to quickly detect the objects, reduce cost, and save computation power while maintaining the required accuracy.
-Regarding claim 21, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the apparatus of claim 20.
Shen discloses wherein the setting of the first window region and the second window region comprises: setting the first window region (FIGS. 2-4 steps 204-206) and the second window region based on an attention point of a user in the input image ([0036], “priority FOV”).
Shen is silent to teach setting the second window region based on an attention point of a user in the input image.
In the same field of endeavor, Yoo teaches setting the second window region (Yoo: FIG. 3B 335; FIG. 4B 410) based on an attention point of a user in the input image (Yoo: [0085], “eye region 335”; “global region 410”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen with the teaching of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by setting the first window region and the second window region based on an attention point of a user in the input image in order to reduce resource consumption and an amount of calculation.
-Regarding claim 24, the modification further discloses wherein the processor is configured to: adjust a second size of the second window region to be equal to a first size of the first window region by downsampling an image corresponding to the second window region (Shen: [0037], “downsampled image may be set according to the size of the cropped region”; [0051]).
-Regarding claim 25, the modification further discloses wherein the first size of the first window region and the second size of the second window region are determined based on one or more of a type of the object to be detected from the input image, a field of view (FoV), and camera distance information corresponding to the input image (Shen: FIG. 2 steps 204-206; [0017]; [0034]-[0035]).
-Regarding claim 26, the modification further discloses wherein the first window region is configured to recognize an object having a size less than a preset size (Shen: FIGS 2-4 206; [0018], “a center third of the image”), and a first image extracted from the first window region has a same resolution as the input image (Shen: [0036], “same resolution as the raw image”).
-Regarding claim 27, the modification further discloses wherein the second window region is configured to recognize an object having a size greater than a preset size (Shen: FIGS. 2-4 steps 202; [0018], “a center third of the image”), and a second image extracted from the resized second window region has a resolution lower than a resolution of the input image (Shen: [0037], “a low resolution version … downsampled field of vision”).
-Regarding claim 28, the modification further discloses wherein the processor is configured to: detect the object included in the input image from one or both of the first object candidate and the second object candidate using non-maximum suppression (NMS) (Shen: FIGS 2-4 step 212; [0044], “applying non-maximum suppression (NMS) to the combined outputs”).
-Regarding claim 29, the modification further discloses wherein the processor is configured to: detect the first object candidate based on whether the second object candidate is detected from the resized second window region (Shen: [0046], “estimate or determination … car … needed to properly match, merge, and/or de-duplicate said car from the cropped image with the same car detected in the full image”).
-Regarding claim 30, the modification further discloses wherein the processor is configured to: adjust a location of the first window region based on a location of the second object candidate (Shen: [0018], “a road”, “vanishing line”, “turning road”; [0024]-[0025], “adjusting … based on vehicle heading”, “image-only based rules, attention-based networks, …”; [0034]); detect the first object candidate from the adjusted location of the first window region (Shen: [0018]; FIGS. 2-4 step 210); and detect the object included in the input image from the second object candidate and the first object candidate detected from the adjusted location (Shen: [0018]; FIGS. 2-4 steps 210-212).
-Regarding claim 32, the modification further discloses wherein the processor is configured to: determine whether the second object candidate is detected from a second image extracted from the resized second window region (Shen: FIG. 4, steps 210-212; [0045], “second bounding box”) by applying the second image to a second neural network (Shen: FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”); and determine whether to apply a first image extracted from the first window region (Shen: FIG. 4, steps 210-212; [0045], “first bounding box”) to the first neural network (Shen: FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”) to detect the first object candidate based on whether the second object candidate is detected from the second image (Shen: [0046], “match … same car detected in the full image”).
-Regarding claim 33, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the apparatus of claim 32.
Shen in view of Yoo, and further in view of Nobuhiro is silent to teach that in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network.
However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is detected from the second image (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”; [0043], “test is satisfied”; [0052]), not applying the first image to the first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG.5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by not applying the first image to the first neural network in a case in which the second object candidate is detected from the second image.
-Regarding claim 35, the modification further discloses wherein the processor is configured to: estimate a location in the second image at which the object is estimated to be located (Shen: FIGS. 2-4, step 212; [0039], “parallel”, “labeled boxes … surrounding object detected”, “object location”); adjust a location of the first window region using the estimated location (Shen: [0040]; [0041]-[0042], “scaling the detected object”, “include: … location”; [0043], “aligned”); extract the first image from the adjusted location of the first window region (Shen: FIGS. 2-4, step 212); and detect the first object candidate by applying, to the first neural network (Shen: Abstract; FIGS. 2-4, step 210;[0038]-[0039]), the first image extracted from the adjusted location (Shen: FIGS. 2-4, step 212).
-Regarding claim 37, Shen discloses an object detection apparatus comprising (FIG.1 system 100): a communication interface configured to obtain an input image (FIG. 1; [0020]-[0022]; [0030]); and a processor configured to ([0019]-[0020]; [0032]; [0038]-[0039]): set a first window region of a first size (FIGS. 2-4, steps 204-206) and a second window region of a second size (FIGS. 2-4 steps 202) larger than the first size corresponding to partial regions of the input image ([0034]-[0036], “crop of the image”; FIGS.2-4, steps 204-206); generate a resized second window region from the second window region (FIGS. 2-4 step 208; [0037]); determine whether a second object candidate is detected from a second image extracted from the resized second window region (FIG. 4, steps 210-212; [0045], “second bounding box”) by applying the second image to a second neural network (FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”); determine whether to apply a first image extracted from the first window region to a first neural network (FIG.1, [0020]; [0038]-[0039], “a detector (e.g., running a deep learning neural network); FIG. 4”) to detect a first object candidate (FIG. 4, steps 210-212; [0045], “first bounding box”) based on whether the second object candidate is detected from the second image ([0046], “needed to properly match, merge, and/or de-duplicate said car from the cropped image with the same car detected in the full image”); and detect an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract: FIG. 2, step 212; FIG. 4; [0016]; [0038]-[0039]; [0046]; [0069]; [0078]), wherein the processor is configured to ([0019]-[0020]; [0032]; [0038]-[0039]): for the detecting of the object, remove an overlapping region from the first object candidate and the second object candidate; and in a case in which the second object candidate is not detected in the second image, detect the first object candidate by applying the first image to the first neural network ([0012]).
Shen is silent to teach setting the second window region based on an attention point of a user in the input image.
In the same field of endeavor, Yoo teaches an object detection apparatus comprising (Yoo: FIG. 8): a communication interface configured to obtain an input image (Yoo: FIG. 8, network 870 device 850 camera 830); and a processor (Yoo: FIG. 8 processor 810) configured to set the second window region (Yoo: FIG. 3B 335; FIG. 4B 410) based on an attention point of a user in the input image (Yoo: [0085], “eye region 335”; “global region 410”).

Although Shen in view of Yoo teaches detecting the object included in the input image from one or both of the first object candidate and the second object candidate and removing duplicates by applying non-maximum suppression (NMS) to the combined outputs (Shen: FIGS 2-4 step 212; [0044]), and NMS is well known to remove or suppress overlapping regions for object detection (Liu et al arXiv 1901.03796v1), Shen in view of Yoo is silent to teach detecting of the object including removing an overlapping region from the first object candidate and the second object candidate.
However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo with the teaching of Nobuhiro by detecting of the object including removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
Shen in view of Yoo, and further in view of Nobuhiro is silent to teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detecting the first object candidate by applying a first image extracted from the first window region to a first neural network.

However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”, “identify the reason for failure”), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG.5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by detecting the first object candidate by applying a first image extracted from the first window region to a first neural network in a case in which the second object candidate is not detected in a second image extracted from the resized second window region in order to quickly detect the objects, reduce cost, and save computation power while maintaining the required accuracy.
-Regarding claim 38, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the apparatus of claim 37.
Shen in view of Yoo, and further in view of Nobuhiro does not explicitly teach, in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network.
However, Kozitsky is analogous art pertinent to the problem to be solved in this application and discloses a system and method for object detection and classification (Kozitsky: Abstract; FIG. 2; FIG. 12). The captured image can then be classified according to a confidence driven classification based on classification criteria determined by the weak classifier and the secondary strong classifier (Kozitsky: Abstract; FIG. 2; FIG. 12). Kozitsky further discloses that in a case in which the second object candidate is detected from the second image (Kozitsky: FIG. 2, image 32, weak classifier 36, decision block 40; [0027]-[0031], “two-stage approach”, “primary confidence test”; [0043], “test is satisfied”; [0052]), not applying the first image to the first neural network (Kozitsky: FIGS. 2, 12, strong classifier 44; FIG. 5; [0028]; [0030], “CNNs”; [0043], “test is satisfied … passed on to …”; [0044], “block 44 … entered only when … confidence test fails”; [0051]-[0053]; [0070]-[0071]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro with the teaching of Kozitsky by detecting the first object candidate by not applying the first image to the first neural network in a case in .
Claims 3-4 and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. PG-PUB NO. 2020/0175326 A1 (PRO NO. 62/775287)) in view of Yoo (U.S. PG-PUB NO. 2018/0060648 A1), and further in view of Nobuhiro (JP 2018112890 A), in view of Kozitsky et al (U.S PG-PUB NO. 20170262723 A1), in view of Chhipa (U.S. PG-PUB NO. 2019/0221191 A1).
-Regarding claim 3 and claim 22, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the method of claim 2 and claim 21. 
Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky does not explicitly teach determining the attention point based on one or more of gaze information of the user, voice information of the user, and a gesture performed by the user.
However, Chhipa is analogous art pertinent to the problem to be solved in this application and further discloses determining the attention point based on one or more of gaze information of the user, voice information of the user, and a gesture performed by the user (Chhipa: Abstract; [0082], “track the eye-gaze … while … viewing objects”; [0096]; FIGS. 6B-7A).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky with the teaching of Chhipa by determining the attention point based on one or more of gaze information of the user, 
-Regarding claim 4 and claim 23, the modification further discloses wherein the setting of the first window region and the second window region comprises: setting the first window region of a first size based on the attention point and setting the second window region of a second size greater than the first size based on the attention point (Shen: FIG. 4, 204, 206).
Claims 12 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. PG-PUB NO. 2020/0175326 A1 (PRO NO. 62/775287)) in view of Yoo (U.S. PG-PUB NO. 2018/0060648 A1), and further in view of Nobuhiro (JP 2018112890 A), in view of Kozitsky et al (U.S PG-PUB NO. 20170262723 A1), in view of Gao (arXiv 1711.05187v2 27 Mar 2018).
-Regarding claim 12 and claim 31, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the methods of claim 11 and claim 30.
Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky does not explicitly teach wherein the adjusting of the location of the first window region comprises: estimating a location in the input image at which the object is estimated to be located based on the location of the second object candidate; and adjusting the location of the first window region using the estimated location.
However, Gao is analogous art pertinent to the problem to be solved in this application and further discloses estimating a location (Gao: Figure 1, region 1 or region 2) in the input image at which the object is estimated to be located based on the location of the second object candidate (Gao: Figure 1, output image of Q-net, region step 1 or step 2); and adjusting the location of the first window region using the estimated location (Gao: Figure 1, zoom region 1 or zoom region 2).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky with the teaching of Gao by adjusting the location of the first window region using the estimated location in order to improve detection with limited computation cost.
Claims 18 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. PG-PUB NO. 2020/0175326 A1 (PRO NO. 62/775287)) in view of Yoo (U.S. PG-PUB NO. 2018/0060648 A1), and further in view of Nobuhiro (JP 2018112890 A), in view of Kozitsky et al (U.S PG-PUB NO. 20170262723 A1), in view of Lin (U.S. PG-PUB NO. 2020/0151448 A1).
-Regarding claim 18, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the method of claim 1.
Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky does not explicitly teach obtaining the input image and an attention point corresponding to the input image.
However, Lin is analogous art pertinent to the problem to be solved in this application and further discloses obtaining the input image and an attention point corresponding to the input image (Lin: [0039], “speaking … “wheel” ”; FIG. 1, image 120).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of 
-Regarding claim 36, Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky discloses the method of claim 20. 
Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky does not explicitly teach wherein the communication interface is configured to obtain an attention point corresponding to the input image.
However, Lin is analogous art pertinent to the problem to be solved in this application and further discloses wherein the communication interface is configured to obtain an attention point corresponding to the input image (Lin: [0039], “speaking … “wheel” ”; FIG. 1, image 120, device 104-3, network 106; [0036]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Shen in view of Yoo, and further in view of Nobuhiro, in view of Kozitsky with the teaching of Lin by using a communication interface that is configured to obtain an attention point corresponding to the input image in order to quickly determine the object of interest to the user among one or more objects.
Claims 1, 5, 10, 13-15, 19-20, 24, 29, 32-33, and 37-38 are rejected under 35 U.S.C. 103 as being unpatentable over Dijkman et al (U.S PG-PUB NO. 20170011281 A1) in view of Porikli et al (U.S PG-PUB NO 20190213406 A1), and further in view of Nobuhiro (JP 2018112890 A).
-Regarding claim 1, Dijkman discloses an object detection method comprising (Abstract; FIGS. 1-9; [0070]-[0105]): setting a first window region (FIG. 9, 920, 902, 910, 960) and a second window region (FIG. 9, 920, 902, 904) corresponding to partial regions of different sizes in an input image (FIG. 9, 920, 902), wherein the second window region is larger than the first window region (FIG. 9, 920, 902, 906; [0081], “crops and scales the image”; [0077], “In one example, … a 14×14 image”); downsampling the second window region to generate a resized second window region (FIG. 9, 950; [0078], “down sample … to a 7×7 image”); detecting a first object candidate from the first window region (FIG. 9, 910, 912, 914; [0074], “local path 910 examines a portion … image”; [0081]; [0083]; [0089]-[0091]) and a second object candidate from the resized second window region (FIG. 9, 904, 906, 908; [0074], “global path 904 examines an entire image”; [0076]; [0078], “down sample”; [0079]-[0080]); and detecting an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract; FIGS. 1-9; [0074], [0076], “concurrently”; [0102], “cascaded”; [0100], “in another aspect, not all …”; [0101]; [0114]), the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate, wherein the detecting of the first object candidate and the second object candidate comprises (Abstract; FIGS. 1-9; [0070]-[0105]): in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detecting the first object candidate by applying a first image extracted from the first window region to a first neural network ([0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0102], “cascaded”; [0084]-[0085]; [0114], “steps and/or actions may be interchanged”; [0074]; [0090]-[0091]).
Dijkman does disclose, while applying efficient power management, that the context path 906 is run first to indicate whether specific objects are expected in a scene and then a decision may be made whether to run the local path 910 (FIG. 9; [0099]-[0101]).
Dijkman does not explicitly teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (FIG. 9, 920, 902, 904, 906, 908), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (FIG. 9, 920, 902, 910, 912, 914).
In the same field of endeavor, Porikli discloses a two-stage object detection method based on rough image detection with global region of interest (ROI) detector and detailed image object detection (Porikli: Abstract; FIGS. 2-9, 13; [0061]; [0080]-[0086]; [0093]-[0095]). Porikli further teaches that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Porikli: FIG. 4, step 405-410, “NO”; [0015]; [0083], “threshold”; [0085]; [0086], “below the specified similarity threshold, hand detection is triggered”; [0094]-[0095]), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Porikli: FIG. 4, step 420; FIGS. 2-3, 7-8; [0027]-[0029], [0066], “deep learning”; [0068]; [0085]-[0086]; [0092], “neural network”; [0093]; [0095]).

Dijkman in view of Porikli does not explicitly teach the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate.
However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Dijkman in view of Porikli with the teaching of Nobuhiro such that the detecting of the object includes removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
-Regarding claim 5, the combination further discloses wherein the downsampling of the second window region comprises: adjusting a second size of the second window region to be equal to a first size of the first window region by downsampling an image (Dijkman: FIG. 9, 920, 902, 950; [0078], “down sample … to a 7×7 image”; 960).
-Regarding claim 10, the combination further discloses wherein the detecting of the first object candidate and the second object candidate comprises: detecting the first object candidate based on whether the second object candidate is detected from the resized second window region (Dijkman: FIG. 9; [0100], “decision may be made whether to run … local path”; [0101]-[0102]; [0084]-[0085]; [0114], “steps and/or actions may be interchanged”; [0090]-[0091]).
-Regarding claim 13, the combination further discloses wherein the detecting of the first object candidate and the second object candidate comprises (Dijkman: Abstract; FIGS. 1-9): detecting the second object candidate by applying, to a second neural network (Dijkman: FIG. 9, 951, 952, 953), a second image extracted from the resized second window region (Dijkman: FIG. 9, 920, 902, 904, 950, 906, 908); and detecting the first object candidate by applying, to the first neural network (Dijkman: FIG. 9, 961, 962, 963), the first image extracted from the first window region (Dijkman: FIG. 9, 920, 902, 910, 960, 912, 914).
-Regarding claim 14, the combination further discloses wherein the detecting of the first object candidate and the second object candidate comprises (Dijkman: Abstract; FIGS. 1-9): determining whether the second object candidate is detected from the second image extracted from the resized second window region by applying the second image to the second neural network (Dijkman: FIG. 9, 920, 902, 910, 906, 908, 950-953) and determining whether to apply a first image extracted from the first window region to the first neural network to detect the first object candidate based on whether the second object candidate is detected from the second image (Dijkman: FIG. 9, 920, 902, 910, 912, 914, 960-964; [0100], “decision may be made whether to run … local path”; [0101]-[0102]; [0114], “steps and/or actions may be interchanged”).
-Regarding claim 15, the combination further discloses that in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network (Dijkman: FIG. 9; [0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0101], “localization path 914 may be skipped when the goal is only to determine the presence or absence of a specific object in an image”, “dog is somewhere”; [0102]; [0114]).
-Regarding claim 19, the combination further discloses a non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the object detection method (Dijkman: FIG. 1; [0012]; [0033]; [0120]).
-Regarding claim 20, Dijkman discloses an object detection apparatus comprising (FIG. 1, system 100; FIG. 2): a communication interface configured to obtain an input image (FIG. 1, 110, 112, 114; [0032]; [0120]); and a processor configured to (FIG. 1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): set a first window region (FIG. 9, 920, 902, 910, 960) and a second window region (FIG. 9, 920, 902, 904) larger than the first window region that correspond to partial regions of different sizes in the input image (FIG. 9, 920, 902, 906; [0081], “crops and scales the image”; [0077], “In one example, … a 14×14 image”); downsample the second window region to generate a resized second window region (FIG. 9, 920, 902, 950; [0078], “down sample … to a 7×7 image”); detect a first object candidate from the first window region (FIG. 9, 910, 912, 914; [0074], “local path 910 examines a portion … image”; [0081]; [0083]; [0089]-[0091]) and a second object candidate from the resized second window region (FIG. 9, 904, 906, 908; [0074], “global path 904 examines an entire image”; [0076]; [0078], “down sample”; [0079]-[0080]); and detect an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract; FIGS. 1-9; [0074], [0076], “concurrently”; [0102], “cascaded”; [0100], “in another aspect, not all …”; [0101]; [0114]), wherein the processor is configured to (FIG. 1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): for the detecting of the object, remove an overlapping region from the first object candidate and the second object candidate; and in a case in which the second object candidate is not detected in a second image extracted from the resized second window region, detect the first object candidate by applying a first image extracted from the first window region to a first neural network ([0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0102], “cascaded”; [0084]-[0085]; [0114], “steps and/or actions may be interchanged”; [0074]; [0090]-[0091]).
Dijkman does disclose, while applying efficient power management, that the context path 906 is run first to indicate whether specific objects are expected in a scene and then a decision may be made whether to run the local path 910 (FIG. 9; [0099]-[0101]).
Dijkman does not explicitly teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (FIG. 9, 920, 902, 904, 906, 908), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (FIG. 9, 920, 902, 910, 912, 914).
In the same field of endeavor, Porikli discloses a two-stage object detection method based on rough image detection with global region of interest (ROI) detector and detailed image object detection (Porikli: Abstract; FIGS. 2-9, 13; [0061]; [0080]-[0086]; [0093]-[0095]). Porikli further teaches that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Porikli: FIG. 4, step 405-410, “NO”; [0015]; [0083], “threshold”; [0085]; [0086], “below the specified similarity threshold, hand detection is triggered”; [0094]-[0095]), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Porikli: FIG. 4, step 420; FIGS. 2-3, 7-8; [0027]-[0029], [0066], “deep learning”; [0068]; [0085]-[0086]; [0092], “neural network”; [0093]; [0095]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Dijkman with the teaching of Porikli by detecting the first object candidate by applying a first image extracted from the first window region to a first neural network in a case in which the second object candidate is not detected in a second image extracted from the resized second window region in order to quickly detect the objects and save computation power with required accuracy.
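For illustration only, the conditional detection order recited in the limitation discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not an implementation taken from Dijkman, Porikli, or the application: the function name `detect_cascaded` and the two detector callables are hypothetical stand-ins for the first and second neural networks, and each callable is assumed to return a (possibly empty) list of candidates.

```python
from typing import Any, Callable, List

# Hypothetical detector signature: takes an image, returns a list of candidates.
Detector = Callable[[Any], List[Any]]


def detect_cascaded(first_image: Any, second_image: Any,
                    first_network: Detector,
                    second_network: Detector) -> List[Any]:
    """Run the second (coarse) detector first; fall back to the first only on a miss."""
    # The second image is assumed to come from the resized second window region.
    second_candidates = second_network(second_image)
    if second_candidates:
        # A second object candidate was detected, so the first image
        # is not applied to the first network.
        return list(second_candidates)
    # No second object candidate: apply the first image to the first network.
    return list(first_network(first_image))
```

Under these assumptions the first network runs only when the coarse pass over the downsampled window finds nothing, which is the source of the computation saving given as the rationale above.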

However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Dijkman in view of Porikli with the teaching of Nobuhiro such that the detecting of the object includes removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
-Regarding claim 24, the combination further discloses wherein the processor is configured to: adjust a second size of the second window region to be equal to a first size of the first window region by downsampling an image corresponding to the second window region (Dijkman: FIG. 9, 920, 902, 950; [0078], “down sample … to a 7×7 image”; 960).
-Regarding claim 29, the combination further discloses wherein the processor is configured to (Dijkman: FIG.1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): detect the first object candidate based on whether the second object candidate is detected from the resized second window region (Dijkman: FIG. 9; [0100], “decision may be made whether to run … local path”; [0101]-[0102]; [0084]-[0085]; [0114], “steps and/or actions may be interchanged”; [0090]-[0091]).
-Regarding claim 32, the combination further discloses wherein the processor is configured to (Dijkman: FIG. 1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): determine whether the second object candidate is detected from the second image extracted from the resized second window region by applying the second image to the second neural network (Dijkman: FIG. 9, 920, 902, 910, 906, 908, 950-953) and determine whether to apply a first image extracted from the first window region to the first neural network to detect the first object candidate based on whether the second object candidate is detected from the second image (Dijkman: FIG. 9, 920, 902, 910, 912, 914, 960-964; [0100], “decision may be made whether to run … local path”; [0101]-[0102]; [0114], “steps and/or actions may be interchanged”).
-Regarding claim 33, the combination further discloses that in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network (Dijkman: FIG. 9; [0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0101], “localization path 914 may be skipped when the goal is only to determine the presence or absence of a specific object in an image”, “dog is somewhere”; [0102]; [0114]).
-Regarding claim 37, Dijkman further discloses an object detection apparatus comprising (FIG. 1, system 100; FIG. 2): a communication interface configured to obtain an input image (FIG. 1, 110, 112, 114; [0032]; [0120]); and a processor configured to (FIG. 1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): set a first window region of a first size (FIG. 9, 920, 902, 910, 960) and a second window region of a second size (FIG. 9, 920, 902, 904) larger than the first size corresponding to partial regions of the input image (FIG. 9, 920, 902, 906; [0081], “crops and scales the image”; [0077], “In one example, … a 14×14 image”); generate a resized second window region from the second window region (FIG. 9, 920, 902, 950; [0078], “down sample … to a 7×7 image”); determine whether a second object candidate is detected from a second image extracted from the resized second window region (FIG. 9, 920, 902, 904, 960, 906, 908) by applying the second image to a second neural network (FIG. 9, 951, 952, 953); determine whether to apply a first image extracted from the first window region to a first neural network (FIG. 9, 961, 962, 963) to detect a first object candidate (FIG. 9, 920, 902, 910, 960, 912, 914) based on whether the second object candidate is detected from the second image (FIG. 9, 920, 902, 910, 912, 914, 960-964; [0100], “decision may be made whether to run … local path”; [0101]-[0102]; [0114], “steps and/or actions may be interchanged”); and detect an object included in the input image based on one or both of the first object candidate and the second object candidate (Abstract; FIGS. 1-9; [0074], [0076], “concurrently”; [0102], “cascaded”; [0100], “in another aspect, not all …”; [0101]; [0114]), wherein the processor is configured to (FIG. 1, 102, 104, 106, 108; [0031]-[0033]; FIG. 2): for the detecting of the object, remove an overlapping region from the first object candidate and the second object candidate; and in a case in which the second object candidate is not detected in the second image, detect the first object candidate by applying the first image to the first neural network ([0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0102], “cascaded”; [0084]-[0085]; [0114], “steps and/or actions may be interchanged”; [0074]; [0090]-[0091]).
Dijkman does disclose, while applying efficient power management, that the context path 906 is run first to indicate whether specific objects are expected in a scene and then a decision may be made whether to run the local path 910 (FIG. 9; [0099]-[0101]).
Dijkman does not explicitly teach that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (FIG. 9, 920, 902, 904, 906, 908), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (FIG. 9, 920, 902, 910, 912, 914).
In the same field of endeavor, Porikli discloses a two-stage object detection method based on rough image detection with global region of interest (ROI) detector and detailed image object detection (Porikli: Abstract; FIGS. 2-9, 13; [0061]; [0080]-[0086]; [0093]-[0095]). Porikli further teaches that in a case in which the second object candidate is not detected in a second image extracted from the resized second window region (Porikli: FIG. 4, step 405-410, “NO”; [0015]; [0083], “threshold”; [0085]; [0086], “below the specified similarity threshold, hand detection is triggered”; [0094]-[0095]), detecting the first object candidate by applying a first image extracted from the first window region to a first neural network (Porikli: FIG. 4, step 420; FIGS. 2-3, 7-8; [0027]-[0029], [0066], “deep learning”; [0068]; [0085]-[0086]; [0092], “neural network”; [0093]; [0095]).

Dijkman in view of Porikli does not explicitly teach the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate.
However, Nobuhiro is analogous art pertinent to the problem to be solved in this application and also discloses the detecting of the object including removing an overlapping region from the first object candidate and the second object candidate (Nobuhiro: (57) Overview, “excluding the overlapping area”; FIGS. 5-6, 11-12; [0044]-[0048]; [0058]-[0062]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Dijkman in view of Porikli with the teaching of Nobuhiro such that the detecting of the object includes removing an overlapping region from the first object candidate and the second object candidate in order to accurately detect the objects.
-Regarding claim 38, the combination further discloses that in a case in which the second object candidate is detected from the second image, not applying the first image to the first neural network (Dijkman: FIG. 9; [0100], “For example, …“sunset” may not … “soccer game” may …”, “ decision may be made whether to run … local path”; [0101], “localization path 914 may be skipped when the goal is only to determine the presence or absence of a specific object in an image”, “dog is somewhere”; [0102]; [0114]).
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-15, 17-33 and 35-38 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571) 272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XIAO LIU/Examiner, Art Unit 2664
/PING Y HSIEH/Primary Examiner, Art Unit 2664