DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1-20, as originally filed, are currently pending and have been considered below.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 4-6, 9-11, 14-16, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wong, Lai-Kuan, and Kok-Lim Low. "Saliency-enhanced image aesthetics class prediction." Image Processing (ICIP), 2009 16th IEEE International Conference on. IEEE, 2009, hereinafter, “Wong”, in view of Liu, Nian, et al. "Predicting eye fixations using convolutional neural networks." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015, hereinafter, “Liu”.

As per claim 1, Wong discloses a method comprising: 
identifying a ground truth bounding box corresponding to an object of interest in an image (Wong, page 998, Figure 1(a); Wong, page 998, Section 3. Our Approach, determine a set of salient locations L); 
generating a binary image comprising image data within the ground truth bounding box (Wong, page 998, Figure 1(b)).
Wong further discloses computing a saliency map and salient locations (Wong, page 998, 4. Salient Regions Extraction, For each image, we compute the saliency map and the salient locations using Itti’s visual saliency model [4], which is built upon a biologically plausible architecture that exploits multi-scaled intensity, color and orientation image features and learnt the salient locations using a Winner-Take-All (WTA) neural network framework), but does not explicitly disclose the following limitations as further recited; however, Liu discloses:
providing the binary image to a neural network configured to output a synthetic saliency map (Liu, Abstract, learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is directly trained from image regions centered on fixation; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, In the training stage, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps ... we extract image regions centered at the sampled fixation or non-fixation locations as the inputs of our Mr-CNN, together with their corresponding binary classification labels; Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels … By using a Mr-CNN, we implement the learning of early features, bottom-up saliency, top-down factors, and their integration from image data; Liu, page 365, Section 3.1, Datasets); and
predicting a scale and location of an object within the image based on the synthetic saliency map (Liu, page 364, Figure 1, Saliency map, the given image is rescaled to three scales, i.e. 150×150, 250×250 and 400×400 ... When testing, we just evenly sample 50×50 locations per image to estimate their saliency values to reduce computation cost. The obtained down-sampled saliency map is rescaled to the original size to achieve the final saliency map).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wong to include the convolutional neural network as taught by Liu as an alternate means to learn visual features from image data in order to predict eye fixations (Liu, Abstract).  The motivation would be to provide an alternate means to learn visual features from raw image data in order to predict eye fixations.
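For illustration only, and not as a characterization of Wong, Liu, or the applicant's disclosure, the step of generating a binary image from a ground truth bounding box recited in claim 1 may be sketched as follows. The function name, the (x_min, y_min, x_max, y_max) box convention with exclusive maximum edges, and the reading of "binary image" as a 0/1 mask are all assumptions of this sketch.

```python
def binary_image_from_box(height, width, box):
    """Return a height x width mask that is 1 inside `box` and 0 elsewhere.

    `box` is (x_min, y_min, x_max, y_max) in pixel coordinates with the
    maximum edges exclusive. This convention is an assumption of the sketch.
    """
    x_min, y_min, x_max, y_max = box
    # Clamp the box to the image bounds so an oversized box does not fail.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    return [
        [1 if (y_min <= y < y_max and x_min <= x < x_max) else 0
         for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 4 x 6 image with a box covering columns 1-3 and rows 1-2.
mask = binary_image_from_box(4, 6, (1, 1, 4, 3))
```

Under these assumptions, the mask is nonzero only at pixels inside the box; it is the kind of input that the claim recites as being provided to the neural network.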

As per claim 4, Wong and Liu disclose the method of claim 1, further comprising applying a Gaussian blur to the binary image and storing a low resolution version of the binary image as a label for the synthetic saliency map (Liu, page 364, Figure 1, Density Map, Saliency Map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, the three scales used to rescale the input image are empirically chosen as 150×150, 250×250, and 400×400 … applying Gaussian blur on the raw eye fixation point maps … the obtained down-sampled saliency map is rescaled to the original size of the testing image to achieve the final saliency map; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model).  The motivation would be the same as above in claim 1.
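For illustration only, the operations recited in claim 4 (applying a Gaussian blur to a binary image and keeping a low-resolution version as a label) may be sketched as follows. This is a minimal sketch and is not taken from Liu or from the application: the kernel radius, the sigma value, the replicated-border handling, and the block-averaging downsample are assumptions, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with `kernel`, replicating edge pixels at borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)  # clamp index
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma=1.0, radius=2):
    """Separable Gaussian blur: a horizontal pass, then a vertical pass."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def downsample(image, factor):
    """Average non-overlapping factor x factor blocks (assumed method)."""
    h, w = len(image) // factor, len(image[0]) // factor
    area = float(factor * factor)
    return [
        [sum(image[y * factor + dy][x * factor + dx]
             for dy in range(factor) for dx in range(factor)) / area
         for x in range(w)]
        for y in range(h)
    ]


# Hypothetical 8 x 8 binary image with a 4 x 4 box in the middle.
binary = [[1 if 2 <= y < 6 and 2 <= x < 6 else 0 for x in range(8)]
          for y in range(8)]
label = downsample(gaussian_blur(binary), 2)  # 4 x 4 low-resolution label
```

In this sketch the label is a smooth map that peaks over the box and decays toward the image corners, and the downsampling halves each dimension.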

As per claim 5, Wong and Liu disclose the method of claim 4, wherein providing the binary image to the neural network further comprises providing the label for the synthetic saliency map to the neural network (Liu, Abstract, learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is directly trained from image regions centered on fixation; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, In the training stage, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps ... we extract image regions centered at the sampled fixation or non-fixation locations as the inputs of our Mr-CNN, together with their corresponding binary classification labels; Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels … By using a Mr-CNN, we implement the learning of early features, bottom-up saliency, top-down factors, and their integration from image data; Liu, page 365, Section 3.1, Datasets). The motivation would be the same as above in claim 1.

As per claim 6, Wong and Liu disclose the method of claim 1, wherein the synthetic saliency map mimics human perception for object detection (Liu, pages 362-363, Introduction, saliency information … for predicting eye fixations … the proposed Mr-CNN can learn both low-level features related to bottom-up saliency and high-level top-down factors to improve eye fixation prediction). The motivation would be the same as above in claim 1.

As per claim 9, Wong and Liu disclose the method of claim 1, further comprising generating a low-resolution version of the synthetic saliency map and storing the low-resolution version of the synthetic saliency map as a label for the synthetic saliency map (Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, the three scales used to rescale the input image are empirically chosen as 150×150, 250×250, and 400×400 … applying Gaussian blur on the raw eye fixation point maps … the obtained down-sampled saliency map is rescaled to the original size of the testing image to achieve the final saliency map; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model), and wherein predicting the scale and the location of the object within the image comprises predicting based on the low-resolution version of the synthetic saliency map to reduce computational load (Liu, page 364, Figure 1, Saliency map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, applying Gaussian blur on the raw eye fixation point maps; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model). The motivation would be the same as above in claim 1.

As per claim 10, Wong and Liu disclose the method of claim 1, further comprising testing the neural network by providing a sensor data frame to the neural network for the neural network to determine a classification, location, or orientation of an object of interest within the sensor data frame (Liu, page 363, 1. Introduction, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs; Liu, page 364, Figure 1, When testing, we just evenly sample 50×50 locations per image to estimate their saliency values to reduce computation cost. The obtained down-sampled saliency map; Liu, page 365, 2.2. Saliency detection using Mr-CNN, When testing, to reduce computation cost, we sample 2500 locations for each testing image as center locations to extract image regions, which is implemented by evenly sampling 50 locations along each side of the testing image. Then the activation value of the last layer in the Mr-CNN is obtained as the saliency value of each location to form the down-sampled saliency map). The motivation would be the same as above in claim 1.
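For illustration only, the test-time sampling quoted from Liu (50 evenly spaced locations along each side of the image, giving 50 × 50 = 2500 center locations) may be sketched as follows. The placement of each sample at the center of its grid cell and the function name are assumptions of this sketch; the quoted passages do not specify the exact spacing.

```python
def sample_grid(height, width, per_side=50):
    """Return per_side * per_side (y, x) centers spread evenly over the image.

    Each center is placed in the middle of its grid cell, which is an
    assumed spacing rule rather than one stated in the cited reference.
    """
    ys = [int((i + 0.5) * height / per_side) for i in range(per_side)]
    xs = [int((j + 0.5) * width / per_side) for j in range(per_side)]
    return [(y, x) for y in ys for x in xs]


# Hypothetical 400 x 400 test image.
centers = sample_grid(400, 400)
```

Evaluating the network once per center, rather than once per pixel, would yield a 50 × 50 down-sampled map that can then be rescaled to the original image size, as the quoted passage describes.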

As per claim 11, Wong discloses a processor that is programmable to execute instructions stored in non-transitory computer readable storage media, the instructions comprising 
identifying a ground truth bounding box corresponding to an object of interest in an image (Wong, page 998, Figure 1(a); Wong, page 998, Section 3. Our Approach, determine a set of salient locations L); 
generating a binary image comprising image data within the ground truth bounding box (Wong, page 998, Figure 1(b)).
Wong further discloses computing a saliency map and salient locations (Wong, page 998, 4. Salient Regions Extraction, For each image, we compute the saliency map and the salient locations using Itti’s visual saliency model [4], which is built upon a biologically plausible architecture that exploits multi-scaled intensity, color and orientation image features and learnt the salient locations using a Winner-Take-All (WTA) neural network framework), but does not explicitly disclose the following limitations as further recited; however, Liu discloses:
providing the binary image to a neural network configured to output a synthetic saliency map (Liu, Abstract, learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is directly trained from image regions centered on fixation; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, In the training stage, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps ... we extract image regions centered at the sampled fixation or non-fixation locations as the inputs of our Mr-CNN, together with their corresponding binary classification labels; Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels … By using a Mr-CNN, we implement the learning of early features, bottom-up saliency, top-down factors, and their integration from image data; Liu, page 365, Section 3.1, Datasets); and 
predicting a scale and location of an object within the image based on the synthetic saliency map (Liu, page 364, Figure 1, Saliency map, the given image is rescaled to three scales, i.e. 150×150, 250×250 and 400×400 ... When testing, we just evenly sample 50×50 locations per image to estimate their saliency values to reduce computation cost. The obtained down-sampled saliency map is rescaled to the original size to achieve the final saliency map).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wong to include the convolutional neural network as taught by Liu as an alternate means to learn visual features from image data in order to predict eye fixations (Liu, Abstract).  The motivation would be to provide an alternate means to learn visual features from raw image data in order to predict eye fixations.

As per claim 14, Wong and Liu disclose the processor of claim 11, wherein the instructions further comprise applying a Gaussian blur to the binary image and storing a low resolution version of the binary image as a label for the synthetic saliency map (Liu, page 364, Figure 1, Density Map, Saliency Map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, the three scales used to rescale the input image are empirically chosen as 150×150, 250×250, and 400×400 … applying Gaussian blur on the raw eye fixation point maps … the obtained down-sampled saliency map is rescaled to the original size of the testing image to achieve the final saliency map; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model).  The motivation would be the same as above in claim 11.

As per claim 15, Wong and Liu disclose the processor of claim 14, wherein the instructions are such that providing the binary image to the neural network further comprises providing the label for the synthetic saliency map to the neural network (Liu, Abstract, learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is directly trained from image regions centered on fixation; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, In the training stage, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps ... we extract image regions centered at the sampled fixation or non-fixation locations as the inputs of our Mr-CNN, together with their corresponding binary classification labels; Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels … By using a Mr-CNN, we implement the learning of early features, bottom-up saliency, top-down factors, and their integration from image data; Liu, page 365, Section 3.1, Datasets).  The motivation would be the same as above in claim 11.

As per claim 16, Wong discloses non-transitory computer readable storage media storing instructions to be executed by one or more processors, the instructions comprising: 
identifying a ground truth bounding box corresponding to an object of interest in an image (Wong, page 998, Figure 1(a); Wong, page 998, Section 3. Our Approach, determine a set of salient locations L); 
generating a binary image comprising image data within the ground truth bounding box (Wong, page 998, Figure 1(b)).
Wong further discloses computing a saliency map and salient locations (Wong, page 998, 4. Salient Regions Extraction, For each image, we compute the saliency map and the salient locations using Itti’s visual saliency model [4], which is built upon a biologically plausible architecture that exploits multi-scaled intensity, color and orientation image features and learnt the salient locations using a Winner-Take-All (WTA) neural network framework), but does not explicitly disclose the following limitations as further recited; however, Liu discloses:
providing the binary image to a neural network configured to output a synthetic saliency map (Liu, Abstract, learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is directly trained from image regions centered on fixation; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, In the training stage, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps ... we extract image regions centered at the sampled fixation or non-fixation locations as the inputs of our Mr-CNN, together with their corresponding binary classification labels; Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels … By using a Mr-CNN, we implement the learning of early features, bottom-up saliency, top-down factors, and their integration from image data; Liu, page 365, Section 3.1, Datasets); and 
predicting a scale and location of an object within the image based on the synthetic saliency map (Liu, page 364, Figure 1, Saliency map, the given image is rescaled to three scales, i.e. 150×150, 250×250 and 400×400 ... When testing, we just evenly sample 50×50 locations per image to estimate their saliency values to reduce computation cost. The obtained down-sampled saliency map is rescaled to the original size to achieve the final saliency map).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wong to include the convolutional neural network as taught by Liu as an alternate means to learn visual features from image data in order to predict eye fixations (Liu, Abstract).  The motivation would be to provide an alternate means to learn visual features from raw image data in order to predict eye fixations.

As per claim 19, Wong and Liu disclose the non-transitory computer readable storage media of claim 16, wherein the instructions further comprise generating a low-resolution version of the synthetic saliency map and storing the low-resolution version of the synthetic saliency map as a label for the synthetic saliency map (Liu, page 363, Introduction, a multiresolution convolutional neural network (Mr-CNN) which simultaneously learns early features, bottom-up saliency, top-down factors, and their integration from raw image data. To be specific, as shown in Figure 1 and Figure 2, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, the three scales used to rescale the input image are empirically chosen as 150×150, 250×250, and 400×400 … applying Gaussian blur on the raw eye fixation point maps … the obtained down-sampled saliency map is rescaled to the original size of the testing image to achieve the final saliency map; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model), and wherein the instructions are such that predicting the scale and the location of the object within the image comprises predicting based on the low-resolution version of the synthetic saliency map to reduce computational load (Liu, page 364, Figure 1, Saliency map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, applying Gaussian blur on the raw eye fixation point maps; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model).  The motivation would be the same as above in claim 16.

As per claim 20, Wong and Liu disclose the non-transitory computer readable storage media of claim 16, wherein the instructions further comprise testing the neural network by providing a sensor data frame to the neural network for the neural network to determine a classification, location, or orientation of an object of interest within the sensor data frame (Liu, page 363, 1. Introduction, we train a Mr-CNN directly from image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs; Liu, page 364, Figure 1, When testing, we just evenly sample 50×50 locations per image to estimate their saliency values to reduce computation cost. The obtained down-sampled saliency map; Liu, page 365, 2.2. Saliency detection using Mr-CNN, When testing, to reduce computation cost, we sample 2500 locations for each testing image as center locations to extract image regions, which is implemented by evenly sampling 50 locations along each side of the testing image. Then the activation value of the last layer in the Mr-CNN is obtained as the saliency value of each location to form the down-sampled saliency map). The motivation would be the same as above in claim 16.


Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wong, Lai-Kuan, and Kok-Lim Low. "Saliency-enhanced image aesthetics class prediction." Image Processing (ICIP), 2009 16th IEEE International Conference on. IEEE, 2009, hereinafter, “Wong”, in view of Liu, Nian, et al. "Predicting eye fixations using convolutional neural networks." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015, hereinafter, “Liu”, as applied to claims 1 and 16 above, and further in view of Rajashekar, Umesh, et al. "GAFFE: A gaze-attentive fixation finding engine." IEEE transactions on image processing 17.4 (2008): 564-573, hereinafter, “Rajashekar”.

As per claim 8, Wong and Liu disclose the method of claim 1, further comprising: 
applying a Gaussian blur to the binary image (Liu, page 364, Figure 1, Density Map, Saliency Map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, we randomly sample fixation and non-fixation locations based on the saliency values in the ground truth density maps which are generated by applying Gaussian blur on the raw eye fixation point maps); 
storing a low resolution version of the binary image as a label for the image (Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, the three scales used to rescale the input image are empirically chosen as 150×150, 250×250, and 400×400 … applying Gaussian blur on the raw eye fixation point maps … the obtained down-sampled saliency map is rescaled to the original size of the testing image to achieve the final saliency map; Liu, page 369, Section 3.6. Network structure analysis, three resolutions are used in our model); and
fitting the Gaussian blur to improve prediction of the scale and the location of the object within the image (Liu, page 364, Figure 1, Saliency map; Liu, page 365, Section 2.2. Saliency detection using Mr-CNN, applying Gaussian blur on the raw eye fixation point maps).
Wong and Liu do not explicitly disclose the following limitation as further recited; however, Rajashekar discloses:
fitting the blur with ellipses to improve prediction of the location of the object within the image (Rajashekar, page 569-570, Section A. Qualitative Comparison of Fixation Selections, Ten clusters with the maximum density of fixations are shown as ellipses in Fig. 4. The fixation selection algorithm was used to select a sequence of ten fixations, each of which was represented by a 2-D Gaussian window, illustrated by the bright regions in Fig. 4; Rajashekar, page 570, Figure 4).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to include the ellipse blur areas of Rajashekar in the system of Wong and Liu in order to provide an alternate means to clearly illustrate the selected fixation areas for visualization purposes (Rajashekar, page 569, Section A. Qualitative Comparison of Fixation Selections).  The motivation would be to clearly illustrate the selected area for visualization purposes.

Regarding claim 18:
The reasoning given in the rejection of claim 8 applies, mutatis mutandis, to the subject matter of claim 18, which is therefore rejected on the same grounds as claim 8.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1, 2, 4, 6-12 and 14-20 of the current application are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-4 and 7-20 of U.S. Patent No. 11,087,186.  Claims 1, 2, 4-6, 8-12, 14-16 and 18-20 of the current application are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-18 of U.S. Patent No. 10,489,691.  Although the claims at issue are not identical, they are not patentably distinct from each other because claims 1, 2, 4, 6-12 and 14-20 of the current application are anticipated by claims 1-4 and 7-20 of U.S. Patent No. 11,087,186, in that claims 1-4 and 7-20 of U.S. Patent No. 11,087,186 contain all the limitations of claims 1, 2, 4, 6-12 and 14-20 of the current application; the claims of the current application are therefore not patentably distinct from claims 1-4 and 7-20 of U.S. Patent No. 11,087,186.
Similarly, claims 1, 2, 4-6, 8-12, 14-16 and 18-20 of the current application are anticipated by claims 1-18 of U.S. Patent No. 10,489,691, in that claims 1-18 of U.S. Patent No. 10,489,691 contain all the limitations of claims 1, 2, 4-6, 8-12, 14-16 and 18-20 of the current application; the claims of the current application are therefore not patentably distinct from claims 1-18 of U.S. Patent No. 10,489,691.
Claims 1, 2, 4, 6-12 and 14-20 of the current application recite limitations similar to those of claims 1-4 and 7-20 of U.S. Patent No. 11,087,186, as follows:
Current Application No. 17/371,866 | U.S. Patent No. 11,087,186
1. A method comprising: identifying a ground truth bounding box corresponding to an object of interest in an image; generating a binary image comprising image data within the ground truth bounding box; providing the binary image to a neural network configured to output a synthetic saliency map; and predicting a scale and location of an object within the image based on the synthetic saliency map. 2. The method of claim 1, further comprising executing a randomization algorithm to generate an intermediate image comprising one or more random points within a region corresponding to the ground truth bounding box of the image, wherein a quantity of random points is determined based on a size of the ground truth bounding box. | 1. A method comprising: identifying a ground truth bounding box corresponding to an object of interest in a first image; executing a randomization algorithm to generate an intermediate image comprising one or more random points within a region corresponding to the ground truth bounding box of the first image, wherein a quantity of random points is determined based on a size of the ground truth bounding box; generating a blurred intermediate image by applying a blur to each of the one or more random points in the intermediate image; and storing the blurred intermediate image as a label image for the first image. 2. The method of claim 1, further comprising pairing the first image and the label image and feeding the pair of images to a modified deep neural network configured to output a synthetic saliency map to predict a location of an object within any image. 3. The method of claim 1, wherein the blur applied to each of the one or more random points in the intermediate image comprises an ellipses shaped blur configured to predict a scale and location of the object of interest.
4. The method of claim 1, further comprising applying a Gaussian blur to the binary image and storing a low resolution version of the binary image as a label for the synthetic saliency map. 9. The method of claim 1, further comprising generating a low-resolution version of the synthetic saliency map and storing the low-resolution version of the synthetic saliency map as a label for the synthetic saliency map, and wherein predicting the scale and the location of the object within the image comprises predicting based on the low-resolution version of the synthetic saliency map to reduce computational load. | 4. The method of claim 1, further comprising downsizing the blurred intermediate image to generate a low-resolution version of the blurred intermediate image, and wherein storing the blurred intermediate image as the label image comprises storing the low-resolution version of the blurred intermediate image. 9. The method of claim 1, wherein generating the blurred intermediate image comprises applying a Gaussian blur to the intermediate image.
6. The method of claim 1, wherein the synthetic saliency map mimics human perception for object detection. | 7. The method of claim 6, wherein the synthetic saliency map mimics human perception for object detection.
7. The method of claim 1, wherein generating the binary image comprises applying bright pixel values or dark pixel values to random points within the ground truth bounding box, and wherein the method further comprises calculating how many random points to generate within the ground truth bounding box based on a size of the ground truth bounding box. | 8. The method of claim 1, further comprising calculating the quantity of random points to generate within the ground truth bounding box based on the size of the ground truth bounding box.
8. The method of claim 1, further comprising: applying a Gaussian blur to the binary image; storing a low resolution version of the binary image as a label for the image; fitting the Gaussian blur with ellipses to improve prediction of the scale and the location of the object within the image. | 3. The method of claim 1, wherein the blur applied to each of the one or more random points in the intermediate image comprises an ellipses shaped blur configured to predict a scale and location of the object of interest.


Claim 6 of the current application corresponds to claims 7, 13 and 19 of U.S. Patent No. 11,087,186. Claims 10 and 20 of the current application correspond to claim 12 of U.S. Patent No. 11,087,186. Claims 11, 12, 16 and 17 of the current application correspond to claims 10, 15, 16 and 20 of U.S. Patent No. 11,087,186.  Claims 14, 18 and 19 of the current application correspond to claims 9 and 18 of U.S. Patent No. 11,087,186.  Claim 15 of the current application corresponds to claim 17 of U.S. Patent No. 11,087,186.
Similarly, claims 1, 4, 8-10, 11, 14, 16 and 18-20 of the current application correspond to independent claims 1, 8 and 14 of U.S. Patent No. 10,489,691.  Claims 2 and 12 of the current application correspond to claims 7, 13 and 18 of U.S. Patent No. 10,489,691. Claims 5 and 15 of the current application correspond to claims 2, 9 and 15 of U.S. Patent No. 10,489,691. Claim 6 of the current application corresponds to claims 6, 11 and 17 of U.S. Patent No. 10,489,691. Claims 8 and 18 of the current application correspond to claims 3, 12 and 16 of U.S. Patent No. 10,489,691.  Claims 10 and 20 of the current application correspond to claim 10 of U.S. Patent No. 10,489,691. Claims 5 and 15 of the current application correspond to claims 4 and 5 of U.S. Patent No. 10,489,691.
The table above shows that, although the corresponding claims are not identical, claims 1, 2, 4-12 and 14-20 of the current application are not patentably distinct from the reference claims because the claims of the current application would be anticipated by the reference claims.



Allowable Subject Matter
Claims 2, 3, 7, 12, 13 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if the double patenting rejection were overcome and if they were rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: Claims 2, 3, 7, 12, 13 and 17 would be allowable if the double patenting rejection were overcome and if the claims were rewritten in independent form including all of the limitations of the base claim and any intervening claims. As enumerated above, the prior art discloses various methods of determining and predicting eye fixations and salient regions in images; however, the prior art does not disclose at least the limitation “executing a randomization algorithm to generate an intermediate image comprising one or more random points within a region corresponding to the ground truth bounding box of the image, wherein a quantity of random points is determined based on a size of the ground truth bounding box.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY MANGIALASCHI whose telephone number is (571)270-5189. The examiner can normally be reached M-F, 9:30AM TO 6:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/TRACY MANGIALASCHI/Examiner, Art Unit 2668               
/VU LE/Supervisory Patent Examiner, Art Unit 2668