Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claim 3 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 3 recites obtaining a “distribution region” and utilizing distribution regions in Lines 5-7 of Claim 3. The specification does not sufficiently describe what distribution regions are or how they are obtained within the scope of the claimed invention. The specification mentions distribution regions only when discussing the integration of partial images/regions, with no further description of what distribution regions are or how they are obtained. In the image analysis art, “distribution region” could refer to a variety of concepts, including but not limited to probability distributions and medical imaging distributions. Such a lack of description would not enable one having ordinary skill in the art to ascertain the manner of using the claimed invention.
For prior art purposes, a distribution region will be interpreted as a region of the input image containing the set of regional images and default frames as described in Claim 1.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 9 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 9 recites the limitations “the approximation” and “the approximation threshold” in Lines 6-7 and Line 8, respectively. There is insufficient antecedent basis for these limitations in the claim.
 
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3-5, and 7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Liu et al. (US 2019/0205643 A1; hereafter: Liu).
Regarding Claim 1, Liu teaches: a method for detecting an object image using a convolutional neural network (¶6: “Various embodiments are disclosed for simultaneous object localization and attribute classification using multitask deep neural networks.”), the steps include: providing an input image to a host by an image capture unit, the input image including at least one detected object image and one background image (¶7: “obtaining, by a processing circuit, an image from an image capture device in an environment, the image including a target object in the environment”); converting the input image into a plurality of characteristic values (¶36: “As the input image 303 goes through the layers of the network trunk 401, spatial resolution of the input image 303 decreases progressively, and produces a set of intermediate feature maps of different spatial sizes. Each of these feature maps summarizes the salient visual patterns in the raw input image 303, and can be utilized as a middle-level feature representation for inferring the high-level semantic inferences 305-307. Multiple feature maps of different sizes are chosen as the feature representations, which allows multitask deep neural network 304 to achieve scale-invariant inferences.”; conversion of the input image to characteristic values is functionally the same as generating feature maps of raw input image data, and feature maps are construed as characteristic values) and comparing the characteristic values with a plurality of convolution kernels to obtain at least one partial or full object image corresponding to some of the characteristic values by using the host, the convolution kernels containing the characteristic values of plural partial images and the adjacent background image in at least one object image (¶39: “A 3x3xp small kernel is applied to produce the shape offset relative to the default box coordinates of the anchor boxes 505a-505c. For each anchor box 505a-505c at each cell of the feature map 503, 4 offsets relative to the original shape of anchor box 502 are computed.”); capturing at least one regional image according to the region where the characteristic values corresponding to the partial or full detected object image and generating at least one default frame based on the edge of at least one regional image and overlapping the default frame on the input image, by using the host (¶39: “A 3x3xp small kernel is applied to produce the shape offset relative to the default box coordinates of the anchor boxes 505a-505c. For each anchor box 505a-505c at each cell of the feature map 503, 4 offsets relative to the original shape of anchor box 502 are computed.”; Figure 5B, elements 505a-505c show the generation of anchor boxes (interpreted as the default frames of the claimed invention), based on the kernels applied to the feature map); capturing and comparing a first center point of the default frames with a second center point of a boundary frame on the input image to obtain a center offset between the default frame and the boundary frame by using the host (¶39: “For each anchor box 505a-505c at each cell of the feature map 503, 4 offsets relative to the original shape of anchor box 502 are computed.”; original anchor box 502 is the ground truth box and is being interpreted as the boundary frame of the claimed invention); performing a regression operation according to the center offset to position the object image in the default frame on the input image by using the host (¶39: “The bounding box regression branch 403 regresses the shape offsets of a set of predefined default anchor boxes with respect to a ground truth box.”); comparing the object image with at least one sample image to produce a comparison result by using the host (¶34: “The offline training is performed with a large set of manually annotated training data. During training, the multitask deep neural network 304 receives as input: (i) the input image 303; (ii) ground truth bounding boxes for each target; (iii) class labels for each target; and (iv) attribute labels for each target”; training a neural network requires training data or sample images for accurate classification; the training data function as the sample image for the comparison done by the neural network); and classifying the input image as a target object image or a non-target object image according to the comparison result by using the host (¶6: “a multitask deep neural network is used to localize targets of interest with bounding boxes, classify the targets into semantic categories and simultaneously predict attribute labels for each target.”).
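For illustration only, the center-offset comparison mapped above for Claim 1 can be sketched as follows. This is a minimal sketch and is not taken from Liu or from the present application: the (x_min, y_min, x_max, y_max) box format, the function names, and the sample coordinates are all assumptions introduced here.

```python
from typing import Tuple

# Assumed box format (not specified by the references): corner coordinates.
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point (cx, cy) of a box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def center_offset(default_frame: Box, boundary_frame: Box) -> Tuple[float, float]:
    """Offset that moves the default frame's center onto the boundary frame's center."""
    dcx, dcy = box_center(default_frame)
    bcx, bcy = box_center(boundary_frame)
    return (bcx - dcx, bcy - dcy)


# Hypothetical frames: a default (anchor) frame and a boundary (ground truth) frame.
offset = center_offset((10.0, 10.0, 50.0, 30.0), (20.0, 14.0, 60.0, 42.0))
print(offset)
```

In this sketch the offset is simply the difference of the two center points; a regression branch such as the one quoted from ¶39 of Liu would learn to predict such offsets rather than compute them from a known ground truth box.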
Regarding Claim 3, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein the step of capturing at least one region image according to the region where the characteristic values corresponding to the partial or full detected object image, the host integrates the regions where the characteristic values are located, obtains at least one distribution region of the input image, and establishes the default frame with at least one distribution region (Figure 5: element 504; Element 504 of Figure 5 is a cell of the feature map. The anchor boxes 505a-505c of Liu are generated based on the cell of the feature maps and the ground truth box. The set of cells that corresponds to the ground truth box can be construed as the distribution region disclosed by the present invention and the anchor boxes can be construed as the default frames being generated based on the distribution region).
Regarding Claim 4, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein the boundary frame corresponds to the input image, and the default frame corresponds to the detected object image (Figure 5B: element 502 corresponds to the boundary frame and Figure 5B: elements 505a-505c correspond to the default frames).
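For illustration only, one way to read "integrating the regions where the characteristic values are located" under the interpretation adopted above for prior art purposes is to take the smallest rectangle enclosing a set of regions. This is a sketch of that assumed reading, not a description of the specification or of Liu; the function name and the (x_min, y_min, x_max, y_max) format are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed region format: corner coordinates (x_min, y_min, x_max, y_max).
Region = Tuple[float, float, float, float]


def integrate_regions(regions: Sequence[Region]) -> Region:
    """Return the smallest axis-aligned rectangle enclosing every region."""
    if not regions:
        raise ValueError("at least one region is required")
    return (
        min(r[0] for r in regions),
        min(r[1] for r in regions),
        max(r[2] for r in regions),
        max(r[3] for r in regions),
    )


# Two hypothetical overlapping regions merged into one enclosing region.
merged = integrate_regions([(2.0, 3.0, 6.0, 8.0), (5.0, 1.0, 9.0, 7.0)])
print(merged)
```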
Regarding Claim 5, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein in the step of converting the input image into a plurality of characteristic values and comparing the characteristic values with a plurality of convolution kernels to obtain at least one partial or full object image corresponding to some of the characteristic values by using the host, the host convolutes each pixel of input image according to a single shot multibox detector model to detect the characteristic values (¶30 discloses the methodology for single shot multibox detection and refers to a publication for the methodology).
Regarding Claim 7, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein in the step of comparing the detected object image with at least one sample image by using the host, the host performs classified comparison at a fully connected layer (¶6: “a multitask deep neural network is used to localize targets of interest with bounding boxes, classify the targets into semantic categories and simultaneously predict attribute labels for each target.”; ¶35: “The building blocks 406 of multitask deep neural network 304, hereinafter all referred to as “layers”, perform operations of convolution, pooling or non-linear activation.”; fully connected layers are a common component of neural networks built for classification purposes and would have been inherent in the layers disclosed by Liu).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Liu as applied to claims above, and further in view of Sundaresan et al. (US 2019/0114804 A1; hereafter: Sundaresan).
Regarding Claim 2, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein the step of converting the input image into a plurality of characteristic values and comparing characteristic values with a plurality of convolution kernels to obtain at least one partial or full object image corresponding to some of the characteristic values by using the host, the host sets the detection boundary of convolution cores to 3x3xp (¶39: “A 3x3xp small kernel is applied to produce the shape offset relative to the default box coordinates of the anchor boxes 505a-505c.”) but does not explicitly teach that the host normalizes a plurality of pixel values of the input image to a plurality of pixel normal values and obtains the characteristic values in a convolution layer by having the convolution kernels multiply the pixel normal values.
In a related art, Sundaresan teaches: normalizes a plurality of pixel values of input image to a plurality of pixel normal values (¶54: “a deep learning neural network can be used to determine whether an object in an image or video frame is a person. In such an example, nodes in an input layer of the network can include normalized values for pixels of an image (e.g., with one node representing one normalized pixel value), nodes in a hidden layer can be used to determine whether certain common features of a person are present”); the host obtains the characteristic values in a convolution layer by having the convolution kernels multiplying the pixel normal values (¶123: “The convolution nature of the convolutional hidden layer 622a is due to each node of the convolutional layer being applied to its corresponding receptive field. For example, a filter of the convolutional hidden layer 622a can begin in the top-left corner of the input image and can convolve around the input image.”) for more accurate detection and classification of an object with normalized values.
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified Liu with the above teachings of Sundaresan to incorporate the normalization of image pixels. The motivation in doing so would lie in more accurate detection and classification of objects.
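For illustration only, the combination discussed for Claim 2 (normalizing pixel values, then multiplying them by a convolution kernel) can be sketched as follows. This is a minimal single-channel sketch under stated assumptions and does not reproduce the implementation of Liu or Sundaresan: division by 255 as the normalization, the "valid" sliding window with no padding, and the function names are all assumptions introduced here.

```python
from typing import List

Image = List[List[float]]


def normalize_pixels(image: Image, max_value: float = 255.0) -> Image:
    """Scale raw pixel values into [0, 1] (assumes 8-bit input by default)."""
    return [[pixel / max_value for pixel in row] for row in image]


def convolve_valid(image: Image, kernel: Image) -> Image:
    """Slide the kernel over the image without padding.

    At each position, the kernel weights are multiplied element-wise with the
    underlying pixel values and summed (no kernel flip, as is customary in
    convolutional neural networks).
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output: Image = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for u in range(k_h):
                for v in range(k_w):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical 3x3 image of saturated pixels and a 3x3 kernel of ones.
raw = [[255.0] * 3 for _ in range(3)]
ones = [[1.0] * 3 for _ in range(3)]
features = convolve_valid(normalize_pixels(raw), ones)
print(features)
```

The sketch uses a 3x3 kernel over one channel; the 3x3xp kernel quoted from ¶39 of Liu would additionally sum over p feature channels.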
Regarding Claim 8, Liu, in view of Sundaresan, further teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein in the step of classifying the input image as a target object or a non-target object image based on a comparison result, when the host fails to identify the object image in the default frame that matches at least one sample image, the host classifies the input image as the non-target object image, else, the host classifies the input image as the target object image (Sundaresan: ¶54: “Based on the predetermined high-level features, the deep learning network can classify the object as being a person or not (e.g., based on a probability of the object being a person relative to a threshold value)”; person can be construed as a target object and a non-target object can be construed as objects not identified as a person and failure to meet the threshold value would classify the detected object as a non-target object) for explicitly classifying detected objects as target objects and non-target objects.
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have further modified Liu, in view of Sundaresan, with the above teachings of Sundaresan to incorporate the classification of a detected object as a target object or a non-target object. The motivation in doing so would lie in a clearer classification of an intended target object and a non-target object.
Regarding Claim 9, Liu, in view of Sundaresan, teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein in the step of classifying the input image as a target object image or a non-target object image according to a comparison result, when the host classifies the input image as the non-target object image, the host secondly compares at least one sample image with the object image; when the host judges that the approximation of one of the detected object image is greater than the approximation threshold, it classifies the input image into the target object image; else, the host classifies the input image into the non-target object image (Sundaresan: ¶54: “Based on the predetermined high-level features, the deep learning network can classify the object as being a person or not (e.g., based on a probability of the object being a person relative to a threshold value)”; person can be construed as a target object and a non-target object can be construed as objects not identified as a person and failure to meet the threshold value would classify the detected object as a non-target object).
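For illustration only, the threshold comparison relied upon from ¶54 of Sundaresan can be sketched as follows. This is a minimal sketch under stated assumptions: the scores stand in for whatever probability or similarity measure a network produces, and the function name, labels, and threshold value are hypothetical and are not drawn from either reference or from the claims.

```python
from typing import Sequence


def classify_by_threshold(scores: Sequence[float], threshold: float) -> str:
    """Label the input 'target' if any detected object's score exceeds the threshold."""
    return "target" if any(score > threshold for score in scores) else "non-target"


# Hypothetical scores for detected objects in one input image.
print(classify_by_threshold([0.2, 0.7], threshold=0.5))
print(classify_by_threshold([0.2, 0.4], threshold=0.5))
```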


Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Liu as applied to claims above, and further in view of Zhang et al. (US 2020/0027207 A1; hereafter: Zhang).
Regarding Claim 6, Liu teaches: the method for detecting an object image with a convolutional neural network of Claim 1, wherein in the step of performing a regression operation based on the center offset, the host performs the regression operation with a first position of the default frame, a second position of the boundary frame (¶39: “The bounding box regression branch 403 regresses the shape offsets of a set of predefined default anchor boxes with respect to a ground truth box.”) but does not explicitly teach including a zooming factor to position the detected object image.
In a related art, Zhang teaches: a regression operation including a zooming factor (¶42: “each bounding box introduced by the RPN 340 (e.g., 342, 344) is processed by a bounding box regression loss module 360, which is configured to refine the center point of the bounding box and its dimensions to better capture the object therein.”; dimensional regression corresponds to the zooming factor of the claimed invention) for refining a bounding frame to better fit and identify objects.
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified Liu with the above teachings of Zhang to incorporate a zooming factor. The motivation in doing so would lie in a better-fitted bounding box for easier detection and classification of objects.
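For illustration only, refining a frame by both a center offset and a scaling of its dimensions can be sketched as follows. This is a minimal sketch under stated assumptions: the multiplicative per-axis scale used to model the "zooming factor," the corner-coordinate box format, and the function name are introduced here and are not taken from Liu, Zhang, or the claims.

```python
from typing import Tuple

# Assumed box format: corner coordinates (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def refine_box(default_frame: Box,
               center_offset: Tuple[float, float],
               scale: Tuple[float, float]) -> Box:
    """Shift the frame's center by center_offset and scale its width and height."""
    x_min, y_min, x_max, y_max = default_frame
    # Move the center point.
    cx = (x_min + x_max) / 2.0 + center_offset[0]
    cy = (y_min + y_max) / 2.0 + center_offset[1]
    # Resize the frame about the new center (assumed multiplicative scaling).
    half_w = (x_max - x_min) * scale[0] / 2.0
    half_h = (y_max - y_min) * scale[1] / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Hypothetical values: shift by (10, 8), keep the width, enlarge the height by 1.5x.
refined = refine_box((10.0, 10.0, 50.0, 30.0), (10.0, 8.0), (1.0, 1.5))
print(refined)
```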

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Shi et al. (US 2021/0082181 A1), Hu (US 2022/0036548 A1), Saliou (US 10,755,142 B2).
M. Liu, J. Jiang and Z. Wang, "Colonic Polyp Detection in Endoscopic Videos With Single Shot Detection Based Deep Convolutional Neural Network," in IEEE Access, vol. 7, pp. 75058-75066, 2019, doi: 10.1109/ACCESS.2019.2921027.
Liu, W. et al. (2016). SSD: Single Shot MultiBox Detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds) Computer Vision – ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol 9905. Springer, Cham. https://doi.org/10.1007/978-3-319-46448-0_2
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JULIUS CHAI whose telephone number is (571) 272-4209. The examiner can normally be reached Monday-Friday, 8:00 AM - 4:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JULIUS CHAI/Examiner, Art Unit 2668
/VU LE/Supervisory Patent Examiner, Art Unit 2668