DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-10 are pending.


Claim Interpretation - 35 USC § 112(f)
The following is a quotation of 35 U.S.C. 112(f):
	(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
	An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as "configured to" or "so that"; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:

Claim 10:
“the extraction unit is configured to…”;
“the determination unit is configured to…”;
“the segmentation unit configured to…”; and
“the estimation unit configured to…”.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Examiner’s notes: the corresponding text descriptions of any figure(s) and table(s) cited from the prior art are incorporated herein for further details associated with the examiner’s review comments on the corresponding claims below.

Claim(s) 1-2, 4-5 and 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Qi et al (US2019/0147245) in view of Knittel (US2019/0180149).

Regarding claims 1 and 10, Qi teaches a three-dimensional object detection method based on weighted channel features of a point cloud, comprising:
(Qi, Figs. 1-2)
extracting a target in a two-dimensional image by a pre-trained deep convolutional neural network to obtain a plurality of target objects;
(Qi, Fig. 2, CNN => 2d region proposals; “receiving, at a processor, two-dimensional image data from an optical camera; generating, by the processor, an attention region in the two-dimensional image data, the attention region marking an object of interest”, [0043])
determining a point cloud frustum in a three-dimensional point cloud space corresponding to each target object of the plurality of target objects based on the each target object;
(Qi, Figs. 2 and 8; “receiving, at the processor, three-dimensional depth data from a depth sensor, the depth data corresponding to the image data; extracting, by the processor, a three-dimensional frustum from the depth data corresponding to the attention region”, [0043])
segmenting a point cloud in the point cloud frustum based on a point cloud segmentation network to obtain a point cloud of interest; and
(Qi, Figs. 2 and 8; “3D instance segmentation using PointNet-based network on point clouds in frustums”, [0130]; “3D Instance Segmentation PointNet. The network takes a point cloud in the frustum and predicts a probability score for each point that indicates how likely the point belongs to the object of interest. Note that each frustum contains exactly one object of interest”, [0132])
estimating parameters of a 3D box in the point cloud of interest based on a network with the …
(Qi, Figs. 2 and 8; “The segmentation network predicts the 3D mask of the object of interest (a.k.a. instance segmentation); and the regression network estimates the amodal 3D bounding box (covering the entire object even if only part of it is visible)”, [0118]; “A 3D bounding box is parameterized by its center (c_x, c_y, c_z), size (h, w, l) and heading angle θ (along up-axis)”, [0140]; “the three-dimensional depth data comprises RGB-D data comprising a red-green-blue (RGB) image with the corresponding three-dimensional depth data comprising an image channel in which each pixel relates to a distance between the image plane and the corresponding object in the RGB image”, [0051]; “Given RGB-D data as input, the goal is to classify and localize objects in 3D space… Each object is represented by a class (one among k predefined classes) and an amodal 3D bounding box”, [0125])
	Qi does not expressly disclose but Knittel teaches:
	… a network with the weighted channel features…
(Knittel, “feature channels representing visual properties”, [0045]; “third plurality of feature responses, each of the third plurality of feature responses being generated based on one of the first plurality of feature responses from the first channel and one of the second plurality of feature responses from the second channel, and a weighted combination of associated temporal and spatial position values of the corresponding one of the received first and second plurality of feature responses”, [0007, 0019]; Fig. 7, “The node response value r_0 750 represents a third feature response generated from the input values 710 to 717 corresponding to two channels. The node response value r_0 750 is generated by the attenuation function 745 using the weighted combination of the spatial and temporal location information of the first and second channels using Equations (1) to (8)”, [0094]; Fig. 10, [0046-0048])
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Knittel into the system or method of Qi in order to use different features of a 3D object for accurate object location estimation by combining weighted feature channels such as spatial properties of features. The combination of Qi and Knittel also teaches other enhanced capabilities.
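
For illustration only, the step of determining a point cloud frustum from a 2D target region, as recited above, can be sketched as follows. This is a minimal sketch and not the method of Qi, Knittel, or the present application: it assumes a distortion-free pinhole camera, a point cloud already expressed in the camera frame, and a corner-format 2D box. The function name `points_in_frustum`, the intrinsic matrix `K`, and the sample values are hypothetical.

```python
import numpy as np


def points_in_frustum(points, K, box_2d):
    """Return the 3D points whose image projection falls inside a 2D box.

    points : (N, 3) array in the camera frame, z pointing forward (assumed).
    K      : (3, 3) pinhole intrinsic matrix (assumed, no lens distortion).
    box_2d : (x_min, y_min, x_max, y_max) in pixels (assumed corner format).
    """
    points = np.asarray(points, dtype=float)
    x_min, y_min, x_max, y_max = box_2d

    in_front = points[:, 2] > 0              # keep points in front of the camera
    uvw = points @ K.T                       # homogeneous pixel coordinates
    z = np.where(in_front, uvw[:, 2], 1.0)   # avoid dividing by non-positive depth
    u = uvw[:, 0] / z
    v = uvw[:, 1] / z

    inside = (u >= x_min) & (u <= x_max) & (v >= y_min) & (v <= y_max)
    return points[in_front & inside]


# Hypothetical intrinsics and a three-point cloud.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
cloud = np.array([[0.0, 0.0, 2.0],     # projects to pixel (50, 50): inside the box
                  [1.0, 0.0, 2.0],     # projects to pixel (100, 50): outside the box
                  [0.0, 0.0, -1.0]])   # behind the camera: discarded
frustum_points = points_in_frustum(cloud, K, (40, 40, 60, 60))
```

The frustum is represented here implicitly, as the set of points that project into the 2D box, rather than by explicit bounding planes.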

Regarding claim 2, the combination of Qi and Knittel teaches its/their respective base claim(s).
The combination further teaches the three-dimensional object detection method according to claim 1,
wherein, the plurality of target objects are obtained by the following formula:

	(x, y, h, w) = Net(I)

wherein,
	I represents the two-dimensional image, and
	Net represents the pre-trained deep convolutional neural network; and
	coordinates (x, y) of a center point, a length h and a width w of a 2D box represent a position of the each target object.
(Qi, “A 3D bounding box is parameterized by its center (c_x, c_y, c_z), size (h, w, l) and heading angle θ (along up-axis)”, [0140]; Fig. 2, the box position and size dimensions are determined from a neural network and other connected modules)
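
For illustration only, a 2D box given by a center point (x, y), a length h and a width w, as recited in claim 2, can be converted to corner coordinates as sketched below. The sketch assumes that w spans the horizontal image axis and h the vertical one, which the claim language does not state. The function name `center_box_to_corners` and the sample values are hypothetical.

```python
def center_box_to_corners(x, y, h, w):
    """Convert a center-format 2D box to (x_min, y_min, x_max, y_max).

    Assumption: w spans the horizontal image axis and h the vertical one.
    """
    half_w = w / 2.0
    half_h = h / 2.0
    return (x - half_w, y - half_h, x + half_w, y + half_h)


# Hypothetical box: center (50, 40), length 20, width 10, in pixels.
corners = center_box_to_corners(x=50.0, y=40.0, h=20.0, w=10.0)
```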

Regarding claim 4, the combination of Qi and Knittel teaches its/their respective base claim(s).
The combination further teaches the three-dimensional object detection method according to claim 1, wherein, the step of segmenting the point cloud in the point cloud frustum based on the point cloud segmentation network to obtain the point cloud of interest specifically comprises:
calculating a probability that the point cloud in the point cloud frustum belongs to a point cloud of interest based on the point cloud segmentation network by the following formula:

	p_i = f(x_i, theta)

wherein,
	x_i represents an i_th point cloud in the point cloud frustum,
	theta represents a network training parameter,
	p_i represents a probability that the i_th point cloud x_i belongs to the point cloud of interest, and
	f represents the point cloud segmentation network; and
determining and obtaining the point cloud of interest according to the probability that each point cloud in the point cloud frustum belongs to the point cloud of interest and a predetermined probability threshold.
(Qi, “3D Instance Segmentation PointNet. The network takes a point cloud in the frustum and predicts a probability score for each point that indicates how likely the point belongs to the object of interest”, [0132])

Regarding claim 5, the combination of Qi and Knittel teaches its/their respective base claim(s).
The combination further teaches the three-dimensional object detection method according to claim 4, wherein, the step of determining and obtaining the point cloud of interest according to the probability that the each point cloud in the point cloud frustum belongs to the point cloud of interest and the predetermined probability threshold, specifically comprises:
determining that the point cloud is the point cloud of interest if the probability that the point cloud belongs to the point cloud of interest is greater than 0.5;
determining that the point cloud is not the point cloud of interest if the probability that the point cloud belongs to the point cloud of interest is less than or equal to 0.5:

	Mask_i = 1, if p_i > 0.5
	Mask_i = 0, if p_i <= 0.5

wherein,
	Mask_i represents a mask of the i_th point cloud and takes a value of 1 or 0, and
	p_i represents the probability that the i_th point cloud x_i belongs to the point cloud of interest.
(Qi, Fig. 2; “3D instance segmentation PointNet” block, “Similar to Mask-RCNN that achieves instance segmentation by binary classification of pixels in image regions, the systems, methods, and devices herein implement 3D instance segmentation using PointNet-based network on point clouds in frustums”, [0130])
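
For illustration only, the thresholding recited in claims 4 and 5 (a mask value of 1 when the probability exceeds 0.5, and 0 otherwise) can be sketched as follows. The per-point probabilities are taken as given; the segmentation network f itself is not modeled. The function name `mask_points_of_interest` and the sample values are hypothetical.

```python
import numpy as np


def mask_points_of_interest(points, probs, threshold=0.5):
    """Return (mask, selected_points) for per-point probabilities.

    mask[i] is 1 when probs[i] > threshold and 0 otherwise, so a
    probability exactly equal to the threshold is not selected.
    """
    points = np.asarray(points, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mask = (probs > threshold).astype(int)
    return mask, points[mask == 1]


# Hypothetical frustum points and probabilities p_i.
points = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0]])
probs = np.array([0.9, 0.5, 0.2, 0.51])
mask, points_of_interest = mask_points_of_interest(points, probs)
```

The strict comparison mirrors the claim wording, under which a probability less than or equal to 0.5 yields a mask value of 0.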


Allowable Subject Matter
Claim(s) 3 and 6-9 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claim(s).

The following is a statement of reasons for the indication of allowable subject matter:

Claim(s) 3 and 6-9 recite(s) limitation(s) related to detailed steps of determining a point cloud frustum and determining 3D segmentation loss. There are no explicit teachings of the above limitation(s) found in the prior art cited in the rejection of its/their base claim(s).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIANXUN (JAMES) YANG whose telephone number is (571)272-9874. The examiner can normally be reached on MON-FRI: 8AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached on (571)272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/JIANXUN YANG/
Primary Examiner, Art Unit 2664				6/22/2022