DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 6, 9-13, 15, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Hou et al, US 2021/0158043 in view of Xiang et al, “ThunderNet: A Turbo Unified Network for Real-Time Semantic Segmentation”.
 	Regarding claim 1, Hou discloses a system for object detection and semantic segmentation, the system comprising a computing device, the computing device comprising a processor and a non-volatile memory storing computer executable code, wherein the computer executable code, when executed at the processor (fig. 4A-4B; Abstract; para 0004; a system comprises one or more processors and a memory communicably coupled to the one or more processors. The memory stores a neural network module including instructions that when executed by the one or more processors cause the one or more processors to perform semantic segmentation and object detection on an input image), is configured to: 
(figs. 3B and 4A, element 310; para 0038-0039; an input image is input to semantic segmentation and object detection processes, which are performed by neural network module); 
 	process the image using a neural network backbone (fig. 4A, element 400; para 0047-0048; a neural network) to obtain a feature map (para 0048; a unified panoptic head that is shared by each of the multi-scale feature maps produced by the backbone (ResNet-50-FPN)); 
 	process the feature map using an object detection module to obtain object detection result of the image (fig. 4A, element 415; para 0047-0048; the output (i.e., feature map) of ResNet-50-FPN is input to a set of panoptic heads, whose outputs are fed to object-detection (levelness) elements); and 
 	process the feature map using a semantic segmentation module to obtain semantic segmentation result of the image (fig. 4A, element 420; para 0047-0048; the output of ResNet-50-FPN (i.e., feature map) is input to a set of panoptic heads, whose outputs are fed to semantic-segmentation elements),
 	wherein the object detection module and the semantic segmentation module are trained using a same loss function comprising an object detection component and a semantic segmentation component (figs. 4A-4B, element 410; a set of panoptic heads (i.e., an object detection component and a semantic segmentation component); para 0047-0048; a unified panoptic head that is shared by each of the multi-scale feature maps produced by the backbone (ResNet-50-FPN). On each feature map, two feature towers, localization tower and semantics tower, are applied).
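 	Hou discloses claim 1 as enumerated above, but Hou does not explicitly disclose a ResNet18 backbone truncated away from its 4th block as claimed.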
 	However, Xiang discloses a ResNet18 backbone truncated away from its 4th block (page 1792, fig. 3).
 	Therefore, taking the combined disclosures of Hou and Xiang as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a ResNet18 backbone truncated away from its 4th block as taught by Xiang into the invention of Hou for the benefit of fast and efficient network for semantic segmentation (page 1794, Section 5. Conclusion).
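 	For illustration only, the claim 1 limitation of training both modules with "a same loss function comprising an object detection component and a semantic segmentation component" can be sketched as a weighted sum of two terms. This sketch is not taken from Hou, Xiang, or the application: the function names, the L1 box term, the cross-entropy term, and the weights `w_det` and `w_seg` are all assumptions.

```python
import math

def detection_loss(pred_boxes, true_boxes):
    """Assumed detection term: mean absolute error over (x1, y1, x2, y2)."""
    total = sum(abs(p - t)
                for pb, tb in zip(pred_boxes, true_boxes)
                for p, t in zip(pb, tb))
    return total / (4 * len(true_boxes))

def segmentation_loss(pred_probs, true_labels):
    """Assumed segmentation term: mean per-pixel cross-entropy.

    pred_probs[i] is the list of class probabilities for pixel i.
    """
    return -sum(math.log(p[y]) for p, y in zip(pred_probs, true_labels)) / len(true_labels)

def joint_loss(pred_boxes, true_boxes, pred_probs, true_labels, w_det=1.0, w_seg=1.0):
    """Single scalar loss with a detection component and a segmentation component."""
    return (w_det * detection_loss(pred_boxes, true_boxes)
            + w_seg * segmentation_loss(pred_probs, true_labels))
```

Because one scalar is minimized, the gradients of both components would reach any layers shared by the two modules, such as a common backbone.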
 	Regarding claim 2, the system of claim 1, Hou in the combination further discloses wherein training data for the object detection module and the semantic segmentation module comprises: a training image (figs. 3B and 4A, element 310; para 0038-0039), at least one bounding box defined in the training image (fig. 3B; para 0031 and 0039), label of the at least one bounding box (para 0036), and mask of the training image (para 0034-0035).
 	Regarding claim 3, the system of claim 1, Hou in the combination further discloses wherein the object detection module is a single shot detector (SSD) (para 0018).
 	Regarding claim 4, the system of claim 1, Hou in the combination further discloses wherein the object detection module consists of sequentially: five convolution layers; a detection layer; and a non-maximum suppression (NMS) layer (fig. 4A; para 0040 and 0046-0047).
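 	For reference, the non-maximum suppression (NMS) step recited in claim 4 is commonly implemented as a greedy filter over scored boxes. The following is a minimal sketch of that general technique, not the implementation of Hou or of the application; the function names and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Retain box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```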
 	Regarding claim 6, the system of claim 1, Xiang in the combination further discloses wherein the semantic segmentation module is a pyramid pooling module (page 1790, lines 1-2 and page 1792, fig. 3).
 	Regarding claim 9, the system of claim 1, Hou in the combination further discloses wherein the computer executable code is further configured to control an operative device in the scene based on the object detection result and the semantic segmentation result (fig. 1; para 0020-0021).
 	Regarding claim 10, this claim recites substantially the same limitations that are performed by claim 1 above, and it is rejected for the same reasons.
 	Regarding claim 11, this claim recites substantially the same limitations that are performed by claim 2 above, and it is rejected for the same reasons.
 	Regarding claim 12, this claim recites substantially the same limitations that are performed by claim 3 above, and it is rejected for the same reasons.
 	Regarding claim 13, this claim recites substantially the same limitations that are performed by claim 4 above, and it is rejected for the same reasons.
 	Regarding claim 15, this claim recites substantially the same limitations that are performed by claim 6 above, and it is rejected for the same reasons.
 	Regarding claim 17, this claim recites substantially the same limitations that are performed by claim 9 above, and it is rejected for the same reasons.
 	Regarding claim 18, this claim recites substantially the same limitations that are performed by claim 1 above, and it is rejected for the same reasons.
 	Regarding claim 19, this claim recites substantially the same limitations that are performed by claim 3 above, and it is rejected for the same reasons.

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Hou et al, US 2021/0158043 in view of Xiang et al, “ThunderNet: A Turbo Unified Network for Real-Time Semantic Segmentation” and further in view of Long et al, “Fully Convolutional Networks for Semantic Segmentation”.
 	Regarding claim 5, the system of claim 4, Hou and Xiang in the combination disclose wherein for a 512x512 resolution of the image (Xiang: page 1793, Backbone Selection).
	Hou and Xiang in the combination disclose claim 5 as enumerated above, but Hou and Xiang in the combination do not explicitly disclose the neural network backbone convolutionally adds 64x64 information and 32x32 information to the detection layer, and the five convolutional layers respectively add 16x16 information, 8x8 information, 4x4 information, 2x2 information, and 1x1 information to the detection layer as claimed.
 	However, Long discloses the neural network backbone convolutionally adds 64x64 information and 32x32 information to the detection layer, and the five convolutional layers respectively add 16x16 information, 8x8 information, 4x4 information, 2x2 information, and 1x1 information to the detection layer (page 6, fig. 3 and page 9, A. Upper Bounds on IU).
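 	Therefore, taking the combined disclosures of Hou, Xiang, and Long as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the neural network backbone that convolutionally adds 64x64 information and 32x32 information to the detection layer, and the five convolutional layers that respectively add 16x16 information, 8x8 information, 4x4 information, 2x2 information, and 1x1 information to the detection layer, as taught by Long into the inventions of Hou and Xiang (Long: Abstract).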
 	Regarding claim 14, this claim recites substantially the same limitations that are performed by claim 5 above, and it is rejected for the same reasons.
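 	The spatial sizes recited in claim 5 follow from simple downsampling arithmetic. The sketch below assumes, for illustration only, that the two backbone outputs sit at strides 8 and 16 relative to the 512x512 input and that each of the five additional convolution layers halves the resolution; neither assumption is drawn from the cited references.

```python
def detection_feature_sizes(image_size=512, backbone_strides=(8, 16), extra_layers=5):
    """Return the side length of each feature map fed to the detection layer.

    backbone_strides and the halving per extra layer are assumed values.
    """
    sizes = [image_size // s for s in backbone_strides]  # backbone outputs
    current = sizes[-1]
    for _ in range(extra_layers):
        current = max(1, current // 2)  # each extra layer halves the map
        sizes.append(current)
    return sizes

print(detection_feature_sizes())  # [64, 32, 16, 8, 4, 2, 1]
```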

Claims 7-8, 16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hou et al, US 2021/0158043 in view of Xiang et al, “ThunderNet: A Turbo Unified Network for Real-Time Semantic Segmentation” and further in view of Zhao et al, “Pyramid Scene Parsing Network”.
 	Regarding claim 7, the system of claim 1, Hou and Xiang in the combination do not explicitly disclose wherein the semantic segmentation module consists sequentially: a pooling module pooling the feature map to obtain pooled features at different sizes; a plurality of convolution layers each convoluting one of the pooled features at different sizes to obtain convoluted features at different sizes; an upsample module receiving the convoluted features at different sizes to obtain upsampled feature; a concatenation layer receiving the upsampled feature and feature from the neural network backbone to obtain concatenated feature; and a convolution layer convoluting the concatenated feature to obtain per-pixel prediction as the semantic segmentation result as claimed.
 	However, Zhao discloses the semantic segmentation module consists sequentially: a pooling module pooling the feature map to obtain pooled features at different sizes; a plurality of convolution layers each convoluting one of the pooled features at different sizes to obtain convoluted features at different sizes; an upsample module receiving the convoluted features at different sizes to obtain upsampled feature; a concatenation layer receiving the upsampled feature and feature from the neural network backbone to obtain concatenated feature; and a convolution layer convoluting the concatenated feature to obtain per-pixel prediction as the semantic segmentation result (fig. 3; Section 3.2. Pyramid Pooling Module and Section 3.3. Network Architecture).
 	Therefore, taking the combined disclosures of Hou, Xiang, and Zhao as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the semantic segmentation module consists sequentially: a pooling module pooling the feature map to obtain pooled features at different sizes; a plurality of convolution layers each convoluting one of the pooled features at different sizes to obtain convoluted features at different sizes; an upsample module receiving the convoluted features at different sizes to obtain upsampled feature; a concatenation layer receiving the upsampled feature and feature from the neural network backbone to obtain concatenated feature; and a convolution layer convoluting the concatenated feature to obtain per-pixel prediction as the semantic segmentation result as taught by Zhao into the inventions of Hou and Xiang for the benefit of effectively producing good quality results on the scene parsing task (Zhao: Abstract).
 	Regarding claim 8, the system of claim 7, Zhao in the combination further discloses wherein the pooled features are at sizes of 1x1, 2x2, 3x3, and 6x6 (Section 3.2. Pyramid Pooling Module).
 	Regarding claim 16, this claim recites substantially the same limitations that are performed by claim 7 above, and it is rejected for the same reasons.
 	Regarding claim 20, this claim recites substantially the same limitations that are performed by claim 7 above, and it is rejected for the same reasons.
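 	The pooling step of claims 7 and 8 (pooled features at sizes 1x1, 2x2, 3x3, and 6x6) can be illustrated with adaptive average pooling over a single-channel feature map. This is a simplified sketch, not code from Zhao or the application: it covers only the pooling stage, omits the convolution, upsampling, and concatenation stages, and the function names and window-boundary convention are assumptions.

```python
def adaptive_avg_pool(feature, bins):
    """Average-pool a 2D list into a bins x bins grid.

    Window edges use floor for the start and ceil for the end (an assumed
    convention), so windows cover the whole map even when sizes do not divide.
    """
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins  # integer ceiling
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled

def pyramid_pool(feature, bin_sizes=(1, 2, 3, 6)):
    """Pool the same feature map at each pyramid level."""
    return {b: adaptive_avg_pool(feature, b) for b in bin_sizes}
```

In a full pyramid pooling module, each pooled grid would then be convolved, upsampled back to the feature-map size, and concatenated with the backbone feature before the final per-pixel prediction layer.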

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN D HUYNH whose telephone number is (571)270-1937. The examiner can normally be reached 8AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward F Urban can be reached on (571) 272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197.





/VAN D HUYNH/
Primary Examiner, Art Unit 2665