Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION


Reasons for Allowance

Claims 1-5, 8-12 and 14-16 are allowed over the prior art.
Claims 6, 7 and 13 are cancelled by the Applicant.

The following is an examiner’s statement of reasons for allowance:
The prior art made of record fails to teach the underlined limitations within the independent claims:

Regarding Claim 1,
A method to recover an object from a cluttered image by artificial neural networks, the method comprising: generating a normal map from the cluttered image by a trained image generator, recovering the object from the normal map by a trained task-specific recognition unit, and outputting the result to an output unit wherein both the image generator and the recognition unit have been trained by an artificial neural network, wherein the training of the image generator comprises: receiving synthetic cluttered images as input, wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images; giving a normal map as output; comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline; optimizing the neural network of the image generator such that a deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the training of the recognition unit comprises: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model; recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map, and optimizing the neural network of the recognition unit such that a deviation between the output and the respective property of the input is minimum.

Regarding Claim 12,
A recognition system for recognizing an object from a cluttered image by artificial neural networks, the recognition system comprising: a trained image generator for generating a normal map from the cluttered image, a trained task-specific recognition unit for recovering the object from the normal map, and an output unit for outputting the result, wherein both the image generator and the recognition unit comprise an artificial neural network, wherein the artificial neural network of the image generator is configured to be trained by: receiving synthetic cluttered images as input, wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images; giving a normal map as output, comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline, optimizing the neural network of the image generator such that a deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the artificial neural network of the recognition unit is configured to be trained by: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model; recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map, and optimizing the neural network of the recognition unit such that a deviation between the output and the respective property of the input is minimum.




Regarding Claim 14,
A non-transitory computer-readable storage medium comprising instructions which, when executed by a computer, cause the computer to generate a normal map from the cluttered image by a trained image generator, recover the object from the normal map by a trained task-specific recognition unit; and output the result to an output unit, wherein both the image generator and the recognition unit have been trained by an artificial neural network, wherein the training of the image generator comprises: receiving synthetic cluttered images as input, wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images; giving a normal map as output; comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline, optimizing the neural network of the image generator such that the deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the training of the recognition unit comprises: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model, recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map; and optimizing the neural network of the recognition unit such that the deviation between the output and the respective property of the input is minimum.

4.	Regarding Claim 1: 
The following prior art references, KAUFHOLD et al. (USPUB 20190080205), Jimmy Wu (NPL Doc.: “Real-Time Object Pose Estimation with Pose Interpreter Networks”, 07 January 2019, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Pages 6798-6802) and Aayush Bansal (NPL Doc.: “Marr Revisited: 2D-3D Alignment via Surface Normal Prediction”, June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 5965-5972), teach a method to recover an object from a cluttered image by artificial neural networks (Abstract- “…systems to recognize certain objects within a given image …” AND Paragraph [0016]- “…other information in the image that clutters the image…” AND Paragraph [0040]- “…artificial neural network is a mathematical approximation …” AND Figures 6 & 7), the method comprising: generating a normal map from the cluttered image by a trained image generator (Figures 5 and 7 show Generator (503 and 709) AND Paragraphs [0099] and [0101]), recovering the object from the normal map by a trained task-specific recognition unit (Recognizer of object (517 & 1017) shown within Figures 5 and 10 AND Paragraphs [0099] and [0104-0105]), and outputting the result to an output unit wherein both the image generator and the recognition unit have been trained by an artificial neural network (output from the generator and the recognizer taught within Paragraphs [0129-0130]), wherein the training of the image generator comprises: receiving synthetic cluttered images as input (input of synthetic images and the training with the Generator taught within Paragraphs [0125-0126]). Within analogous art, Jimmy Wu teaches wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images (Page 6799, Col. 1- “… whereas PoseCNN uses a large annotated pose dataset augmented with synthetic images. Additionally, our system runs in real-time and uses neural network forward passes to directly output pose estimates,...” AND Page 6801, Col. 1- “…Synthetic Image Dataset…”). Within analogous art, Aayush Bansal teaches giving a normal map as output (Page 5967, Col. 1- “…Our goal is, given a single 2D image I, to output a predicted surface normal map n for the image….”); comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline (Page 5969, Col. 1- “…We generate a training set of sampled rendered views and surface normal maps $\{(I_i, \hat{n}_i)\}_{i=1}^{N}$ for viewing angles $\{\phi_i\}_{i=1}^{N}$ for all CAD models in the library. We generate surface normals for each pixel by ray casting to the model faces, which allows us to compute view-based surface normals…” AND Col. 2- “…We found that due to the differences in appearance of natural images and rendered views of CAD models, simply concatenating the pool5 CaffeNet features hurt performance. We augmented the data similar to [45] by compositing our rendered views over backgrounds sampled from natural images during training, which improved performance….”), within claim 1. However, the references do not teach, nor render obvious, the following limitations: “optimizing the neural network of the image generator such that a deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the training of the recognition unit comprises: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model; recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map, and optimizing the neural network of the recognition unit such that a deviation between the output and the respective property of the input is minimum.”


Regarding Claim 12: 
The following prior art references, KAUFHOLD et al. (USPUB 20190080205), Jimmy Wu (NPL Doc.: “Real-Time Object Pose Estimation with Pose Interpreter Networks”, 07 January 2019, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Pages 6798-6802) and Aayush Bansal (NPL Doc.: “Marr Revisited: 2D-3D Alignment via Surface Normal Prediction”, June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 5965-5972), teach a recognition system for recognizing an object from a cluttered image by artificial neural networks (Abstract- “…systems to recognize certain objects within a given image …” AND Paragraph [0016]- “…other information in the image that clutters the image…” AND Paragraph [0040]- “…artificial neural network is a mathematical approximation …” AND Figures 6 & 7), the recognition system comprising: a trained image generator for generating a normal map from the cluttered image (Figures 5 and 7 show Generator (503 and 709) AND Paragraphs [0099] and [0101]), a trained task-specific recognition unit for recovering the object from the normal map (Recognizer of object (517 & 1017) shown within Figures 5 and 10 AND Paragraphs [0099] and [0104-0105]), and an output unit for outputting the result, wherein both the image generator and the recognition unit comprise an artificial neural network (output from the generator and the recognizer taught within Paragraphs [0129-0130]), wherein the artificial neural network of the image generator is configured to be trained by: receiving synthetic cluttered images as input (input of synthetic images and the training with the Generator taught within Paragraphs [0125-0126]). Within analogous art, Jimmy Wu teaches wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images (Page 6799, Col. 1- “… whereas PoseCNN uses a large annotated pose dataset augmented with synthetic images. Additionally, our system runs in real-time and uses neural network forward passes to directly output pose estimates,...” AND Page 6801, Col. 1- “…Synthetic Image Dataset…”). Within analogous art, Aayush Bansal teaches giving a normal map as output (Page 5967, Col. 1- “…Our goal is, given a single 2D image I, to output a predicted surface normal map n for the image….”); comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline (Page 5969, Col. 1- “…We generate a training set of sampled rendered views and surface normal maps $\{(I_i, \hat{n}_i)\}_{i=1}^{N}$ for viewing angles $\{\phi_i\}_{i=1}^{N}$ for all CAD models in the library. We generate surface normals for each pixel by ray casting to the model faces, which allows us to compute view-based surface normals…” AND Col. 2- “…We found that due to the differences in appearance of natural images and rendered views of CAD models, simply concatenating the pool5 CaffeNet features hurt performance. We augmented the data similar to [45] by compositing our rendered views over backgrounds sampled from natural images during training, which improved performance….”), within claim 12. However, the references do not teach, nor render obvious, the following limitations: “optimizing the neural network of the image generator such that a deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the artificial neural network of the recognition unit is configured to be trained by: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model; recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map, and optimizing the neural network of the recognition unit such that a deviation between the output and the respective property of the input is minimum.”

Regarding Claim 14: 
The following prior art references, KAUFHOLD et al. (USPUB 20190080205), Jimmy Wu (NPL Doc.: “Real-Time Object Pose Estimation with Pose Interpreter Networks”, 07 January 2019, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Pages 6798-6802) and Aayush Bansal (NPL Doc.: “Marr Revisited: 2D-3D Alignment via Surface Normal Prediction”, June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 5965-5972), teach a non-transitory computer-readable storage medium (Paragraphs [0147] and [0152]) comprising instructions which, when executed by a computer (Abstract- “…systems to recognize certain objects within a given image …” AND Paragraph [0016]- “…other information in the image that clutters the image…” AND Paragraph [0040]- “…artificial neural network is a mathematical approximation …” AND Figures 6 & 7 and computer readable storage taught within Paragraph [0152]), cause the computer to generate a normal map from the cluttered image by a trained image generator (Figures 5 and 7 show Generator (503 and 709) AND Paragraphs [0099] and [0101]), recover the object from the normal map by a trained task-specific recognition unit (Recognizer of object (517 & 1017) shown within Figures 5 and 10 AND Paragraphs [0099] and [0104-0105]), and output the result to an output unit, wherein both the image generator and the recognition unit have been trained by an artificial neural network (output from the generator and the recognizer taught within Paragraphs [0129-0130]), wherein the training of the image generator comprises: receiving synthetic cluttered images as input (input of synthetic images and the training with the Generator taught within Paragraphs [0125-0126]). Within analogous art, Jimmy Wu teaches wherein the cluttered images are the output of an augmentation pipeline which augments synthetic normal maps into synthetic cluttered images (Page 6799, Col. 1- “… whereas PoseCNN uses a large annotated pose dataset augmented with synthetic images. Additionally, our system runs in real-time and uses neural network forward passes to directly output pose estimates,...” AND Page 6801, Col. 1- “…Synthetic Image Dataset…”). Within analogous art, Aayush Bansal teaches giving a normal map as output (Page 5967, Col. 1- “…Our goal is, given a single 2D image I, to output a predicted surface normal map n for the image….”); comparing the output of the image generator with the respective normal map given as input to the augmentation pipeline (Page 5969, Col. 1- “…We generate a training set of sampled rendered views and surface normal maps $\{(I_i, \hat{n}_i)\}_{i=1}^{N}$ for viewing angles $\{\phi_i\}_{i=1}^{N}$ for all CAD models in the library. We generate surface normals for each pixel by ray casting to the model faces, which allows us to compute view-based surface normals…” AND Col. 2- “…We found that due to the differences in appearance of natural images and rendered views of CAD models, simply concatenating the pool5 CaffeNet features hurt performance. We augmented the data similar to [45] by compositing our rendered views over backgrounds sampled from natural images during training, which improved performance….”), within claim 14. However, the references do not teach, nor render obvious, the following limitations: “optimizing the neural network of the image generator such that the deviation between the output and the normal map given as input to the augmentation pipeline is minimum, and wherein the training of the recognition unit comprises: receiving synthetic normal maps as input, wherein the synthetic normal maps are obtained from a texture-less CAD model, recognizing the object as output, comparing the output of the recognition unit with the respective property of the object as represented in the normal map; and optimizing the neural network of the recognition unit such that the deviation between the output and the respective property of the input is minimum.”


5.	The examiner found no suggestions or motivations to combine similar teachings from prior art made of record to overcome the limitations as discussed above. 

6.	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion

7. 	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited, for a listing of analogous art.
8. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to OMAR S ISMAIL whose telephone number is (571) 272-9799 and FAX number (571) 273-9799.  The examiner can normally be reached on M-F 9:00am-6:00pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David C. Payne, can be reached on (571) 272-3024. The fax phone number for the organization where this application or proceeding is assigned is 571-273-3024.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/OMAR S ISMAIL/
Primary Examiner, Art Unit 2637