DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 12 is rejected because the claim recites the limitation “the overlay pixels”.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 9, 13-17 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985).

Regarding claim 1, Molchanov teaches a method for real-time recognition, using one or more multi-threaded processors, of a gesture communicated by a subject, the method comprising: receiving, by a first thread of the one or more multi-threaded processors, a first set of image frames associated with the gesture, the first set of image frames captured during a first time interval (see figure 1C, figure 2D, where Molchanov discusses image frames of different time intervals undergoing 3D-CNN and RNN processing);
performing, by the first thread, pose estimation on each frame of the first set of image frames including eliminating background information from each frame to obtain one or more areas of interest (see figure 1C, figure 2D, where Molchanov discusses image frames of different time intervals undergoing 3D-CNN and RNN processing);
storing information representative of the one or more areas of interest in a shared memory accessible to the one or more multi-threaded processors (see figure 6, figure 2D, para. 0068, 0072, 0081, where Molchanov discusses detecting and classifying gestures and processor memory);  and
performing, by a second thread of the one or more multi-threaded processors, a gesture recognition operation on a second set of image frames associated with the gesture, the second set of image frames captured during a second time interval that is different from the first time interval (see figure 1C, figure 2D, where Molchanov discusses image frames of different time intervals undergoing 3D-CNN and RNN processing),
 wherein performing the gesture recognition operation comprises: using a first processor of the one or more multi-threaded processors that implements a first three-dimensional convolutional neural network (3D CNN) to perform an optical flow operation on the information representative of the one or more areas of interest that is accessed from the shared memory (see figure 1C, figure 2C, figure 2D, where Molchanov discusses 3D-CNN, RNN, and optical flow operation),
using a second processor of the one or more multi-threaded processors that implements a second 3D CNN to perform spatial and color processing operations on the information representative of the one or more areas of interest that is accessed from the shared memory (see figure 1C, figure 2C, figure 2D, where Molchanov discusses 3D-CNN, RNN, and optical flow operation);
fusing results of the optical flow operation and results of the spatial and color processing operations to produce an identification of the gesture (see figure 1C, where Molchanov discusses combining the optical flow and color sensor results to detect and label a gesture); and
using a recurrent neural network (RNN) to determine that the identification corresponds to a singular gesture across at least the first and second sets of image frames (see figure 2D, para. 0043, 0048, where Molchanov discusses using a recurrent neural network (RNN) to classify a gesture across two or more clips from a data stream).
Yang teaches wherein the optical flow operation is enabled to recognize a motion associated with the gesture (see para. 0024, 0027, where Yang discusses detecting optical flow related to gesture motion).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 1.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov in this manner in order to improve gesture recognition by determining the motion of the gesture using an optical flow process that detects the motion of objects by analyzing the relative motion in image data. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov, while the teaching of Yang continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of calculating the optical flow to determine the motion of an object to properly perform gesture recognition.  The Molchanov and Yang systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding claim 9, Molchanov and Yang teach wherein the first set of image frames is received concurrently as the gesture recognition operation is performed on the second set of image frames (see figure 1C, figure 2C, figure 2D, where Molchanov discusses 3D-CNN, RNN, and optical flow operation on multiple images; see para. 0024, where Yang discusses detecting optical flow related to gesture motion).
The same motivation of claim 1 is applied to claim 9.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 9.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 13, Molchanov and Yang teach wherein the spatial and color processing operations comprise recognizing one or more characteristics of the gesture in data corresponding to a single image frame of the second set of image frames (see figure 1C, figure 2C, figure 2D, where Molchanov discusses 3D-CNN, RNN, and optical flow operation on multiple images; see para. 0024, where Yang discusses detecting optical flow related to gesture motion).
The same motivation of claim 1 is applied to claim 13.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 13.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 14, Molchanov teaches wherein the information representative of the one or more areas of interest are accessed by the first 3D CNN and the second 3D CNN from the shared memory without copying data corresponding to the information representative of the one or more areas of interest to any other memory location (see figure 6, figure 2D, para. 0068, 0072, 0081, where Molchanov discusses detecting and classifying gestures using different 3D CNN data from the processor’s memory).
The same motivation of claim 1 is applied to claim 14.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 14.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 15, Molchanov teaches wherein each of the first set of image frames and the second set of image frames comprises a frame number or a Society of Motion Picture and Television Engineers (SMPTE) timecode (see para. 0054, where Molchanov discusses frame data including frame number).
The same motivation of claim 1 is applied to claim 15.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 15.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 16, Yang teaches wherein the RNN comprises one or more long short-term memory (LSTM) units (see para. 0027, where Yang discusses RNN with long short term memory).
The same motivation of claim 1 is applied to claim 16.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov with Yang to arrive at the invention of claim 16.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Claim 17 is rejected as applied to claim 1 as pertaining to a corresponding apparatus.

Claims 2, 3, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985) in view of Somanath et al. (US 2019/0066733).

Regarding claim 2, Molchanov and Yang do not expressly teach wherein the first set of images frames are captured using a set of visual sensing devices that include multiple apertures oriented with respect to the subject to receive optical signals corresponding to the gesture from multiple angles.  However, Somanath teaches wherein the first set of images frames are captured using a set of visual sensing devices that include multiple apertures oriented with respect to the subject to receive optical signals corresponding to the gesture from multiple angles (see para. 0065, where Somanath discusses multiple cameras capturing multiple viewpoints).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Somanath to arrive at the invention of claim 2.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov and Yang in this manner in order to improve gesture recognition by determining the motion of the gesture by analyzing the relative motion in image data from multiple angles. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov and Yang, while the teaching of Somanath continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of determining the motion of an object using multiple angles to properly perform gesture recognition.  The Molchanov, Yang, and Somanath systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding claim 3, Molchanov teaches further comprising: collecting depth information corresponding to the gesture in one or more planes perpendicular to an image plane captured by the set of visual sensing devices, wherein eliminating the background information is further based on the depth information (see claim 8, para. 0035, where Molchanov discusses depth values corresponding to the hand gesture, not the background regions).
The same motivation of claim 2 is applied to claim 3.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Somanath to arrive at the invention of claim 3.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Claim 18 is rejected as applied to claim 2 as pertaining to a corresponding apparatus.

Claims 4-8, 19, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985) in view of Gausebeck (US 2019/0026956).

Regarding claim 4, Molchanov teaches wherein the first 3D CNN has been trained on a limited set of training data, and wherein generating the limited set of training data comprises: generating a 3D scene that includes a 3D model (see para. 0102, where Molchanov discusses 3D model data); using a value indicative of the total number of images in the limited set of training data to determine a plurality of variations of the 3D scene (see para. 0068, where Molchanov discusses a plurality of frames used for training data); applying each of plurality of variations to the 3D scene to produce a plurality of modified 3D scenes (see para. 0102, where Molchanov discusses processing different data from the same scene in a pipelined fashion until all of the model data for the scene has been rendered to the frame buffer).
Molchanov and Yang do not expressly teach capturing an image of each of the plurality of modified 3D scenes to generate the limited set of training data.  However, Gausebeck teaches capturing an image of each of the plurality of modified 3D scenes to generate the limited set of training data (see para. 0249, where Gausebeck discusses multiple images in the training data).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Gausebeck to arrive at the invention of claim 4.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov and Yang in this manner in order to improve gesture recognition by determining the motion of the gesture by modifying 3D data to properly identify the structure of the gesture object. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov and Yang, while the teaching of Gausebeck continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of determining the motion of an object by modifying 3D data to properly perform gesture recognition.  The Molchanov, Yang, and Gausebeck systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding claim 5, Gausebeck teaches further comprising: generating, for each image of the limited set of training data, a label that corresponds to a feature of interest, wherein the label comprises one or more bounding lines that delineates a precise boundary of the feature of interest (see para. 0249, where Gausebeck discusses multiple images in the training data).
The same motivation of claim 4 is applied to claim 5.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Gausebeck to arrive at the invention of claim 5.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 6, Gausebeck teaches wherein the precise boundary of the feature of interest is generated based on a group of polygons that collectively form the feature of interest in the 3D model (see para. 0070, 0080, where Gausebeck discusses summing polygons of the 3D model data).
The same motivation of claim 4 is applied to claim 6.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Gausebeck to arrive at the invention of claim 6.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 7, Gausebeck teaches wherein determining the plurality of variations of the 3D scene is based on a set of parameters that specify at least one of: a position of the 3D model, an angle of 3D model, a position of a camera, an orientation of a camera, a lighting attribute, a texture of a subsection of the 3D model, or a background of the 3D scene (see para. 0070, 0080, where Gausebeck discusses defining a boundary of the object or feature).
The same motivation of claim 4 is applied to claim 7.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Gausebeck to arrive at the invention of claim 7.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Regarding claim 8, Molchanov teaches further comprising: obtaining, after generating the limited set of training data, an evaluation of the gesture recognition operation (see figure 6, figure 2D, para. 0068, 0072, 0081, where Molchanov discusses detecting and classifying gestures and processor memory); and re-generating another limited set of training data upon a determination that the gesture recognition operation fails to meet one or more predetermined criteria (see para. 0054, where Molchanov discusses updating parameters to reduce errors).
The same motivation of claim 4 is applied to claim 8.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Gausebeck to arrive at the invention of claim 8.  The result would have been expected, routine, and predictable in order to perform gesture recognition.

Claim 19 is rejected as applied to claim 4 as pertaining to a corresponding apparatus.
Claim 20 is rejected as applied to claim 5 as pertaining to a corresponding apparatus.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985) in view of Hiramatsu (US 2018/0018529).

Regarding claim 10, Molchanov and Yang do not expressly teach wherein the optical flow operation comprises sharpening, line, edge, corner and shape enhancements.  However, Hiramatsu teaches wherein the optical flow operation comprises sharpening, line, edge, corner and shape enhancements (see para. 0149, 0194, 0200, where Hiramatsu discusses edge segmentation of object using optical flow).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Hiramatsu to arrive at the invention of claim 10.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov and Yang in this manner in order to improve gesture recognition by determining the motion of the gesture using an optical flow process that detects the motion of objects by analyzing the relative motion in image data. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov and Yang, while the teaching of Hiramatsu continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of calculating the optical flow of an object to properly perform gesture recognition.  The Molchanov, Yang, and Hiramatsu systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.


Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985) in view of Dai et al. (US 2017/0153711).

Regarding claim 11, Molchanov and Yang do not expressly teach wherein performing the pose estimation produces overlay pixels corresponding to the body, fingers and face of the subject. However, Dai teaches wherein performing the pose estimation produces overlay pixels corresponding to the body, fingers and face of the subject (see claim 1, claim 11, para. 0049, where Dai discusses facial and human gesture recognition using color pixels).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov and Yang with Dai to arrive at the invention of claim 11.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov and Yang in this manner in order to improve gesture recognition by determining the pixel motion of the gesture to properly identify the structure of the gesture object. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov and Yang, while the teaching of Dai continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of determining the motion of an object to properly perform gesture recognition.  The Molchanov, Yang, and Dai systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Molchanov et al. (US 2017/0206405) in view of Yang et al. (US 2018/0373985) in view of Gausebeck (US 2019/0026956) in view of Simon et al. (US 2010/0044121).

Regarding claim 12, Molchanov, Yang, and Gausebeck do not expressly teach wherein the overlay pixels comprise pixels with different colors for each finger of the subject.  However, Simon teaches wherein the overlay pixels comprise pixels with different colors for each finger of the subject (see para. 0191, where Simon discusses finger recognition using different colors for fingers).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Molchanov, Yang, and Gausebeck with Simon to arrive at the invention of claim 12.  The result would have been expected, routine, and predictable in order to perform gesture recognition.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Molchanov, Yang, and Gausebeck in this manner in order to improve gesture recognition by determining the pixel motion of the gesture to properly identify the structure of the gesture object. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Molchanov, Yang, and Gausebeck, while the teaching of Simon continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of determining the motion of an object to properly perform gesture recognition.  The Molchanov, Yang, Gausebeck, and Simon systems perform object recognition; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Yang et al. (US 2018/0032846) discusses 3D modalities that include multiple frames of spatial data (color) and multiple frames of optical flow data, and multiple CNNs that generate classification output data.

	


Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KENNY A CESE whose telephone number is (571) 270-1896.  The examiner can normally be reached on Monday – Friday, 9am – 4pm.
If attempts to reach the primary examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached on (571) 270-1051.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Kenny A Cese/
Primary Examiner, Art Unit 2663