DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status
2.	This action is in response to the application filed on 2/8/2021. 
		Claims 1-20 are presented for examination. 

Claim Rejections - 35 USC § 102
3.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
4.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

5.	Claim 20 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yeh (US 2020/0320791 A1).
Regarding claim 20, Yeh (Fig. 1) discloses a computer-implemented method comprising: 
receiving from a mobile client (par [0044], [0045], mobile devices) an image from a camera (par [0042], camera) view of an environment, the image depicting a portion of a body of a user (par [0045], body part of user 108); 
providing the image to a machine learning model configured to identify a formation of the portion of the body in the image (Fig. 5, par [0090], a set of 12 training images (item 500, which was identified in Step 402), where each image depicts a sample face. Each face has depicted thereon a set of dots outlining the features of the user's face, which are detected and modeled via the toolkit software executed in Step 404); and 
providing for display on the mobile client, based on an identification of the formation by the machine learning model, an augmented reality (AR) object (1006a) in the camera view of the environment (par [0114], item 1002 is the user's image. Item 1004 is the modelling of the user's ear features, which involves identifying the ear features and determining an average of their skin points. The points and arrows (item 1004a) illustrated in item 1004 demonstrate the determination of the user's ear shape area. Then based on these determinations, the location of the user's ear is identified, which also provides the ear's shape area, which as illustrated in item 1006, can be utilized to properly locate the position on the user's ear to display the selected earring item 1006a (e.g., attached to the earlobe)). 
Claim Rejections - 35 USC § 103
6.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
7.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

8.	Claims 1-3, 11-14 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Ge (US 2021/0225077 A1; hereinafter Ge) in view of Anderson (US 2019/0043260 A1).
Regarding claim 1, Ge (Figs. 6, 7) discloses a non-transitory computer readable storage medium comprising stored instructions (par [0107]), the instructions when executed by a processor cause the processor to: 
receive from a mobile client (par [0108]) an image from a camera view of an environment, the image depicting a hand of a user (Fig. 6, operation 601, par [0017], [0021], [0080], a trained machine learning techniques network 410 receives, from a client device 102, an RGB image of a user's hand);
apply a machine learning model to the image, the machine learning model trained on training image data representative of a plurality of hand formations (Fig. 6, operation 602, par [0081], the hand shape and pose estimation system 124 extracts one or more features of the monocular image using a plurality of machine learning techniques), the machine learning model configured to identify a formation of the hand in the image as one of the plurality of hand formations (Fig. 7, operation 701, par [0085], the hand shape and pose estimation system 124 obtains a first plurality of input images that include synthetic representations of a hand).
Ge does not teach: 
provide for display on the mobile client, responsive to identification of the formation of the hand as a first hand formation of the plurality of hand formations, an augmented reality (AR) object in a first state from an AR engine in the camera view of the environment; and
provide for display on the mobile client, responsive to the formation of the hand identified as a second hand formation of the plurality of hand formations, the AR object in a second state from the AR engine in the camera view of the environment.
Anderson (Fig. 6) teaches:
provide for display on the mobile client, responsive to identification of the formation of the hand as a first hand formation of the plurality of hand formations (par [0095], performing a tap-hold gesture using an index finger), an augmented reality (AR) object (binding 635) in a first state from an AR engine in the camera view of the environment (par [0031] discloses using a camera, and par [0095] discloses that at stage 1 the user 608 (represented by the hand in FIG. 6) may cause a binding 635 to be generated by the AR platform 105 to bind two physical objects 630P-1 and 630P-2 by performing a tap-hold gesture using an index finger at a first physical object 630P-1, and dragging the tip of the index finger from the first physical object 630P-1 towards a second physical object 630P-2. Once the index finger of the user 608 reaches the second physical object 630P-2, the AR platform 105 may connect an end of the binding 635 to the second physical object 630P-2, and the user 608 may release the hold gesture); and
provide for display on the mobile client, responsive to the formation of the hand identified as a second hand formation of the plurality of hand formations (par [0096], perform a grab and hold gesture on the binding and then perform a pulling gesture to stretch the binding 635 towards the user 608), the AR object (binding 635) in a second state from the AR engine in the camera view of the environment (par [0096], At stage 2, the user 608 may perform a grab and hold gesture on the binding and then perform a pulling gesture to stretch the binding 635 towards the user 608. As the user 608 stretches the binding 635, multiple instances of the binding 635 may be generated by the AR platform 105, and displayed in such a way that the binding 635 appears to exhibit various physical properties).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge with the teaching of Anderson to provide for display on the mobile client, responsive to identification of the formation of the hand as a first hand formation of the plurality of hand formations, an augmented reality (AR) object in a first state from an AR engine in the camera view of the environment; and provide for display on the mobile client, responsive to the formation of the hand identified as a second hand formation of the plurality of hand formations, the AR object in a second state from the AR engine in the camera view of the environment. The suggestion/motivation would have been to project virtual objects to perform various actions in response to different user gestures.
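For illustration only, the combination applied to claim 1 can be sketched as a mapping from an identified hand formation to a display state of the AR object. This is a minimal sketch under stated assumptions and is not code from Ge, Anderson, or the application: the names HandFormation, ARState, FORMATION_TO_STATE and select_ar_state are hypothetical, and the classifier argument is a stand-in for the trained machine learning model.

```python
from enum import Enum
from typing import Callable, Optional


class HandFormation(Enum):
    # Hypothetical labels loosely modeled on the gestures cited from Anderson.
    TAP_HOLD = "tap_hold"
    GRAB_PULL = "grab_pull"


class ARState(Enum):
    # Hypothetical display states of a single AR object.
    FIRST = "first_state"
    SECOND = "second_state"


# Assumed one-to-one mapping from an identified formation to an AR object state.
FORMATION_TO_STATE = {
    HandFormation.TAP_HOLD: ARState.FIRST,
    HandFormation.GRAB_PULL: ARState.SECOND,
}


def select_ar_state(
    image: bytes,
    classifier: Callable[[bytes], Optional[HandFormation]],
) -> Optional[ARState]:
    """Return the AR object state for the formation identified in the image.

    Returns None when the classifier identifies no known formation.
    """
    formation = classifier(image)
    return FORMATION_TO_STATE.get(formation)
```

In this sketch the AR object is displayed in the first state when the classifier reports the first formation and in the second state when it reports the second formation; any other result leaves the display unchanged.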
Regarding claim 2, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 1.
Ge (Figs. 4, 6, 7) further teaches wherein the machine learning model is a first machine learning model (first machine learning technique module 412), and wherein the instructions further comprise instructions that when executed by the processor cause the processor (processors) to identify a location of the hand (par [0017], pose of hand), the instructions to identify location of the hand further comprising instructions that when executed by the processor cause the processor to: 
apply a second machine learning model (second machine learning technique module 416) to the received image (par [0021], the AR/VR application 105 applies various trained machine learning techniques on the captured image of the hand to generate a 3D hand model representation of the hand that includes the pose (e.g., the joint positions) and the shape (e.g., the surface features and textures) of the hand), the second machine learning model configured to classify real-world objects in the environment, the real-world objects including the hand (par [0021], [0068]).
Ge does not teach:
receive a plurality of feature points associated with the real-world objects in the environment;
generate a three-dimensional (3D) virtual coordinate space based the plurality of feature points; and
identify, based on a classification of the hand by the second machine learning model, the location of the hand associated with corresponding coordinates in the generated 3D virtual coordinate space.
Anderson (Figs. 3, 6, 7) teaches:
receive a plurality of feature points (physical objects 630P-1, 630P-3, 630P-3) associated with the real-world objects (user’s hand 680) in the environment;
generate a three-dimensional (3D) virtual coordinate space based the plurality of feature points (par [0073], the modeling engine 301 may direct the processor 302 to generate a three-dimensional (3D) model 303 of the physical environment 115 for an AR environment based on the first sensor data); and
identify, based on a classification of the hand by the second machine learning model (3D/skeletal-based model or an appearance-based model), the location of the hand associated with corresponding coordinates in the generated 3D virtual coordinate space (par [0075]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge with the teaching of Anderson to receive a plurality of feature points associated with the real-world objects in the environment; generate a three-dimensional (3D) virtual coordinate space based the plurality of feature points; and identify, based on a classification of the hand by the second machine learning model, the location of the hand associated with corresponding coordinates in the generated 3D virtual coordinate space. The suggestion/motivation would have been to project virtual objects to perform various actions in response to different user gestures.
Regarding claim 3, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 2.
Ge does not teach wherein the AR engine rendered object is provided for display based on the identified location.
Anderson teaches wherein the AR engine rendered object is provided for display based on the identified location (par [0026], [0030], [0075], [0077]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge with the teaching of Anderson such that the AR engine rendered object is provided for display based on the identified location. The suggestion/motivation would have been to project virtual objects to perform various actions in response to different user gestures.
Regarding claim 11, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 1.
Ge further teaches wherein the AR engine is a game engine (par [0021], the AR/VR application 105 presents various content (e.g., messages, games, advertisements, and so forth)).
Regarding claim 12, the claim is the computer system counterpart of claim 1. Therefore, it is analyzed in the same manner as claim 1.
Regarding claim 13, the claim is the computer system counterpart of claim 2. Therefore, it is analyzed in the same manner as claim 2.
Regarding claim 14, the claim is the computer system counterpart of claim 3. Therefore, it is analyzed in the same manner as claim 3.
Regarding claim 16, the claim is the computer-implemented method counterpart of claim 1. Therefore, it is analyzed in the same manner as claim 1.
Regarding claim 17, the claim is the computer-implemented method counterpart of claim 2. Therefore, it is analyzed in the same manner as claim 2.
Regarding claim 18, the claim is the computer-implemented method counterpart of claim 3. Therefore, it is analyzed in the same manner as claim 3.
9.	Claims 4-7, 9-10, 15 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Ge in view of Anderson and further in view of Dong et al. (US 2021/0081029 A1; hereinafter Dong).
Regarding claim 4, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 2. 
Ge and Anderson do not teach wherein the instructions further comprise instructions that when executed by the processor cause the processor to remove data of a given image from memory responsive to the second machine learning model classifying an absence of any hand depicted within the given image.
Dong teaches wherein the instructions further comprise instructions that when executed by the processor cause the processor to remove data of a given image from memory responsive to the second machine learning model classifying an absence of any hand depicted within the given image (par [0047] discloses a cloud storage for storing detection and hand recognition, and par [0026] states, “if the dynamic recognition event is terminated or not completed, the systems terminate or suspends the message execution until a successful dynamic hand-shape recognition event occurs”. Thus, a hand recognition event is not stored, or is removed from memory, until the dynamic hand-shape recognition event is completed).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong such that the instructions further comprise instructions that when executed by the processor cause the processor to remove data of a given image from memory responsive to the second machine learning model classifying an absence of any hand depicted within the given image. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
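For illustration only, the limitation of claim 4 as construed above can be sketched as removing buffered image data whenever a classifier reports that no hand is depicted. This is a minimal sketch and is not code from Dong or the application: the function purge_frames_without_hand, the frame_buffer dictionary, and the contains_hand callable (standing in for the second machine learning model) are all hypothetical.

```python
from typing import Callable, Dict


def purge_frames_without_hand(
    frame_buffer: Dict[int, bytes],
    contains_hand: Callable[[bytes], bool],
) -> None:
    """Delete, in place, every buffered frame classified as depicting no hand."""
    # Collect the ids first so the dictionary is not mutated while iterating.
    stale_ids = [
        frame_id
        for frame_id, data in frame_buffer.items()
        if not contains_hand(data)
    ]
    for frame_id in stale_ids:
        del frame_buffer[frame_id]
```

A caller would invoke the function after each classification pass so that only frames depicting a hand remain in memory.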
Regarding claim 5, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 1.
Ge (Figs. 6, 7) further teaches wherein the instructions further comprise instructions that when executed by the processor cause the processor to train the machine learning model using the training image data (par [0018]), the instruction to train the machine learning model further comprising instructions that when executed by the processor cause the processor to (par [0107]): 
receive a plurality of images of the plurality of hand formations (Fig. 7, operation 701, par [0085], the hand shape and pose estimation system 124 obtains a first plurality of input images that include synthetic representations of a hand).
Ge and Anderson do not teach apply a respective label to each of the plurality of images of the plurality of hand formations, the training image data comprising the labeled plurality of images.
Dong (Figs. 3, 4) teaches apply a respective label to each of the plurality of images of the plurality of hand formations, the training image data comprising the labeled plurality of images (par [0029], the system detects and labels the finger location as the user points to the camera 1616 or a predetermined designated area on a screen 1606 as highlighted by the circle enclosing an exemplary user's hand).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong to apply a respective label to each of the plurality of images of the plurality of hand formations, the training image data comprising the labeled plurality of images. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
Regarding claim 6, Ge, Anderson and Dong disclose the non-transitory computer readable storage medium of claim 5.
Ge and Anderson do not teach wherein each respective label corresponds to a computer executable command.
Dong teaches wherein each respective label corresponds to a computer executable command (par [0030], a useful gesture (i.e., a gesture that has a message or a command association) shown as an exemplary finger pointing is detected).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong such that each respective label corresponds to a computer executable command. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
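For illustration only, the limitations of claims 5 and 6 as construed above can be sketched as pairing each training image with a label and associating each label with a command. This is a minimal sketch and is not code from Ge, Dong, or the application: the labels, the command names, and the helpers label_training_images and command_for are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical association between a formation label and an executable command.
LABEL_TO_COMMAND: Dict[str, str] = {
    "point": "select_item",
    "fist": "grab_item",
}


def label_training_images(
    images_by_formation: Dict[str, List[bytes]],
) -> List[Tuple[bytes, str]]:
    """Attach the formation label to every image, yielding (image, label) pairs."""
    return [
        (image, label)
        for label, images in images_by_formation.items()
        for image in images
    ]


def command_for(label: str) -> str:
    """Look up the command associated with a label; raises KeyError if unknown."""
    return LABEL_TO_COMMAND[label]
```

The labeled pairs would form the training image data, and the lookup shows one way a recognized label could resolve to a command.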
Regarding claim 7, Ge, Anderson and Dong disclose the non-transitory computer readable storage medium of claim 5.
Ge and Anderson do not teach wherein the plurality of hand formations includes a user-customized hand formation.
Dong teaches wherein the plurality of hand formations includes a user-customized hand formation (par [0025], In some systems, only a fixed number of static and/or dynamic gestures are recognized; in other systems, a plurality of gestures is recognized. Some may be customized by a user).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong such that the plurality of hand formations includes a user-customized hand formation. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
Regarding claim 9, Ge and Anderson disclose the non-transitory computer readable storage medium of claim 1.
Ge and Anderson do not teach wherein the machine learning model is further configured to output a confidence score associated with the identified formation of the hand, and wherein providing for display the AR engine rendered object in the first state or in the second state is further responsive to the confidence score exceeding a threshold confidence score.
Dong teaches wherein the machine learning model is further configured to output a confidence score associated with the identified formation of the hand (par [0044], the machine learning algorithms train gesture classifiers 1626 that detect hand key points and mark the capture of hands in motion and render confidence scores as the system's video is processed), and wherein providing for display the AR engine rendered object in the first state or in the second state is further responsive to the confidence score exceeding a threshold confidence score (par [0044]-[0046]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong such that the machine learning model is further configured to output a confidence score associated with the identified formation of the hand, and such that providing for display the AR engine rendered object in the first state or in the second state is further responsive to the confidence score exceeding a threshold confidence score. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
Regarding claim 10, Ge, Anderson and Dong disclose the non-transitory computer readable storage medium of claim 9.
Ge and Anderson do not teach wherein the instructions further comprise instructions that when executed by the processor cause the processor to receive a selection of a user-specified threshold confidence score.
Dong teaches wherein the instructions further comprise instructions that when executed by the processor cause the processor to receive a selection of a user-specified threshold confidence score (par [0045], [0046]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge and Anderson with the teaching of Dong such that the instructions further comprise instructions that when executed by the processor cause the processor to receive a selection of a user-specified threshold confidence score. The suggestion/motivation would have been to detect the direction of the movement of the gesture, which in some systems may be determined by the application software it communicates with.
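For illustration only, the limitations of claims 9 and 10 as construed above can be sketched as gating the identified formation on a confidence score that must exceed a threshold, with the threshold supplied by the user. This is a minimal sketch and is not code from Dong or the application: the Prediction type, the gated_formation function, and the default threshold of 0.8 are hypothetical.

```python
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    # Hypothetical model output: a formation label and its confidence in [0, 1].
    formation: str
    confidence: float


def gated_formation(pred: Prediction, threshold: float = 0.8) -> Optional[str]:
    """Return the formation only if its confidence strictly exceeds the threshold.

    The threshold parameter stands in for a user-specified value; 0.8 is an
    arbitrary default chosen for this sketch.
    """
    if pred.confidence > threshold:
        return pred.formation
    return None
```

Under this sketch the AR object would change state only when gated_formation returns a label; a score equal to the threshold does not qualify because the claim language recites exceeding the threshold.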
Regarding claim 15, the claim is the computer system counterpart of claim 4. Therefore, it is analyzed in the same manner as claim 4.
Regarding claim 19, the claim is the computer-implemented method counterpart of claim 4. Therefore, it is analyzed in the same manner as claim 4.
10.	Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Ge in view of Anderson, Dong and further in view of Bradski et al. (US 2019/0094981 A1; hereinafter Bradski).
Regarding claim 8, Ge, Anderson and Dong disclose the non-transitory computer readable storage medium of claim 7.
Ge, Anderson and Dong do not teach wherein the instructions further comprise instructions that when executed by the processor cause the processor to prompt the user to provide a user-specified state of the AR engine rendered object, wherein an identification of the user-customized hand formation by the machine learning model indicates that the AR engine rendered object is to be provided for display in the user-specified state.
Bradski (Fig. 89F) teaches wherein the instructions further comprise instructions that when executed by the processor cause the processor to prompt the user to provide a user-specified state of the AR engine rendered object (par [1448], the AR system may render an animated character 8950 (e.g., friendly monster) in the field of view of at least the child. The AR system may render the animated character so as to appear to be climbing out of a box (e.g., cereal box). The sudden appearance of the animated character may prompt the child to start a game (e.g., Monster Battle). The child can animate or bring the character to life with a gesture. For example, a flick of the wrist may cause the AR system to render the animated character bursting through the cereal boxes), wherein an identification of the user-customized hand formation by the machine learning model (par [1225], various applications may be associated with their own types of virtual UI. Alternatively or additionally, the user may customize the UI to create one that he/she may be most comfortable with. For example, the user may simply “draw” a virtual UI in space using a motion of his hands, and various applications or functionalities may automatically populate the drawn virtual UI) indicates that the AR engine rendered object is to be provided for display in the user-specified state (par [1448]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ge, Anderson and Dong with the teaching of Bradski such that the instructions further comprise instructions that when executed by the processor cause the processor to prompt the user to provide a user-specified state of the AR engine rendered object, wherein an identification of the user-customized hand formation by the machine learning model indicates that the AR engine rendered object is to be provided for display in the user-specified state. The suggestion/motivation would have been to allow the user to interact with a game system.
Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NGAN T. PHAM-LU whose telephone number is (571)270-1889. The examiner can normally be reached M-F 7:30am - 4:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHANH D. NGUYEN, can be reached on (571)272-7772. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NGAN T. PHAM-LU/Examiner, Art Unit 2691 

/CHANH D NGUYEN/Supervisory Patent Examiner, Art Unit 2691