DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Response to Amendment
3.	Applicant’s amendments filed on January 07, 2021 have been entered. Claims 1, 8, and 15 have been amended. Claims 1-20 are pending in this application, with claims 1, 8, and 15 being independent.

Claim Rejections - 35 USC § 103
4.	The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103(a) are summarized as follows:
1.	Determining the scope and contents of the prior art.
2.	Ascertaining the differences between the prior art and the claims at issue.
3.	Resolving the level of ordinary skill in the pertinent art.
4.	Considering objective evidence present in the application indicating obviousness or nonobviousness.

5.	Claims 1-4, 6-11, 13-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jun et al., (“Jun”) [EP-3651055-A1] in view of Vendrow et al., (“Vendrow”) [US-2017/0228135-A1], further in view of Friend et al., (“Friend”) [US-2011/0154266-A1].
Regarding claim 1, Jun discloses a computer-implemented method for improving digital capturing of gestures (Jun- ¶0009, a final gesture recognition result may be obtained based on a gesture motion trend indicated by gesture recognition results of a plurality of consecutive video segments, to eliminate impact exerted on the finally obtained gesture recognition result by an erroneous gesture performed by the user in the short period of time, thereby improving gesture recognition accuracy), the method comprising:
using a camera, detecting a gesture during a video stream;
using a computing device, generating an image that corresponds to the gesture (Jun- ¶0014-0015, the optical flow information and the color information [an image] of the video segment are extracted based on the M images, gesture recognition is performed separately based on the extracted optical flow information and color information [image that corresponds to the gesture], and then the recognized gesture recognition results are combined; Fig. 7 and ¶0023, a gesture recognition device is provided) and storing the image in a database as a gesture layer (Jun- ¶0093, a phase gesture recognition result corresponding to each video segment [the image as a gesture layer], and saves the […]);
using the computing device, combining the gesture layer with the video stream to generate a gesture visualization (Jun- ¶0007, M images in each video segment in the video stream are obtained, gesture recognition is performed on the M images by using the deep learning algorithm, to obtain a gesture recognition result corresponding to the video segment, and finally gesture recognition results of N consecutive video segments including the video segment are combined [combining the gesture layer with the video stream], to obtain a gesture recognition result [gesture visualization] of the N consecutive video segments; Fig. 7 and ¶0023, a gesture recognition device is provided); and
using the computing device, causing the gesture visualization to be displayed in one or more displays of one or more other computing devices (Jun- ¶0095, when performing result combination on the gesture recognition results of the N consecutive video segments, the gesture recognition device may input the gesture recognition results of the N consecutive video segments into a pre-trained first machine learning model, to obtain the combined gesture recognition result. The first machine learning model is used to determine an overall gesture motion trend including the input N consecutive gesture recognition results, and to output a gesture corresponding to the overall gesture motion trend as the combined gesture recognition result [causing the gesture visualization to be displayed]; ¶0118, the output device 75 may be a display configured to display information […] The output device 75 may further include an output controller, to provide output for the display).
Jun fails to explicitly disclose wherein the gesture is separate from the video stream.
However, Vendrow discloses
the gesture is separate from the video stream (Vendrow- ¶0063, multimedia electronic devices (e.g., devices 120A-E) can indirectly measure participant (e.g., participant 130A-E) reactions to contributions to the conference providing implicit feedback from the […] interpret a participant's 130A reactions to what is being presented in a conference. In this example, laughter or smiling can be interpreted as a positive reaction. The positive reaction can be transmitted by the multimedia electronic device to scoring module 410 via conference bridge 110 and used to adjust the score of the participant eliciting the reaction. Similarly, the camera can be used to monitor behavior and the system can use this to detect disapproving reactions by a participant, such as recognizing a face palm gesture or other negatively connoted gesture by the participant – suggests gesture is separate from the video stream; ¶0064, certain participant gestures can be detectable by analysis of the audio that can be undetectable merely from analysis of the conference video streams or other inputs, such as throat clearing gestures, sighing, breathing affectations, tone of voice, specific patterns in speech, and others. Moreover, the sensors can exist in separate devices associated with the same participant. For example, the participant's multimedia electronic device can monitor audio while a smart watch or other wearable device can measure motion; ¶0065, Behavioral characteristics can include, among other things, typing rhythm, gait, vocal fluctuations, hand gestures, and/or physical mannerisms);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Jun to incorporate the teachings of Vendrow, and apply the user's reactions or gestures to a video stream, as taught by Jun, for using a camera, detecting a gesture during a video stream, wherein the gesture is separate from the video stream.
Doing so would provide an enhanced method for dynamically changing a conference graphical user interface.
The prior art fails to explicitly disclose, but Friend discloses
generating a digital drawing that corresponds to the gesture (Friend- ¶0036, the gestures may control anything display on a screen […] For example, consider that display 222 shows a virtual chalk board or dry erase board, the user's gestures may be recognized to draw or write letters to the screen [digital drawing that corresponds to the gesture] […] The user may gesture to add a bullet to a word document; ¶0039, the movement of the user's finger may simulate the movement of a laser pointer, and the system may recognize the gesture and display a spot of light on the screen that corresponds to the finger movement) and storing the digital drawing in a database (Friend- ¶0036, the gestures may control anything display on a screen […] For example, consider that display 222 shows a virtual chalk board or dry erase board, the user's gestures may be recognized to draw or write letters to the screen [digital drawing]; ¶0068-0069, the system may store information representative of motions of a tracked user, stored in a motion capture file […] a multimedia response library may include options for representing a user's gestures as recognized by the system; Fig. 4 and ¶0133, the computing environment 212 may include a gestures library 192 [database]; ¶0139, the computing environment 212 may use the gestures library 192 along with a gesture profile 205 such as that shown in FIG. 4 to interpret movements of the skeletal model and to control an application based on the movements); 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Jun/Vendrow to incorporate the teachings of Friend, and apply Friend's recognition of the user's gestures to draw or write letters to the screen to the image that corresponds to the gesture, as taught by Jun/Vendrow, for generating a digital drawing that corresponds to the gesture and storing the digital drawing in a database as a gesture layer.
Doing so would provide immersion of the users of the system, including both presenters and observers, and a virtual relationship between users that is more interactive than a simple display of the information.

Regarding claim 2, Jun in view of Vendrow and Friend, discloses the method of claim 1, and further teaches wherein detecting the gesture comprises detecting a gesture trigger using a trained machine learning model (Jun- ¶0006, performing gesture recognition on the M images by using a deep learning algorithm, to obtain a gesture recognition result corresponding to the first video segment; ¶0098, a gesture operation that the user expects to perform is to raise a hand up. The user performs the gesture operation of raising the hand up [a gesture trigger] in 1s, while the user does not raise the hand up in a quite short period of time (such as 0.2s) within the 1s, but slightly presses the hand down, and the user continues to raise the hand up after the quite short period of time).

Regarding claim 3, Jun in view of Vendrow and Friend, discloses the method of claim 1, and discloses the method further comprising:
detecting a gesture completion using a trained machine learning model (Jun- ¶0035, a complete gesture action is divided into a plurality of phase actions. The phase actions are recognized by using the deep learning algorithm, and finally the recognized phase actions are combined as the complete gesture action).

Regarding claim 4, Jun in view of Vendrow and Friend, discloses the method of claim 1, and further discloses wherein generating the digital drawing is responsive to detecting a gesture completion (Jun- ¶0035, a complete gesture action is divided into a plurality of phase actions. The phase actions are recognized by using the deep learning algorithm, and finally the recognized phase actions are combined as the complete gesture action [detecting a gesture completion]; Friend- ¶0036, the gestures may control anything display on a screen […] For example, consider that display 222 shows a virtual chalk board or dry erase board, the user's gestures may be recognized to draw or write letters to the screen [generating the digital drawing]).

The same motivation that was utilized in the rejection of claim 1 applies equally to this claim.

Regarding claim 6, Jun in view of Vendrow and Friend, discloses the method of claim 1, and discloses the method further comprising:
using the computing device, causing the gesture visualization to be displayed in a display associated with the computing device (Jun- ¶0095, when performing result combination on the gesture recognition results of the N consecutive video segments, the gesture recognition device may input the gesture recognition results of the N consecutive video segments into a pre-trained first machine learning model, to obtain the combined gesture recognition result. The first machine learning model is used to determine an overall gesture motion trend including the input N consecutive gesture recognition results, and to output a gesture corresponding to the overall gesture motion trend as the combined gesture recognition result [causing the gesture visualization to be displayed in a display]; ¶0118, the gesture recognition device 70 may further include an output device 75 […] the output device 75 may be a display configured to display information […] The output device 75 may further include an output controller, to provide output for the display).

Regarding claim 7, Jun in view of Vendrow and Friend, discloses the method of claim 1, and further discloses wherein the gesture comprises movement from a human or an instrument (Friend- ¶0031, the system may identify the user's hand gesture for movement of a tab, where the user's hand in the physical space is virtually aligned with a tab in the application).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Jun/Vendrow to incorporate the teachings of Friend, and apply Friend's recognition of the movement of the user's gestures to draw or write letters to the screen to the gesture, as taught by Jun/Vendrow, so the gesture comprises movement from a human or an instrument.
The same motivation that was utilized in the rejection of claim 1 applies equally to this claim.

Regarding claims 8-11 and 13-14, all claim limitations are set forth as in claims 1-4 and 6-7, in a non-transitory computer-readable medium storing a set of instructions, and are rejected as per the discussion for claims 1-4 and 6-7.

Regarding claim 8, Jun in view of Vendrow and Friend, discloses a non-transitory computer-readable medium storing a set of instructions that, when executed by a processor (Jun- ¶0120-0121, the methods, computer readable media, and systems of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, where, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the subject matter. In the case of program code execution on programmable computers, the computing environment may generally include a processor, a […]), cause the processor to perform the method of claim 1.

The systems of claims 15-18 and 20 are similar in scope to the functions performed by the methods of claims 1-4 and 6, and therefore claims 15-18 and 20 are rejected under the same rationale.

Regarding claim 15, Jun in view of Vendrow and Friend, discloses a system for improving digital capturing of gestures (Jun- Fig. 7 is a schematic structural diagram of a gesture recognition device; ¶0009, a final gesture recognition result may be obtained based on a gesture motion trend indicated by gesture recognition results of a plurality of consecutive video segments, to eliminate impact exerted on the finally obtained gesture recognition result by an erroneous gesture performed by the user in the short period of time, thereby improving gesture recognition accuracy), the system comprising:
a processor; a memory operatively connected to the processor and storing instructions that, when executed by the processor (Jun- Fig. 7 and ¶0108-0110, the gesture recognition device 70 may include a processor 71 and a memory 73 […] The memory 73 may be configured to store a software program, and the software program may be executed by the processor 71; ¶0120-0121, the methods, computer readable media, and systems of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, where, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the subject matter. In the case of program code execution on programmable computers, the computing environment may generally include a processor, a […]), cause the processor to perform the method of claim 1.


6.	Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Jun in view of Vendrow, further in view of Friend, further in view of Minnen, (“Minnen”) [US-2012/0268364-A1], still further in view of Zhao et al., (“Zhao”) [US-2016/0154469-A1]
Regarding claim 5, Jun in view of Vendrow and Friend, discloses the method of claim 1, and though the prior art discloses generating a first frame and a second frame of the video stream (Jun- ¶0067, the gesture recognition device may find a correspondence between a previous image and a current image by using a change of a pixel in an image sequence in time domain and a correlation between adjacent frames [a first frame and a second frame], to obtain motion information of an object between the two images through calculation) and generating the digital drawing (Friend- ¶0036, the user's gestures may be recognized to draw or write letters to the screen), the prior art fails to explicitly teach, but Minnen teaches the method further comprising:
generating a first grid for a first frame of the video stream featuring the gesture (Minnen- Fig. 2 shows grids with the markings for the frames of the video stream; ¶0046, to detect fingertips directly from a single video frame […] Each video frame 104 is converted into a binary image with the goal of marking foreground pixels, i.e., those pixels that correspond to the fingers, hand, and arm, with a value of one, while all other pixels are marked with a zero; ¶0096-0097 teach the markers on the tags in one embodiment are affixed at a subset of regular grid locations […] a number of tags 201A-201E (left hand) and 202A-202E (right hand) are shown. Each tag is rectangular and consists in this embodiment of a 5×7 grid array […] Markers (represented by the black dots of FIG. 7) are disposed at certain points in the grid array to provide information);
generating a first mark on the first grid, wherein the first mark represents a first placement of the gesture for the first frame (Minnen- Fig. 2 shows grids with the markings for the frames of the video stream; ¶0092, the use of marker tags on one or more fingers of the user so that the system can locate the hands of the user, identify whether it is viewing a left or right hand, and which fingers are visible. This permits the system to detect the location, orientation, and movement of the user's hands; ¶0096-0097, the markers on the tags in one embodiment are affixed at a subset of regular grid locations […] a number of tags 201A-201E (left hand) and 202A-202E (right hand) are shown. Each tag is rectangular and consists in this embodiment of a 5×7 grid array […] Markers (represented by the black dots of FIG. 7) are disposed at certain points in the grid array to provide information);
generating a second grid for a second frame of the video stream featuring the gesture (Minnen- Fig. 2 shows grids with the markings for the frames of the video stream; ¶0046, to detect fingertips directly from a single video frame […] Each video frame 104 is converted into a binary image with the goal of marking foreground pixels, i.e., those pixels that correspond to the fingers, hand, and arm, with a value of one, while all other pixels are marked with a zero; ¶0096-0097, the markers on the tags in one embodiment are affixed at a subset of regular grid locations […] a number of tags 201A-201E (left hand) and 202A-202E (right hand) are shown. Each tag is rectangular and consists in this embodiment of a 5×7 grid array […] Markers (represented by the black dots of FIG. 7) are disposed at certain points in the grid array to provide information);
generating a second mark on the second grid, wherein the second mark represents a second placement of the gesture for the second frame (Minnen- Fig. 2 shows grids with the markings for the frames of the video stream; ¶0092 teaches the use of marker tags on one or more fingers of the user so that the system can locate the hands of the user, identify whether it is viewing a left or right hand, and which fingers are visible. This permits the system to detect the location, orientation, and movement of the user's hands; ¶0096-0097, the markers on the tags in one embodiment are affixed at a subset of regular grid locations […] a number of tags 201A-201E (left hand) and 202A-202E (right hand) are shown. Each tag is rectangular and consists in this embodiment of a 5×7 grid array […] Markers (represented by the black dots of FIG. 7) are disposed at certain points in the grid array to provide information); 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Jun/Vendrow/Friend to incorporate the teachings of Minnen, and apply Minnen's markers disposed at certain points in the grid array to the generated first frame and second frame of the video stream, as taught by Jun/Vendrow/Friend, for generating a first grid and a second grid for the respective frames; generating a first mark on the first grid, wherein the first mark represents a first placement of the gesture for the first frame; and generating a second mark on the second grid, wherein the second mark represents a second placement of the gesture for the second frame.
Doing so would provide an enhanced fast fingertip detection with robust local hand tracking, continuous hand detection and track initialization to support multiple targets and to recover from errors.
The prior art fails to explicitly disclose wherein generating the digital drawing comprises connecting the first mark to the second mark.
However, Zhao discloses
connecting the first mark to the second mark (Zhao- ¶0017, ¶0036, ¶0077, by using the position of the fingertip in a second frame of the gesture images as a start point, each time the position of the fingertip in a frame of the gesture images is acquired, connecting a point of the position of the fingertip and a point of the position of the fingertip in a previous frame of the gesture images [connecting the first mark to the second mark]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Jun/Vendrow/Friend/Minnen to incorporate the teachings of Zhao, and apply connecting a point of the position of the fingertip and a point of the position of the fingertip in a previous frame of the gesture images to generating the digital drawing, as taught by Jun/Vendrow/Friend/Minnen, so that generating the digital drawing comprises connecting the first mark to the second mark.
Doing so would improve a gesture recognition speed, and reduce a gesture recognition delay.

Regarding claim 12, all claim limitations are set forth as claim 5, in a non-transitory computer-readable medium storing a set of instructions and rejected as per discussion for claim 5.

The system of claim 19 is similar in scope to the functions performed by the method of claim 5 and therefore claim 19 is rejected under the same rationale.

Response to Arguments
6.	Applicant's arguments, filed January 07, 2021, with respect to pending claims 1-20 and amended claims 1, 8, and 15, have been fully considered but they are moot in view of the new ground(s) of rejection.
In light of the current Office Action, the Examiner respectfully submits that the combination of Jun, Vendrow et al. (US-2017/0228135-A1), and Friend discloses or suggests all the elements of independent claims 1, 8, and 15. All of the elements exist in the prior art, and Jun in view of Vendrow and Friend renders the invention as a whole obvious. Furthermore, there must be some suggestion or teaching in the art that would motivate one of ordinary skill in the art to arrive at the claimed invention.
7.	On page 6 of Applicant's Remarks, the Applicant argues that the dependent claims are not taught by the prior art, insomuch as they depend from claims that are not taught by the prior art. Examiner respectfully disagrees with these arguments, for the reasons discussed above.


Conclusion
8.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
9.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL LE whose telephone number is (571)272-5330.  The examiner can normally be reached on 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Zimmerman can be reached on (571) 272-7653.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR 



/MICHAEL LE/Primary Examiner, Art Unit 2619