DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
2.	Acknowledgement is made of the amendment filed on March 23, 2022, in which claims 1, 3 and 7 are amended and claim 2 is canceled. Claims 1 and 3-8 remain pending.

Response to Arguments
3.	Applicant's arguments, filed on March 23, 2022, with respect to Claims 1 and 3-8 have been fully considered but they are not persuasive.  
4.	With regard to the arguments for independent claims 1 and 7, applicant argues that Wilson et al. (US 2015/0030236 A1) and Perlin (US 2012/0299934 A1) fail to disclose that, while the actor is performing the standby operation, an idle image sequence of a first image showing a first standby posture of the actor and a second image showing a second standby posture of the actor is repeatedly displayed. The examiner respectfully agrees; however, the argument is moot in view of the new grounds of rejection of claims 1 and 7, because Kim et al. (US 2015/0092981 A1) teaches this limitation (“what is recognized is what an action composed of a group of a series of postures taken by the human body (actor) is. Each action to be recognized is learned in advance on a problem-area basis, and the previously learned action model and a label of the relevant action may be stored in an action learning database.” [0053] “Gesture recognition: recognizing the meaning of gesture in a sequence image as a nonverbal communication shape (for example, a movement to raise and swing hands back and forth repeatedly means a calling gesture). … Posture recognition: recognizing a specific shape made of joints and bones in a still image (for example, a posture in which a golf club impacts a ball when performing a golf swing, a posture of raising both hands up).” [0039-0040] “it is checked whether the action of the human body is ended in block 506. By repeatedly performing the process of blocks 502 to 506, all depth images from when the human body starts to take an action until it ends to take an action are accumulated so that a 3-dimensional action volume may be generated.” [0074]). That is, Kim teaches that a series of postures taken by the actor can be performed repeatedly until the action ends. Therefore, Kim teaches the argued limitations of claims 1 and 7 as recited.

Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

6.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1.	Determining the scope and contents of the prior art.
2.	Ascertaining the differences between the prior art and the claims at issue.
3.	Resolving the level of ordinary skill in the pertinent art.
4.	Considering objective evidence present in the application indicating obviousness or nonobviousness.

7.	Claims 1, 3 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Wilson et al. (US 2015/0030236 A1) in view of Kim et al. (US 2015/0092981 A1) and Perlin (US 2012/0299934 A1).
8.	With reference to claim 1, Wilson teaches a method of providing interactive virtual reality content performed by a computing device, (“the example system 100 may be implemented by capturing interaction with the actor 132 via a depth camera such as a MICROSOFT KINECT camera. According to an example embodiment, a video of the gestures of the actor 132 may be recorded at a resolution of 640×480 pixels with depth information at 30 Hz. For example, the spatial object management engine 102 may be implemented via a computing device running WINDOWS 7 Ultimate, powered by an INTEL CORE2 Duo 2.13 GHz processor and 6 GB of random access memory (RAM).” [0161] “the predefined update 3-D item may include one or more of a 3-D inventory item, a 3-D gaming object, a 3-D real-world item, or a 3-D virtual reality environment object.” [0069]) Wilson also teaches displaying option items and an actor performing a standby operation, (“a display 128 may provide a visual, audio, and/or tactile medium for the user 124 (e.g., a store employee or system administrator) to monitor his/her input to and responses from the spatial object management engine 102.” [0031] “the integrated model generator 140, discussed above, may be configured to select portions of the received 3-D sensor data 106 for integration based on comparing the threshold time value 146 with values indicating lengths of time spent by the at least one 3-D moving object within a plurality of 3-D regions during the free-form movements, as discussed further below. … As shown in FIG. 5A, an actor 502 may mentally envision a 3-D object 504. For example, the 3-D object may include a three-legged stool that includes a seat 506 and angled legs 508. The actor 502 may indicate dimensions of the 3-D object 504, for example, by flattening and moving his/her hands to positions 510 indicating a distance separating the hands, to indicate a height, width, and/or depth associated with the 3-D object 504. As shown in FIG. 5A, the actor 502 may spatially describe, or use natural gestures to mime, the description of the 3-D object 504 in range of a sensing device 512 (e.g., an overhead depth camera). As shown in FIG. 5B, the actor 502 may flatten and move his/her hands in a rotating motion 514 to visualize the actor's mental image of the seat 506 of the stool (e.g., the 3-D object 504). As shown in FIG. 5C, the actor 502 may form fists and move his/her hands in angled vertical motions 516 to visualize the actor's mental image of the angled legs 508 of the stool (e.g., the 3-D object 504).” [0099-0101])

Wilson further teaches while the actor is performing the standby operation, receiving, from a user, a selection for one option item from the option items;
after the receiving of the selection, checking a first posture of the actor at a time point at which the selection is received, identifying a branched image according to the selection, and checking a second posture of the actor at a start time point of the branched image; (“plurality of sequential 3-D spatial representations that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements may be determined, based on the received 3-D spatial image data (304). For example, the spatial representation engine 136 that may determine the plurality of sequential 3-D spatial representations 138 that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements, based on the received 3-D spatial image data, as discussed above.” [0083] “the integrated model generator 140, discussed above, may be configured to select portions of the received 3-D sensor data 106 for integration based on comparing the threshold time value 146 with values indicating lengths of time spent by the at least one 3-D moving object within a plurality of 3-D regions during the free-form movements, as discussed further below. … As shown in FIG. 5A, the actor 502 may spatially describe, or use natural gestures to mime, the description of the 3-D object 504 in range of a sensing device 512 (e.g., an overhead depth camera). As shown in FIG. 5B, the actor 502 may flatten and move his/her hands in a rotating motion 514 to visualize the actor's mental image of the seat 506 of the stool (e.g., the 3-D object 504). As shown in FIG. 5C, the actor 502 may form fists and move his/her hands in angled vertical motions 516 to visualize the actor's mental image of the angled legs 508 of the stool (e.g., the 3-D object 504).” [0099-0101])
Wilson also teaches, according to the checked first posture of the actor, displaying the selected connection image so that the first posture of the actor at the time point at which the selection is received is smoothly connected to the second posture of the actor at the start time point of the branched image; and after the displaying the selected connection image, displaying the branched image. (“a display 128 may provide a visual, audio, and/or tactile medium for the user 124 (e.g., a store employee or system administrator) to monitor his/her input to and responses from the spatial object management engine 102.” [0031] “a predefined 3-D model 168a, 168b, 168c associated with a database object 170a, 170b, 170c that matches the integrated 3-D model 144, wherein the natural gesture motions may emulate an appearance of a predetermined three-dimensional (3-D) item. For example, the database objects 170a, 170b, 170c may be stored in association with a database 172. For example, the predefined models 168a, 168b, 168c may represent physical 3-D objects. According to an example embodiment, the volume determination engine 162 may be configured to determine a volume associated with one of the hands of the actor 132 based on tracking visible portions of the one of the hands over time, based on the received 3-D spatial image data 110.” [0049-0050] “plurality of sequential 3-D spatial representations that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements may be determined, based on the received 3-D spatial image data (304). For example, the spatial representation engine 136 that may determine the plurality of sequential 3-D spatial representations 138 that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements, based on the received 3-D spatial image data, as discussed above.” [0083] “a volume associated with one of the hands of the actor may be determined based on tracking visible portions of the one of the hands over time, based on the received 3-D spatial image data (308). For example, the volume determination engine 162 may determine the volume associated with the one of the hands of the actor 132 based on tracking visible portions of the one of the hands over time, based on the received 3-D spatial image data 110,” [0085] “For each moment in time, the actor's hands may leave a footprint in the virtual representation (e.g., the spatial representation 138 and/or the integrated model 144) whose position and orientation corresponds to those of the actor's hands in the real world. In other words, the orientation and posture of the hand at each instant in time may determine a volume of the component added to the virtual representation (i.e., a flat, tilted hand may make a flat, slanted small-sized impact on the virtual representation).” [0136] “an identity of an actor-described object may be determined based on an example database of matching candidate objects in voxel representation (e.g., the database 172). According to an example embodiment, data miming techniques, or gesturing techniques discussed herein may select the most closely matching object from the database. As discussed herein, for each candidate object, the generated 3-D model (e.g., the integrated model 144) may be aligned with the predefined database model (e.g., the predefined 3-D models 168a, 168b, 168c) for comparison and measurement of similarity.” [0140])
Wilson does not explicitly teach while the actor is performing the standby operation, an idle image sequence of a first image showing a first standby posture of the actor and a second image showing a second standby posture of the actor is repeatedly displayed; selecting one connection image from a first connection image showing the actor moving from the first standby posture to a second posture, and a second connection image showing the actor moving from the second standby posture to the second posture. Kim teaches the first of these limitations, namely while the actor is performing the standby operation, an idle image sequence of a first image showing a first standby posture of the actor and a second image showing a second standby posture of the actor is repeatedly displayed; (“what is recognized is what an action composed of a group of a series of postures taken by the human body (actor) is. Each action to be recognized is learned in advance on a problem-area basis, and the previously learned action model and a label of the relevant action may be stored in an action learning database.” [0053] “Gesture recognition: recognizing the meaning of gesture in a sequence image as a nonverbal communication shape (for example, a movement to raise and swing hands back and forth repeatedly means a calling gesture). … Posture recognition: recognizing a specific shape made of joints and bones in a still image (for example, a posture in which a golf club impacts a ball when performing a golf swing, a posture of raising both hands up).” [0039-0040] “it is checked whether the action of the human body is ended in block 506. By repeatedly performing the process of blocks 502 to 506, all depth images from when the human body starts to take an action until it ends to take an action are accumulated so that a 3-dimensional action volume may be generated.” [0074]) That is, Kim teaches that a series of postures taken by the actor can be performed repeatedly until the action ends.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Kim into Wilson, in order to provide an intelligent application service model of a higher level based on an explicit activity recognition.
The combination of Wilson and Kim does not explicitly teach selecting one connection image from a first connection image showing the actor moving from the first standby posture to a second posture, and a second connection image showing the actor moving from the second standby posture to the second posture. Perlin teaches this limitation (“In the sequence shown in FIG. 4, a walking virtual actor has been directed first to crouch backwards, and then to turn sideways with feet spread, before continuing on his way. To create this behavior, the artist simply specifies a keyframe (corresponding to the fourth image of FIG. 4) in which the actors pelvis should be lowered and rotated by 180, and then a keyframe (corresponding to the eighth image of FIG. 4) in which the actors feet should be spread and the pelvis rotated 90. In the course of executing this transition, the actor produces a trajectory for the feet and for all body joints (seen in images two and three, and images five through seven) that results in reasonable foot placement and no foot sliding.” [0096]).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Perlin into the combination of Wilson and Kim, in order to accommodate dependencies between computations for different parts of the actor.
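For illustration only, the connection-image limitation of claim 1 as read above may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the `Branch` type, the posture labels "first" and "second", and the clip file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Branch:
    """Hypothetical record for one selectable option item."""
    branched_image: str                # clip that starts in the "second posture"
    connection_images: Dict[str, str]  # standby posture label -> connection clip


def clips_after_selection(branches: Dict[str, Branch],
                          option: str,
                          standby_posture: str) -> List[str]:
    """Return the clips to play, in order, once an option item is selected."""
    branch = branches[option]
    # Pick the connection clip that starts from the posture the actor is in.
    connection = branch.connection_images[standby_posture]
    # The connection clip is shown first, then the branched clip.
    return [connection, branch.branched_image]


# Hypothetical data: one option item with two connection clips.
branches = {
    "greet": Branch(
        branched_image="greet.mp4",
        connection_images={
            "first": "first_standby_to_greet.mp4",
            "second": "second_standby_to_greet.mp4",
        },
    ),
}
playlist = clips_after_selection(branches, "greet", "second")
```

In this sketch the choice between the first and second connection image depends only on which standby posture the actor is found in when the selection is received, which is the selection step the rejection maps to Perlin.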
9.	With reference to claim 3, Wilson teaches the checking of the first posture of the actor includes checking the first posture of the actor by using a first time stamp when the actor in the idle image is in the first standby posture, a time interval between the first standby posture and the second standby posture, and a second time stamp at the time point at which the selection is received. (“plurality of sequential 3-D spatial representations that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements may be determined, based on the received 3-D spatial image data (304). For example, the spatial representation engine 136 that may determine the plurality of sequential 3-D spatial representations 138 that each include 3-D spatial map data corresponding to a 3-D posture and position of the at least one hand at sequential instances of time during the free-form movements, based on the received 3-D spatial image data, as discussed above. An integrated 3-D model may be generated, via the spatial object processor, based on incrementally integrating the 3-D spatial map data included in the determined sequential 3-D spatial representations and comparing a threshold time value with model time values indicating numbers of instances of time spent by the at least one hand occupying a plurality of 3-D spatial regions during the free-form movements (306). For example, the integrated model generator 140 may generate, via the spatial object processor 142, the integrated 3-D model 144 based on incrementally integrating the 3-D spatial map data included in the determined sequential 3-D spatial representations 138 and comparing the threshold time value 146 with model time values indicating numbers of instances of time spent by the at least one hand occupying a plurality of 3-D spatial regions during the free-form movements, as discussed above” [0083-0084] “the integrated model generator 140, discussed above, may be configured to select portions of the received 3-D sensor data 106 for integration based on comparing the threshold time value 146 with values indicating lengths of time spent by the at least one 3-D moving object within a plurality of 3-D regions during the free-form movements, as discussed further below. … As shown in FIG. 5A, the actor 502 may spatially describe, or use natural gestures to mime, the description of the 3-D object 504 in range of a sensing device 512 (e.g., an overhead depth camera). As shown in FIG. 5B, the actor 502 may flatten and move his/her hands in a rotating motion 514 to visualize the actor's mental image of the seat 506 of the stool (e.g., the 3-D object 504). As shown in FIG. 5C, the actor 502 may form fists and move his/her hands in angled vertical motions 516 to visualize the actor's mental image of the angled legs 508 of the stool (e.g., the 3-D object 504).” [0099-0101])
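For illustration only, the time stamp limitation of claim 3 may be sketched as follows. The sketch assumes, hypothetically, that the idle image loops from the first standby posture to the second and back with a period of twice the recited time interval; claim 3 itself does not state this, and the function name and posture labels are likewise assumptions.

```python
def nearest_standby_posture(first_time_stamp: float,
                            interval: float,
                            selection_time_stamp: float) -> str:
    """Estimate which standby posture the actor is closer to at selection time.

    first_time_stamp:     time at which the actor is in the first standby posture
    interval:             time between the first and second standby postures
    selection_time_stamp: time at which the selection is received
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Assumed loop: first -> second -> first, so one cycle lasts 2 * interval.
    elapsed = (selection_time_stamp - first_time_stamp) % (2 * interval)
    # Time distance to the nearest occurrence of each standby posture.
    to_first = min(elapsed, 2 * interval - elapsed)
    to_second = abs(elapsed - interval)
    # Ties are resolved in favor of the first standby posture (an arbitrary choice).
    return "first" if to_first <= to_second else "second"
```

For example, with a first time stamp of 0 and an interval of 2 seconds, a selection received at 2.2 seconds falls 0.2 seconds past the second standby posture, so the sketch reports "second".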
10.	Claim 7 is similar in scope to claim 1, and thus is rejected under a similar rationale. Wilson additionally teaches the device comprising: a memory; and a processor, (“As shown in FIGS. 1A-1B, a spatial object management engine 102 may include a sensor data receiving engine 104 that may be configured to receive sensor data 106. For example, the sensor data receiving engine 104 may receive sensor data 106 from one or more sensing devices. A memory 108 may be configured to store information including the sensor data 106.” [0029] “The spatial object management engine 102 may include an integrated model generator 140 that may be configured to generate, via a spatial object processor 142,” [0035])

Allowable Subject Matter
11.	Claims 4-6 and 8 are allowed.
The prior art of record, e.g., Wilson et al. (US 2015/0030236 A1), Kim et al. (US 2015/0092981 A1) and Perlin (US 2012/0299934 A1), alone or in combination, does not teach the claimed feature “the actor in the idle image moves from a first standby posture to a second standby posture and then returns to the first standby posture from the second standby posture, even when the posture of the actor in the idle image is more similar to the second standby posture than the first standby posture at the time point at which the selection is received, the actor in the idle image returns to the first standby posture”, and the examiner has not found a prior art reference teaching the cited limitation during prosecution of the application. This feature is therefore believed to be unique to the invention, and the claims reciting it are indicated as allowable.

Conclusion
12.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
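For illustration only, the timing rule in the preceding paragraph may be sketched as a date calculation. This is a simplified sketch under stated assumptions and is not a substitute for the rule itself: the function names are hypothetical, month arithmetic clamps to the last day of shorter months, and adjustments for weekends, holidays, and extension fees are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of shorter months."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_end(mailing_date: date,
                     first_reply: Optional[date] = None,
                     advisory_mailed: Optional[date] = None) -> date:
    """Date on which the shortened statutory period expires (simplified)."""
    three_months = add_months(mailing_date, 3)
    six_months = add_months(mailing_date, 6)
    end = three_months
    # A first reply within two months, followed by an advisory action mailed
    # after the three-month period, moves the expiry to the advisory mailing date.
    if first_reply is not None and advisory_mailed is not None:
        if first_reply <= add_months(mailing_date, 2) and advisory_mailed > three_months:
            end = advisory_mailed
    # The period never expires later than six months from the mailing date.
    return min(end, six_months)
```

With a hypothetical mailing date of January 10, 2022, a first reply on March 1, 2022, and an advisory action mailed on April 20, 2022, the sketch returns April 20, 2022; with no reply information it returns April 10, 2022.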
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michelle Chin whose telephone number is (571)270-3697.  The examiner can normally be reached Monday - Friday, 8:00 - 4:30 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached at (571)272-7667.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHELLE CHIN/
Primary Examiner, Art Unit 2619