DETAILED ACTION
This office action is in response to the initial filing of Application no. 16/586892 on 09/27/2019.
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.

Claim Objections
Claim 19 is objected to because of the following informalities: two claims are numbered 19. Appropriate correction is required.

Allowable Subject Matter
Aside from the non-prior art rejections, the prior art fails to teach or suggest in reasonable combination the limitations recited in claims 26 and 27.
Claims 28 - 30 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims since the prior art fails to teach or suggest in reasonable combination the limitations recited in claims 28 – 30.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1 – 27 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 51 of U.S. Patent No. 10,448,161. Although the claims at issue are not identical, they are not patentably distinct from each other.

The claim mapping is as follows.
Current Application
1. A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images, within a sound field produced by an array of loudspeakers; apply the recognized movements of the user to determine an indicated change to the sound field; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change.

2. The device of claim 1, wherein the recognized movements of the user represent a gesture, where the gesture is interpreted as one of a plurality of separate patterns, and decisions by the one or more processors are made to synthesize the corresponding sound field associated with the separate patterns.



4. The device of claim 3, wherein the regions of interest include the user's eyes.

5. The device of claim 3, wherein the regions of interest include the user's hands.

6. The device of claim 3, wherein the regions of interest include the user's mouth.

7. The device of claim 3, wherein the regions of interest include the user's body.

8. The device of claim 3, further comprising one or more processors configured to identify orientations of the one or more features or changes over time within the sequence of images representing a trajectory of a hand, a trajectory of each hand, a rotation of a head, or a tilt of a head.

9. The device of claim 1, wherein the one or more processors are configured to control the array of loudspeakers to generate beams in different directions to support gesture control independently for different users located in the different directions.

10. The device of claim 1, wherein a voice command is used to enter a gesture control mode.

11. The device of claim 1, wherein the analyzed recognized movements of the user include face recognition, voice recognition, or both face recognition and voice recognition for user identification or user location.

12. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is mapped to a command, and a command interpreter integrated into the one or more processors is configured to disable changes to a current sound field configuration, or enable changes to the current sound field configuration.

13. The device of claim 1, further comprising an infrared light detector configured to capture the sequence of images based on a laser distance measurement.

14. The device of claim 1, wherein the one or more processors are configured to recognize the movement based on depth information.



16. The device of claim 14, further comprising a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user.

17. The device of claim 16, further comprising a laser to emit light and a diffraction grating to impose the pattern on the emitted light, and an image detector to capture an image of an illuminated part of the user.

18. The device of claim 1, wherein the recognize movements of the user are based on an array of ultrasound transducers used to perform spatial imaging.

19. The device of claim 1, further comprising an on-screen display to provide feedback for a gesture command, wherein the feedback is a bar or a dial to display a change in beam intensity, beam direction, or dynamic range.

19. The device of claim 1, further comprising a display to provide feedback for a gesture command. wherein the feedback is a bar or a dial.

20. The device of claim 19, wherein the on-screen display is configured to display the feedback for the gesture command on a bar on the screen or dial on the screen to represent a change in beam intensity, beam direction, or dynamic range.

21. The device of claim 19, wherein the on-screen display is configured to display the an error indication of an invalid gesture command.

22. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand and body gesture, and hand to ear gesture.

23. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: a clockwise hand movement, a counterclockwise hand movement, and a hand rotation, hand grasping, and hand releasing.

24. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a 

25. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a volume of the modified sound field, or control a volume of a beam in the modified sound field.

26. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a beam direction of the modified sound field, or change a beam width of the modified sound field, or change in an echo depth in time of the modified sound field, or change in dynamic range expansion or compression of the modified sound field.

27. The device of claim 1, wherein the recognized movements of the user represent a gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to create or 



U.S. Patent No. 10,448,161
1. A method of signal processing, said method comprising: producing a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; producing a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; applying the spatially directive filter to a multichannel signal; and driving an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

2. The method of signal processing according to claim 1, wherein said producing the command comprises selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to increase a width of the beam and (B) a command to decrease the width of the beam.

3. A method of signal processing according to claim 1, wherein said producing the command comprises selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

4. A method of signal processing according to claim 1, wherein said producing the filter configuration is based on an indication of at least one among a current direction of the beam and a current width of the beam.

5. A method of signal processing according to claim 1, wherein said producing the filter configuration is based on an indication of a current location of a user.

6. A method of signal processing according to claim 1, wherein said producing the filter configuration comprises selecting the filter configuration, according to said command, from among a plurality of filter configurations.



8. A method of signal processing according to claim 1, wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within the sound field.

9. The method of signal processing according to claim 8, wherein said sequence of images includes images of a light pattern projected on a hand of the user.

10. A method of signal processing according to claim 1, wherein the hand gesture includes a lateral movement of a hand of a user.

11. A method of signal processing according to claim 1, wherein the hand gesture includes a grasping motion of a hand of a user.

12. A method of signal processing according to claim 1, wherein the hand gesture includes movement of two hands of a user toward each other.



14. A method of signal processing according to claim 1, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

15. An apparatus for signal processing, said apparatus comprising: means for producing a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; means for producing a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; means for applying the spatially directive filter to a multichannel signal; and means for driving an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

16. The apparatus for signal processing according to claim 15, wherein said means for producing the command comprises means for selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a 

17. An apparatus for signal processing according to claim 15, wherein said means for producing the command comprises means for selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

18. An apparatus for signal processing according to claim 15, wherein said means for producing the filter configuration is configured to produce the filter configuration based on an indication of at least one among a current direction of the beam and a current width of the beam.

19. An apparatus for signal processing according to claim 15, wherein said means for producing the filter configuration is configured to produce the filter configuration based on an indication of a current location of a user.



21. The apparatus for signal processing according to claim 20, wherein a first filter configuration among the plurality of filter configurations describes a different phase relation among output channels of the spatially directive filter than a second filter configuration among the plurality of filter configurations.

22. An apparatus for signal processing according to claim 15, wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within a sound field.

23. The apparatus for signal processing according to claim 22, wherein said sequence of images includes images of a light pattern projected on a hand of the user.

24. An apparatus for signal processing according to claim 15, wherein the hand gesture includes a lateral movement of a hand of a user.



26. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user toward each other.

27. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user away from each other.

28. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

29. An apparatus for signal processing, said apparatus comprising: a gesture interpreter configured to produce a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; a command interpreter configured to produce a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; a synthesizer configured to apply the spatially directive filter to a multichannel signal; and an audio output stage 

30. The apparatus according to claim 29, wherein said gesture interpreter is configured to produce the command by selecting the command, based on information from said representation, from among a plurality of commands that includes a command to increase a width of the beam and a command to decrease the width of the beam.

31. An apparatus according to claim 29, wherein said gesture interpreter is configured to produce the command by selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

32. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration based on an indication of at least one among a current direction of the beam and a current width of the beam.

33. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration based on an indication of a current location of a user.

34. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration by selecting the filter configuration, according to said command, from among a plurality of filter configurations.

35. The apparatus according to claim 34, wherein a first filter configuration among the plurality of filter configurations describes a different phase relation among output channels of the synthesizer than a second filter configuration among the plurality of filter configurations.

36. An apparatus according to claim 29, wherein said audio output stage is configured to drive the directionally controllable transducers to produce a sound field that includes the beam, and wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within the sound field.



38. An apparatus according to claim 29, wherein the hand gesture includes a lateral movement of a hand of a user.

39. An apparatus according to claim 29, wherein the hand gesture includes a grasping motion of a hand of a user.

40. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user toward each other.

41. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user away from each other.

42. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

43. A non-transitory computer-readable medium having tangible features that cause a machine reading the features to: produce a command in response to a representation of a hand gesture of a user, wherein the 

44. The method according to claim 1, wherein the applying the spatially directive filter is based on the filter configuration.

45. The apparatus according to claim 15, wherein the means for applying the spatially directive filter is based on the filter configuration.

46. The apparatus according to claim 29, wherein the applying the spatially directive filter is based on the filter configuration.

47. The computer-readable medium according to claim 43, wherein the applying the spatially directive filter is based on the filter configuration.



49. The apparatus of claim 29, further comprising a camera configured to detect the representation of the hand gesture.

50. The apparatus of claim 29, wherein the gesture interpreter and the synthesizer are integrated in a processor.

51. The apparatus of claim 29, further comprising the directionally controllable transducers.

52. The apparatus of claim 29, wherein the hand gesture indicates a change to a width of the beam.

53. The apparatus of claim 29, wherein the hand gesture indicates an amplification of the sound field.

54. The apparatus of claim 29, wherein the hand gesture indicates an attenuation of the sound field.


As shown, the limitations recited in claims 1 – 51 of US 10,448,161, either alone or in combination with the prior art discussed below, further recite the limitations of claims 1 – 27 of the currently pending application.

Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1, 2, 10 – 12, 23 and 25 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”) and further in view of Tsurumi (US 2012/0114137).
For claim 1, Swan discloses a device (Abstract) comprising: a memory (Fig.1, 24); and one or more processors coupled to the memory (Fig.1, 22 and 24; column 2 lines 49 – column 3 line 2), the one or more processors configured to: recognize at least one gesture (gesture including thumb up or thumb down, column 3 lines 15 – 22) of a user within sound produced by a speaker (user provides a thumbs up/down gesture, which is a command to increase or decrease the volume, column 5 lines 30 – 44 and 55 – 64); apply the recognized gesture of the user to determine an indicated change to the sound (column 4 lines 43 – 54; column 5 lines 7 – 23; column 6 lines 29 – 50); and synthesize a modified sound produced by the speaker to implement the indicated change (column 4 lines 61 – column 5 lines 5, 23 – 30 and 64 – 67). Yet, Swan fails to teach the following: the memory stores a sequence of images; the gesture which indicates the change to the sound is a movement of the user represented in the sequence of images; and the sound is a sound field produced by an array of loudspeakers.

Additionally, Chang discloses a vision-based interface for an integrated home entertainment system (Abstract) wherein a gesture comprising a movement of a user indicates a change in sound (2.2 Instruction mode, 3 Vision Based Interface, pg. 178 – 180).
Furthermore, Tsurumi discloses an acoustic control apparatus (Abstract) comprising a memory (storage section, Fig.6, 119 and Fig.20, 919) configured to store a sequence of images ([0061] [0077] [0191] [0196]); and one or more processors coupled to the memory (general control section, image processing section and acoustic control section, Fig.6, 101, 107, 115, 119), the one or more processors configured to recognize at least one movement of a user (gesture including waving) represented in the sequence of images ([0079] [0080] [0088] [0090]) and synthesize a sound field produced by an array of loudspeakers (Speaker A – Speaker D, Figure 1; [0040] [0041]) ([0050 – 0053] [0072] [0073] [0186 – 0188]), wherein the array of loudspeakers is a part of a home entertainment device ([0040] [0041]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve Swan’s device in the same way that Chang’s and Tsurumi’s devices have been improved to achieve the following predictable results for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition (Swan, column 1 lines 10 – 64) (Tsurumi, [0001 – 0006] [0040] [0041]): the memory further stores a sequence of images; the gesture which indicates the change to the sound is

For claim 2, Swan and Tsurumi further disclose, wherein the recognized movements of the user represent a gesture (Swan, column 3 lines 15 – 25) (Tsurumi, [0088]), where the gesture is interpreted as one of a plurality of separate patterns (Swan, thumbs up, thumbs down, waving hand, moving the head, etc. column 3 lines 15 – 25; column 4 lines 48 – 60 and column 6 lines 29 – 59), and decisions by the one or more processors are made to synthesize the corresponding sound field associated with the separate patterns (Swan, column 4 lines 48 – column 5 lines 6, 10 – 26) (Tsurumi, sound produced by an array of surround speakers, [0040] [0041] [0072] [0073]).
For claim 10, Swan further discloses wherein a voice command (acoustic initiation command) is used to enter a gesture control mode (acoustic initiation command is followed by a gesture function command, Swan, column 3 lines 3 – 14 and 27 - 40; column 5 lines 7 – 28). 
For claim 11, Tsurumi further discloses wherein the analyzed recognized movements of the user include face recognition (face detection processing), voice recognition, or both face recognition and voice recognition for user identification or user location (Tsurumi, [0079] [0081 – 0083] [0123]).
	For claim 12, Swan and Tsurumi further disclose, wherein the recognized movements of the user represent a gesture (Swan, column 3 lines 15 – 25), wherein the gesture is mapped to a command (Swan, column 4 lines 48 – 60 and column 6 lines 29 – 59), and a command interpreter (Swan, gesture interpretation module 52) integrated into the one or more processors (Swan, column 2 lines 49 – 56 and column 3 lines 55 – 67) is configured to disable changes to a 

	For claim 23, Swan further discloses, wherein the recognized movements of the user represent a gesture, wherein the gestures are at least one among the following gestures: a clockwise hand movement, a counterclockwise hand movement, a hand rotation (Swan, waving, column 3 lines 18 – 23), a hand grasping, and a hand releasing.
For claim 25, Swan and Tsurumi further disclose, wherein the recognized movements of the user represent a gesture (Swan, column 3 lines 15 – 25), wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a volume of the modified sound field (Swan, column 4 lines 48 – column 5 lines 6, 10 – 26) (Tsurumi, sound produced by an array of surround speakers, [0040] [0041] [0072] [0073]), or control a volume of a beam in the modified sound field.

Claims 3 – 8 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137) and further in view of Gunes et al. (“Automatic Visual Recognition of Face and Body Action Units”).
For claim 3, the combination of Swan, Tsurumi and Chang further discloses one or more processors configured to detect and locate regions of interest (Swan, hands as region of interest 
However, Gunes discloses a method for recognizing face and body action units (Abstract), wherein features are extracted to detect and locate regions of interest in a face and body (eyes, mouth, hands, shoulders) (1.Introduction, 3.1.1. Feature extraction, 3.2. Body feature detection, 3.2.2. Segmentation and tracking of the body part, 3.2.3. Location shoulders).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to apply the feature extraction techniques to the region of interest detection method disclosed by the combination of Swan, Tsurumi and Chang to achieve the predictable results of extracting features to detect and locate the regions of interest for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition (Swan, column 1 lines 10 – 64) (Tsurumi, [0001 – 0006] [0040] [0041]).

For claim 4, Tsurumi and Gunes further disclose, wherein the regions of interest include the user's eyes (Tsurumi, [0081] [0096]) (Gunes, 1.Introduction, 3.1.1. Feature extraction). 
	For claim 5, Swan and Gunes further disclose wherein the regions of interest include the user's hands (Swan, column 3 lines 15 – 25 and column 6 lines 29 – 60) (Gunes, 3.2. Body feature detection, 3.2.2. Segmentation and tracking of the body part).
	For claim 6, Tsurumi and Gunes further disclose, wherein the regions of interest include the user's mouth (Tsurumi, [0081] [0096]) (Gunes, 1.Introduction, 3.1.1. Feature extraction).

For claim 8, Swan and Gunes further disclose one or more processors configured to identify orientations of the one or more features or changes over time within the sequence of images representing a trajectory of a hand (Swan, hand waving, column 3 lines 15 – 25 and column 6 lines 29 – 60) (Gunes, 3.2.2. Segmentation and tracking of body parts and 3.2.5 Hand pose and orientation), a trajectory of each hand, a rotation of a head, or a tilt of a head.

Claim 9 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137) and further in view of Hardracker et al. (US 2009/0304205) (“Hardracker”).
For claim 9, the combination of Swan, Chang and Tsurumi fails to teach, wherein the one or more processors are configured to control the array of loudspeakers to generate beams in different directions to support gesture control independently for different users located in different directions.
However, Hardracker discloses techniques for personalizing audio (Abstract), wherein an array of loudspeakers generates beams in different directions to support gesture control independently for different users located in different directions (Fig.1 and Fig.2; [0010] [0011] [0018] [0019] [0021]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Swan, Chang and Tsurumi with .

Claim 13 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”) and further in view of Tsurumi (US 2012/0114137), and further in view of Haker et al. (US 2012/0120073) (“Haker”) and further in view of Dondi (“Gesture Recognition by Data Fusion of Time-of-Flight and Color Cameras”).
For claim 13, the combination of Swan, Tsurumi and Chang fails to teach an infrared light detector configured to capture the sequence of images based on a laser distance measurement.
However, Haker discloses a method for capturing images (Abstract), wherein individual images of an image sequence are recorded by a time of flight camera ([0017] [0029] [0030]).
Moreover, Dondi discloses a method for recognizing gestures (Abstract), wherein an infrared light detector (time-of-flight camera) captures the sequence of images based on a laser distance measurement (time of flight) (II. Cameras Specifications and A. Gestures Recognition, pg. 1954, 1955 and 1957).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Tsurumi, Swan and Chang in the same way that the inventions disclosed by Haker and Dondi have been improved to achieve the predictable results of incorporating an infrared light detector (time-of-.

Claims 14 – 16 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137) and further in view of Thissen (“A comparative study of optical depth sensors for user interaction”).
For claim 14, the combination of Swan, Tsurumi and Chang fails to teach wherein the one or more processors are configured to recognize the movement based on depth information.
However, Thissen discloses optical depth sensors for detecting images to perform gesture control (Abstract), wherein movement is recognized based on depth information generated by the optical depth sensors (stereo, structured light, time-of-flight) (2. Optical Sensors, 2.1 Triangulation, 2.2 Time-of-Flight and 4.1 Finger based gesture control, pg. 5 – 15 and 51 – 57).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to apply the optical depth sensors disclosed by Thissen to the cameras disclosed by the combination of Swan, Tsurumi and Chang (Swan, Figure 1, 20) to achieve the predictable results of the movement recognized by the processors being further based on depth information, thereby providing a cost-effective and enhanced method for the purpose of increasing the flexibility in providing input commands to control sound production from 

For claim 15, Swan and Thissen further disclose two or more cameras configured to generate the depth information (Swan, Fig.1, 20) (Thissen, 2.1 Triangulation and Stereo, pg. 5 – 9).
For claim 16, Swan and Thissen further disclose a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user (Swan, Fig.1, 20) (Thissen, Figure 8, Structured light and 4.1 Finger based gesture control, pg. 10, 11, 51 – 57).

Claim 17 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137), and further in view of Thissen (“A comparative study of optical depth sensors for user interaction”) and further in view of Pearse (“Gestural Mappings-Towards the Creation of Three Dimensional Composition Environment”).
For claim 17, the combination of Swan, Tsurumi, Chang and Thissen further discloses that the Kinect which comprises the projector further comprises an image detector to capture an image of an illuminated part of the user (IR camera/imaging unit, Figure 8 and Structured Light; pg. 9 and 10), yet fails to teach a laser to emit light and a diffraction grating to impose the pattern on the emitted light.

Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify the teachings of Swan, Tsurumi, Chang and Thissen with Pearse’s teachings so that the Kinect further comprises a laser to emit light and a diffraction grating to impose the pattern on the emitted light for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition (Swan, column 1 lines 10 – 64) (Tsurumi, [0001 – 0006] [0040] [0041]) (Thissen, Abstract and 1. Introduction, pg. 3).

Claim 18 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137) and further in view of Dahl et al. (US 2011/0254762) (“Dahl”).
For claim 18, the combination of Swan, Chang and Tsurumi fails to teach wherein the recognized movements of the user are based on an array of ultrasound transducers used to perform spatial imaging.
However, Dahl discloses a method for detecting movement of objects including hands and fingers (Abstract; [0002] [0028]) using an array of ultrasound transducers to perform spatial imaging ([0003] [0009] [0013] [0036] [0037]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Swan, Chang and 

Claims 19 and 20 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137), and further in view of Machida (US 2006/0140420) and further in view of Keenan (US 2009/0102800).
For claims 19 and 20, the combination of Swan, Chang and Tsurumi further discloses an on-screen display to provide feedback for a gesture command (Swan, column 3 lines 40 – 50 and column 5 line 64 – column 6 line 6), yet fails to teach that the feedback is a bar or dial to display a change in beam intensity, beam direction or dynamic range.
However, Machida discloses an audio playback method (Abstract), wherein a sound field comprising surround sound can be delivered to a listener’s position using a sound beam produced by one panel loudspeaker array ([0001] [0002]).
Additionally, Keenan discloses an interactive input system controller and method (Abstract), wherein feedback for a gesture command includes a bar or dial to display a change in intensity (Fig.2, 48, [0035]).

Additionally, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Swan, Chang, Tsurumi and Machida with Keenan’s teachings so that the feedback is a bar or dial to display a change in beam intensity for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition (Swan, column 1 lines 10 – 64) (Tsurumi, [0001 – 0006] [0040] [0041]).

Claim 21 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137), and further in view of Machida (US 2006/0140420) and further in view of Keenan (US 2009/0102800), and further in view of Kim et al. (WO 2011/059202) (“Kim”).
For claim 21, the combination of Swan, Chang, Tsurumi, Machida and Keenan fails to teach wherein the on-screen display is configured to display an error indication of an invalid gesture command.
However, Kim discloses a method for controlling a display device (Abstract), wherein an on-screen display is configured to display an error indication of an invalid gesture (Fig.31, [0081 – 0083] [0150] [0151]).
.

Claims 22 and 24 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Swan et al. (US 6,351,222) (“Swan”) in view of Chang et al. (“Vision-Based Interface for Integrated Home Entertainment System”) (“Chang”), and further in view of Tsurumi (US 2012/0114137) and further in view of Reville et al. (US 2011/0289455) (“Reville”).
For claim 22, the combination of Swan, Chang and Tsurumi fails to teach wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand-and-body gesture, and hand-to-ear gesture.
However, Reville discloses a method for operating a user interface with gestures (Abstract), wherein gestures include two hands (Fig.16A- Fig.17B, [0158 - 0160]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Swan, Chang and Tsurumi with Reville’s teachings so that the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand and 

For claim 24, the combination of Swan, Chang and Tsurumi fails to teach wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images, a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other.
However, Reville discloses a method for operating a user interface with gestures (Abstract), wherein gestures include hands moving away from each other (Fig.17A and Fig.17B; [0158 - 0160]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Swan, Chang and Tsurumi with Reville’s teachings so that the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images, a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SONIA L GAY/Primary Examiner, Art Unit 2657