DETAILED ACTION
This action is in response to the RCE filed on 11/19/2022.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 11/19/2022 has been entered.

Response to Amendment
Applicant’s amendment filed on 11/19/2022 has been entered. Claims 1, 12, 19 and 20 – 27 have been amended.  Duplicate claim 19 has been canceled. Claims 37 – 40 have been added. Claims 1 – 40 are still pending in this application, with claims 1 and 31 – 36 being independent.
Furthermore, a review of the claim language in claims 28 - 33 has been conducted. Claims 28 - 33 recite the following limitations, which are interpreted as intended results of the positively recited user movement recognition and synthesis steps: “for menu navigation”, “for user-interface feedback via sound” and “for user-interface feedback via a display icon”. The specification was consulted to determine whether the aforementioned limitations provide a patentable distinction, i.e., impose a functional limit on the user movement recognition and/or synthesis steps. Paragraph 101 of the specification recites the following:
[00101] Task TA10 may be implemented to perform gesture recognition by selecting one or more among a plurality of candidate gesture representations. Such gesture recognition may include classifying a gesture as the closest among a set of gesture candidates (e.g., according to the largest similarity measure), if a measure of the match (e.g., similarity measure) is above a threshold, which may be candidate-dependent. Such classification may be based on a hidden Markov model or other pattern recognition algorithm to recognize a gesture element from individual features within a scene or frame and/or to recognize a sequence of gesture elements over time. Additional applications may include compound gestures (e.g., a sequence of two or more gestures) for menu navigation and/or user-interface feedback (e.g., via a sound and/or display icon) in response to a gesture recognition.
	As shown above, the specification fails to discuss how recognizing the user movement for “user-interface feedback” functionally limits the user movement recognition. Moreover, the specification fails to discuss how synthesizing the sound field for “user-interface feedback” or “menu navigation” functionally limits the sound field synthesis. Additionally, the specification fails to provide any discussion of features unique to user movement recognition for “user-interface feedback” or to sound field synthesis for “user-interface feedback” or “menu navigation”. Therefore, the specification fails to support a broadest reasonable interpretation in which “user-interface feedback” or “menu navigation” provides a patentable distinction over processors in the prior art which are configured to recognize user movement and/or synthesize a modified sound field produced by the array of loudspeakers to implement an indicated change to the sound field.
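For illustration only, the classification step described in paragraph [00101] (selecting the closest gesture candidate by largest similarity measure and accepting it only if the measure exceeds a candidate-dependent threshold) can be sketched as follows. The sketch is not taken from the specification or from any cited reference: the function names, the use of cosine similarity as the similarity measure, the two-element feature vectors, and the threshold values are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Similarity measure assumed for this sketch; the specification does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_gesture(features, candidates):
    """Return the name of the closest candidate gesture, or None if no match.

    candidates maps a gesture name to (template_vector, threshold), so each
    candidate carries its own (candidate-dependent) acceptance threshold.
    """
    best_name, best_score = None, float("-inf")
    for name, (template, _threshold) in candidates.items():
        score = cosine_similarity(features, template)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        return None
    # Accept the closest candidate only if its similarity exceeds its own threshold.
    if best_score > candidates[best_name][1]:
        return best_name
    return None


# Hypothetical candidate set: (template direction vector, threshold).
CANDIDATES = {
    "swipe_left": ((-1.0, 0.0), 0.90),
    "swipe_right": ((1.0, 0.0), 0.90),
    "raise_hand": ((0.0, 1.0), 0.95),
}

print(classify_gesture((0.98, 0.05), CANDIDATES))
print(classify_gesture((0.6, 0.8), CANDIDATES))
```

In this sketch the classification result is the same regardless of whether a recognized gesture is later used for menu navigation or for user-interface feedback, which is consistent with the interpretation above that those recitations state an intended result rather than a functional limit on the recognition step.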

Allowable Subject Matter
Aside from the non-prior art rejections, the prior art fails to teach or suggest in reasonable combination the limitations recited in claims 26 and 35.
Claims 27, 38 and 39 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims in view of the determination that the prior art fails to teach or suggest in reasonable combination the limitations recited in these claims.
Claims 34 and 36 are allowed in view of the determination that the prior art fails to teach or suggest in reasonable combination the limitations recited in these claims.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1 – 18, 20 – 27 and 35 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 51 of U.S. Patent No. 10,448,161. Although the claims at issue are not identical, they are not patentably distinct from each other.

The claim mapping is as follows. The claims of the pending application are listed first, followed by the claims of US 10,448,161.
1. (Currently Amended) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user represented in the sequence of images, wherein the at least one recognized movement of the user is a gesture, wherein the gesture is mapped to a command, wherein the command is context-dependent; apply the at least one recognized movements of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change, wherein the indicated change is to change a beam direction of the modified sound field.

2. (Currently Amended) The device of claim 1, wherein the at least one recognized movements of the user represent a gesture, where the gesture is interpreted as one of a plurality of separate patterns, and decisions by the one or more processors are made to synthesize the corresponding sound field associated with the separate patterns.

3. (Original) The device of claim 1, further comprising one or more processors configured to extract one or more features used to detect and locate regions of interest.

4. (Original) The device of claim 3, wherein the regions of interest include the user's eyes.

5. (Original) The device of claim 3, wherein the regions of interest include the user's hands.

6. (Original) The device of claim 3, wherein the regions of interest include the user's mouth.

7. (Original) The device of claim 3, wherein the regions of interest include the user's body.

8. (Original) The device of claim 3, further comprising one or more processors configured to identify orientations of the one or more features or changes over time within the sequence of images representing a trajectory of a hand, a trajectory of each hand, a rotation of a head, or a tilt of a head.

9. (Original) The device of claim 1, wherein the one or more processors are configured to control the array of loudspeakers to generate beams in different directions to support gesture control independently for different users located in the different directions.

10. (Original) The device of claim 1, wherein a voice command is used to enter a gesture control mode.

11. (Previously Presented) The device of claim 1, wherein the at least one recognized movement of the user include face recognition, voice recognition, or both face recognition and voice recognition for user identification or user location.

12. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user represent a gesture, wherein the gesture is mapped to a command, and a command interpreter integrated into the one or more processors is configured to disable changes to a current sound field configuration[[;]] or enable changes to the current sound field configuration.

13. (Original) The device of claim 1, further comprising an infrared light detector configured to capture the sequence of images based on a laser distance measurement.

14. (Previously Presented) The device of claim 1, wherein the one or more processors are configured to recognize the at least one movement based on depth information.

15. (Original) The device of claim 14, further comprising two or more cameras configured to generate the depth information.

16. (Original) The device of claim 14, further comprising a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user.

17. (Original) The device of claim 16, further comprising a laser to emit light and a diffraction grating to impose the pattern on the emitted light, and an image detector to capture an image of an illuminated part of the user.

18. (Previously Presented) The device of claim 1, wherein the at least one recognized movement of the user are based on an array of ultrasound transducers used to perform spatial imaging.

19. (Currently Amended) The device of claim 1, further comprising an on-screen display to provide feedback for a the gesture command, wherein the feedback is a bar or a dial to display a change in beam intensity, beam direction, or dynamic range.

19. (Cancelled).

20. (Currently Amended) The device of claim 19, wherein the on-screen display is configured to display the feedback for the gesture command on a bar on the screen or dial on the screen to represent a change in beam intensity, beam direction, or dynamic range.

21. (Currently Amended) The device of claim 19, wherein the on-screen display is configured to display the an error indication of an invalid gesture command.

22. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a the gesture, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand and body gesture, and hand to ear gesture.

23. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a the gesture, wherein the gesture is at least one among the following gestures: a clockwise hand movement, a counterclockwise hand movement, and a hand rotation, hand grasping, and hand releasing.

24. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a-the gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images, a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other.

25. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a the gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a volume of the modified sound field[[;]] or control a volume of a beam in the modified sound field.

26. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a the gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a beam width of the modified sound field[[;]] or change an echo depth in time of the modified sound field[[; ]]or change in dynamic range expansion or compression of the modified sound field.

27. (Currently Amended) The device of claim 1, wherein the at least one recognized movement of the user that represents a-the gesture, wherein the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to create or delete a sound null in an indicated direction relative to an axis of the array of the loudspeaker.

28. (Previously Presented) The device of claim 1, wherein the at least one recognized movement of the user represent a sequence of two or more gestures, wherein the sequence of two or more gestures are used to synthesize the modified sound field produced by the array of loudspeakers for menu navigation.

29. (Previously Presented) The device of claim 1, wherein the at least one recognized movement of the user are used to synthesize the modified sound field produced by the array of loudspeakers for user-interface feedback via sound.

30. (Previously Presented) The device of claim 1, wherein the at least one recognized movement is used for user-interface feedback via a display icon.

31. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for user-interface feedback via a display icon.

32. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movements of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for menu navigation.

33. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movements of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for user-interface feedback via sound.

34. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field to create or delete a sound null in an indicated direction relative to an axis of the array of the loudspeakers.

35. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field to change a beam width of the sound field.

36. (Previously Presented) A device comprising: a memory configured to store a sequence of images; and one or more processors coupled to the memory, the one or more processors configured to: recognize at least one movement of a user, represented in the sequence of images; apply the at least one recognized movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers; synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field to change an echo depth in time of the sound field.

37. (New) The device of claim 1, wherein the command is context-dependent includes that a command is produced only in response to representation of the gesture that are appropriate for a current context.

38. (New) The device of claim 37, wherein the representation of the gesture that are appropriate for the current context is to ignore the representation of the gesture to reduce volume when a system in already in a muted state.

39. (New) The device of claim 37, wherein the representation of the gesture that are appropriate for the current context is to ignore the representation of the gesture to block sound from a direction when a system is already in a blocked state in that direction.

40. (New) The device of claim 37, wherein the representation of the gesture that are appropriate for the current context indicates whether the command is applied locally or globally.
US 10,448,161
1. A method of signal processing, said method comprising: producing a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; producing a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; applying the spatially directive filter to a multichannel signal; and driving an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

2. The method of signal processing according to claim 1, wherein said producing the command comprises selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to increase a width of the beam and (B) a command to decrease the width of the beam.

3. A method of signal processing according to claim 1, wherein said producing the command comprises selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

4. A method of signal processing according to claim 1, wherein said producing the filter configuration is based on an indication of at least one among a current direction of the beam and a current width of the beam.

5. A method of signal processing according to claim 1, wherein said producing the filter configuration is based on an indication of a current location of a user.

6. A method of signal processing according to claim 1, wherein said producing the filter configuration comprises selecting the filter configuration, according to said command, from among a plurality of filter configurations.

7. The method of signal processing according to claim 6, wherein a first filter configuration among the plurality of filter configurations describes a different phase relation among output channels of the spatially directive filter than a second filter configuration among the plurality of filter configurations.

8. A method of signal processing according to claim 1, wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within the sound field.

9. The method of signal processing according to claim 8, wherein said sequence of images includes images of a light pattern projected on a hand of the user.

10. A method of signal processing according to claim 1, wherein the hand gesture includes a lateral movement of a hand of a user.

11. A method of signal processing according to claim 1, wherein the hand gesture includes a grasping motion of a hand of a user.

12. A method of signal processing according to claim 1, wherein the hand gesture includes movement of two hands of a user toward each other.

13. A method of signal processing according to claim 1, wherein the hand gesture includes movement of two hands of a user away from each other.

14. A method of signal processing according to claim 1, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

15. An apparatus for signal processing, said apparatus comprising: means for producing a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; means for producing a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; means for applying the spatially directive filter to a multichannel signal; and means for driving an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

16. The apparatus for signal processing according to claim 15, wherein said means for producing the command comprises means for selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to increase a width of the beam and (B) a command to decrease the width of the beam.

17. An apparatus for signal processing according to claim 15, wherein said means for producing the command comprises means for selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

18. An apparatus for signal processing according to claim 15, wherein said means for producing the filter configuration is configured to produce the filter configuration based on an indication of at least one among a current direction of the beam and a current width of the beam.

19. An apparatus for signal processing according to claim 15, wherein said means for producing the filter configuration is configured to produce the filter configuration based on an indication of a current location of a user.

20. An apparatus for signal processing according to claim 15, wherein said means for producing the filter configuration comprises means for selecting the filter configuration, according to said command, from among a plurality of filter configurations.

21. The apparatus for signal processing according to claim 20, wherein a first filter configuration among the plurality of filter configurations describes a different phase relation among output channels of the spatially directive filter than a second filter configuration among the plurality of filter configurations.

22. An apparatus for signal processing according to claim 15, wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within a sound field.

23. The apparatus for signal processing according to claim 22, wherein said sequence of images includes images of a light pattern projected on a hand of the user.

24. An apparatus for signal processing according to claim 15, wherein the hand gesture includes a lateral movement of a hand of a user.

25. An apparatus for signal processing according to claim 15, wherein the hand gesture includes a grasping motion of a hand of a user.

26. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user toward each other.

27. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user away from each other.

28. An apparatus for signal processing according to claim 15, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

29. An apparatus for signal processing, said apparatus comprising: a gesture interpreter configured to produce a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; a command interpreter configured to produce a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; a synthesizer configured to apply the spatially directive filter to a multichannel signal; and an audio output stage configured to drive an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

30. The apparatus according to claim 29, wherein said gesture interpreter is configured to produce the command by selecting the command, based on information from said representation, from among a plurality of commands that includes a command to increase a width of the beam and a command to decrease the width of the beam.

31. An apparatus according to claim 29, wherein said gesture interpreter is configured to produce the command by selecting the command, based on information from said representation, from among a plurality of commands that includes (A) a command to change a current direction of the beam to a first direction that is on a first side of the beam and (B) a command to change the current direction of the beam to a second direction that is on a second side of the beam opposite to the first side.

32. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration based on an indication of at least one among a current direction of the beam and a current width of the beam.

33. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration based on an indication of a current location of a user.

34. An apparatus according to claim 29, wherein said command interpreter is configured to produce the filter configuration by selecting the filter configuration, according to said command, from among a plurality of filter configurations.

35. The apparatus according to claim 34, wherein a first filter configuration among the plurality of filter configurations describes a different phase relation among output channels of the synthesizer than a second filter configuration among the plurality of filter configurations.

36. An apparatus according to claim 29, wherein said audio output stage is configured to drive the directionally controllable transducers to produce a sound field that includes the beam, and wherein the representation of the hand gesture is based on a sequence of images of a user performing the hand gesture within the sound field.

37. The apparatus according to claim 36, wherein said sequence of images includes images of a light pattern projected on a hand of the user.

38. An apparatus according to claim 29, wherein the hand gesture includes a lateral movement of a hand of a user.

39. An apparatus according to claim 29, wherein the hand gesture includes a grasping motion of a hand of a user.

40. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user toward each other.

41. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user away from each other.

42. An apparatus according to claim 29, wherein the hand gesture includes movement of two hands of a user in a same lateral direction.

43. A non-transitory computer-readable medium having tangible features that cause a machine reading the features to: produce a command in response to a representation of a hand gesture of a user, wherein the hand gesture indicates (i) a change to a sound field near the user and (ii) a direction of where to direct the sound field, wherein the sound field comprises a beam; produce a filter configuration for a spatially directive filter in response to said command based on a current filter configuration; apply the spatially directive filter to a multichannel signal; and drive an array of directionally controllable transducers with the multichannel signal to change the sound field that includes the beam.

44. The method according to claim 1, wherein the applying the spatially directive filter is based on the filter configuration.

45. The apparatus according to claim 15, wherein the means for applying the spatially directive filter is based on the filter configuration.

46. The apparatus according to claim 29, wherein the applying the spatially directive filter is based on the filter configuration.

47. The computer-readable medium according to claim 43, wherein the applying the spatially directive filter is based on the filter configuration.

48. The method of claim 1, further comprising detecting the hand gesture using a camera.

49. The apparatus of claim 29, further comprising a camera configured to detect the representation of the hand gesture.

50. The apparatus of claim 29, wherein the gesture interpreter and the synthesizer are integrated in a processor.

51. The apparatus of claim 29, further comprising the directionally controllable transducers.

52. The apparatus of claim 29, wherein the hand gesture indicates a change to a width of the beam.

53. The apparatus of claim 29, wherein the hand gesture indicates an amplification of the sound field.

54. The apparatus of claim 29, wherein the hand gesture indicates an attenuation of the sound field.


	As shown, the limitations recited in claims 1 – 51 of US 10,448,161, either alone or in combination with the prior art discussed below, further recite the limitations of claims 1 – 18, 20 – 27 and 35 of the currently pending application.


Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.




Claims 1, 3 – 8, 11, 12, 14, 15, 24, 25, 29 and 31 – 33 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”).
For claim 1, Strauss discloses a device (Abstract) comprising: components configured to: recognize at least one movement of a user represented in a sequence of images (position or orientation of the head changes over time, [0057] [0058] [0112 – 0116]); apply the at least one movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers (filter characteristic generator generates time-varying filter characteristics for the programmable filters in response to the position or orientation of the head as determined by the image analyzer, [0057] [0058] [0081] [0082]); and synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change, wherein the indicated change is to change a beam direction of the modified sound field (focus zone changes based on a change in the position and/or orientation of the listener, thus changing the direction of the beam, [0015] [0016] [0034] [0081] [0112]). Yet, Strauss fails to teach the following: the at least one recognized movement of the user is a gesture, wherein the gesture is mapped to a command, wherein the command is context dependent; and a memory configured to store a sequence of images, wherein the memory is coupled to the one or more processors configured to perform the aforementioned method.
However, Black discloses an apparatus and method for recognizing facial expressions (Abstract) comprising a memory (Fig.1, 14) configured to store a sequence of images (column 6 lines 66 – column 7 lines 5, 45 – 55); and one or more processors coupled to the memory (image sequence manager, motion estimation system, image segmentation system, region tracking system and feature motion detector, Fig.1, 10, 12, 16 and 18), the one or more processors configured to recognize at least one movement of a user (motion of human head and facial features) represented in the sequence of images (column 7 lines 1 – 44; column 24 lines 6 – column 25 line 13).
Additionally, Broyles discloses a method for executing a command (Abstract), wherein any recognized movement of a user is a gesture ([0014] [0015] [0023] [0024]). Additionally, the gestures are mapped to a command ([0024 – 0026]), wherein the command is context-dependent (The command is mapped to a gesture in a given mode of operation. Therefore, the command is dependent on a context comprising a mode of operation, Fig.3; [0026] [0044 – 0049]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify Strauss’ teachings with Black’s teachings so that the device further comprises a memory configured to store a sequence of images and one or more processors coupled to the memory, wherein the processors are configured to perform the aforementioned method for the purpose of increasing user satisfaction by focusing sound at a user’s head or ears (Strauss, [0002 – 0008]).
Moreover, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss and Black in the same way that Broyles’ invention has been improved to achieve the predictable results of the recognized movements of the user (changes in head orientation) further comprising gestures which are mapped to a command, wherein the command is context-dependent for the purpose of providing a user-friendly experience for controlling and managing devices using detected gestures (Broyles, [0010]).
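
For illustration only, the context-dependent mapping discussed above (a gesture mapped to a command in a given mode of operation) may be sketched as a lookup keyed on both the mode and the gesture. The mode names, gesture labels, command names and the `interpret` function below are hypothetical; they are not taken from Broyles, from the other cited references, or from the claims.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Hypothetical modes of operation (the 'context')."""
    PLAYBACK = "playback"
    MENU = "menu"


# Hypothetical table: the same gesture maps to a different command
# depending on the current mode of operation.
COMMAND_TABLE = {
    (Mode.PLAYBACK, "hand_up"): "volume_up",
    (Mode.PLAYBACK, "hand_down"): "volume_down",
    (Mode.MENU, "hand_up"): "previous_item",
    (Mode.MENU, "hand_down"): "next_item",
}


def interpret(mode: Mode, gesture: str) -> Optional[str]:
    """Return the command mapped to the gesture in the given mode,
    or None when the gesture has no mapping in that mode."""
    return COMMAND_TABLE.get((mode, gesture))
```

Under these assumptions, the same recognized gesture (`"hand_up"`) yields a volume command in one mode and a navigation command in another, which is the sense in which the command is dependent on context.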

For claim 3, Strauss, Black and Broyles further disclose one or more processors (Strauss, [0057] [0058] [0112] [0114 – 0116]) (Black, image segmentation system and motion estimation system, Fig.1, 12 and 16) (Broyles, [0014] [0016] [0019] [0022] [0024]) configured to extract one or more features used to detect and locate regions of interest (Strauss, [0112] [0114 – 0116]) (Black, column 7 lines 1 – 21; column 8 lines 8 – 37; column 30 lines 11 – 26) (Broyles, [0024] [0058 – 0060]).
For claim 4, Strauss and Black further disclose wherein the regions of interest include the user’s eyes (Strauss, [0057] [0058] [0112] [0114 – 0116]) (Black, column 7 lines 1 – 21; column 8 lines 8 – 37).
For claim 5, Black and Broyles further disclose wherein the regions of interest include the user’s hands (Black, column 7 lines 1 – 21; column 8 lines 8 – 37; column 30 lines 11 – 26) (Broyles, [0022] [0023] [0040]).
For claim 6, Strauss and Black further disclose wherein the regions of interest include the user’s mouth (Strauss, [0057] [0058] [0112] [0114 – 0116]) (Black, column 7 lines 1 – 21; column 8 lines 8 – 37).
For claim 7, Strauss and Black further disclose, wherein the regions of interest include the user’s body (Strauss, face/head as part of the body [0057] [0058] [0112] [0114 – 0116]) (Black, face as part of the body and other non-rigid objects which can include a body, column 7 lines 1 – 21; column 8 lines 8 – 37; column 30 lines 11 – 26).
For claim 8, Black and Broyles further disclose one or more processors configured to identify orientations of the one or more features or changes over time within the sequence of images representing a trajectory of a hand (Black, column 30 lines 11 – 26) (Broyles, Fig.2, 290), a trajectory of each hand, a rotation of a head (Black, column 24 lines 45 – column 25 line 10), or a tilt of a head.
For claim 11, Strauss further discloses wherein the at least one recognized movement of the user includes face recognition, voice recognition, or both face recognition and voice recognition for user identification or user location (Strauss, [0057] [0058] [0112 – 0116]).
For claim 12, Broyles further discloses wherein the command interpreter integrated into the one or more processors is configured to disable changes to a current sound field configuration or enable changes to the current sound field configuration (Broyles, Volume Up or Down, Fig.3; [0014] [0016] [0019] [0022] [0024 – 0026] [0045]).
For claim 14, Strauss and Broyles further disclose wherein the one or more processors are configured to recognize the at least one movement based on depth (Strauss, [0057]) (Broyles, [0039]).
For claim 15, Strauss and Broyles further disclose two or more cameras configured to generate depth information (Strauss, [0113]) (Broyles, the 3D depth image capture device is a stereoscopic device, [0039]).
For claim 24, Broyles further discloses wherein the recognized movements of the user represent a gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images (Broyles, Fig.2, 290; [0014] [0016] [0022 – 0024] [0038]), a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other.
For claim 25, Broyles further discloses wherein the at least one recognized movement that represents the gesture is used to synthesize the modified sound field produced by the array of loudspeakers to change a volume of the modified sound field (Broyles, [0014] [0016] [0022 – 0024] [0044]), or control a volume of a beam in the modified sound field.
For claim 29, Strauss further discloses wherein the at least one recognized movement of the user is used to synthesize the modified sound field produced by the array of loudspeakers for user-interface feedback via sound (Strauss, [0015] [0016] [0034] [0081] [0112]).
For claim 30, Broyles further discloses wherein the at least one recognized movement is used for user-interface feedback via a display icon (Broyles, [0014] [0015] [0022 - 0026]).

For claim 31, Strauss discloses a device (Abstract) comprising: components configured to: recognize at least one movement of a user represented in a sequence of images (position or orientation of the head changes over time, [0057] [0058] [0112 – 0116]); apply the at least one movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers (filter characteristic generator generates time-varying filter characteristics for the programmable filters in response to the position or orientation of the head as determined by the image analyzer, [0057] [0058] [0081] [0082]); and synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for user-interface feedback via display icon ([0016] [0034] [0081] [0112]). Yet, Strauss fails to teach the following: the at least one recognized movement of the user is a gesture, wherein the gesture is mapped to a command, wherein the command is context dependent; and a memory configured to store a sequence of images, wherein the memory is coupled to the one or more processors configured to perform the aforementioned method.
However, Black discloses an apparatus and method for recognizing facial expressions (Abstract) comprising a memory (Fig.1, 14) configured to store a sequence of images (column 6 lines 66 – column 7 lines 5, 45 – 55); and one or more processors coupled to the memory (image sequence manager, motion estimation system, image segmentation system, region tracking system and feature motion detector, Fig.1, 10, 12, 16 and 18), the one or more processors configured to recognize at least one movement of a user (motion of human head and facial features) represented in the sequence of images (column 7 lines 1 – 44; column 24 lines 6 – column 25 line 13).
Additionally, Broyles discloses a method for executing a command (Abstract), wherein any recognized movement of a user is a gesture ([0014] [0015] [0023] [0024]). Additionally, the gestures are mapped to a command ([0024 – 0026]), wherein the command is context-dependent (The command is mapped to a gesture in a given mode of operation. Therefore, the command is dependent on a context comprising a mode of operation, Fig.3; [0026] [0044 – 0049]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify Strauss’ teachings with Black’s teachings so that the device further comprises a memory configured to store a sequence of images and one or more processors coupled to the memory, wherein the processors are configured to perform the aforementioned method for the purpose of increasing user satisfaction by focusing sound at a user’s head or ears (Strauss, [0002 – 0008]).
Moreover, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss and Black in the same way that Broyles’ invention has been improved to achieve the predictable results of recognized movements of the user (changes in head orientation) further comprising gestures which are mapped to a command, wherein the command is context-dependent for the purpose of providing a user-friendly experience for controlling and managing devices using detected gestures (Broyles, [0010]).
For claim 32, Strauss discloses a device (Abstract) comprising: components configured to: recognize at least one movement of a user represented in a sequence of images (position or orientation of the head changes over time, [0057] [0058] [0112 – 0116]); apply the at least one movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers (filter characteristic generator generates time-varying filter characteristics for the programmable filters in response to the position or orientation of the head as determined by the image analyzer, [0057] [0058] [0081] [0082]); and synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for menu navigation ([0016] [0034] [0081] [0112]). Yet, Strauss fails to teach the following: the at least one recognized movement of the user is a gesture, wherein the gesture is mapped to a command, wherein the command is context dependent; and a memory configured to store a sequence of images, wherein the memory is coupled to the one or more processors configured to perform the aforementioned method.
However, Black discloses an apparatus and method for recognizing facial expressions (Abstract) comprising a memory (Fig.1, 14) configured to store a sequence of images (column 6 lines 66 – column 7 lines 5, 45 – 55); and one or more processors coupled to the memory (image sequence manager, motion estimation system, image segmentation system, region tracking system and feature motion detector, Fig.1, 10, 12, 16 and 18), the one or more processors configured to recognize at least one movement of a user (motion of human head and facial features) represented in the sequence of images (column 7 lines 1 – 44; column 24 lines 6 – column 25 line 13).
Additionally, Broyles discloses a method for executing a command (Abstract), wherein any recognized movement of a user is a gesture ([0014] [0015] [0023] [0024]). Additionally, the gestures are mapped to a command ([0024 – 0026]), wherein the command is context-dependent (The command is mapped to a gesture in a given mode of operation. Therefore, the command is dependent on a context comprising a mode of operation, Fig.3; [0026] [0044 – 0049]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify Strauss’ teachings with Black’s teachings so that the device further comprises a memory configured to store a sequence of images and one or more processors coupled to the memory, wherein the processors are configured to perform the aforementioned method for the purpose of increasing user satisfaction by focusing sound at a user’s head or ears (Strauss, [0002 – 0008]).
Moreover, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss and Black in the same way that Broyles’ invention has been improved to achieve the predictable results of recognized movements of the user (changes in head orientation) further comprising gestures which are mapped to a command, wherein the command is context-dependent for the purpose of providing a user-friendly experience for controlling and managing devices using detected gestures (Broyles, [0010]).

For claim 33, Strauss discloses a device (Abstract) comprising: components configured to: recognize at least one movement of a user represented in a sequence of images (position or orientation of the head changes over time, [0057] [0058] [0112 – 0116]); apply the at least one movement of the user to determine an indicated change to a sound field produced by an array of loudspeakers (filter characteristic generator generates time-varying filter characteristics for the programmable filters in response to the position or orientation of the head as determined by the image analyzer, [0057] [0058] [0081] [0082]); and synthesize a modified sound field produced by the array of loudspeakers to implement the indicated change to the sound field for user-interface feedback via sound ([0016] [0034] [0081] [0112]). Yet, Strauss fails to teach the following: the at least one recognized movement of the user is a gesture, wherein the gesture is mapped to a command, wherein the command is context dependent; and a memory configured to store a sequence of images, wherein the memory is coupled to the one or more processors configured to perform the aforementioned method.
However, Black discloses an apparatus and method for recognizing facial expressions (Abstract) comprising a memory (Fig.1, 14) configured to store a sequence of images (column 6 lines 66 – column 7 lines 5, 45 – 55); and one or more processors coupled to the memory (image sequence manager, motion estimation system, image segmentation system, region tracking system and feature motion detector, Fig.1, 10, 12, 16 and 18), the one or more processors configured to recognize at least one movement of a user (motion of human head and facial features) represented in the sequence of images (column 7 lines 1 – 44; column 24 lines 6 – column 25 line 13).
Additionally, Broyles discloses a method for executing a command (Abstract), wherein any recognized movement of a user is a gesture ([0014] [0015] [0023] [0024]). Additionally, the gestures are mapped to a command ([0024 – 0026]), wherein the command is context-dependent (The command is mapped to a gesture in a given mode of operation. Therefore, the command is dependent on a context comprising a mode of operation, Fig.3; [0026] [0044 – 0049]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify Strauss’ teachings with Black’s teachings so that the device further comprises a memory configured to store a sequence of images and one or more processors coupled to the memory, wherein the processors are configured to perform the aforementioned method for the purpose of increasing user satisfaction by focusing sound at a user’s head or ears (Strauss, [0002 – 0008]).
Moreover, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss and Black in the same way that Broyles’ invention has been improved to achieve the predictable results of recognized movements of the user (changes in head orientation) further comprising gestures which are mapped to a command, wherein the command is context-dependent for the purpose of providing a user-friendly experience for controlling and managing devices using detected gestures (Broyles, [0010]).

Claims 2, 10 and 23 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Swan et al. (US 6,351,222) (“Swan”).
For claim 2, the combination of Strauss, Black and Broyles further discloses wherein the gesture is interpreted as one of a plurality of separate patterns (Black, face recognition software comprises gesture recognition, wherein each gesture is a pattern, column 7 lines 1 – 44; column 9 lines 21 – 33; column 24 lines 6 – column 25 line 13) (Broyles, each gesture is a pattern, Fig.3; [0014] [0015] [0022 – 0025]), and some of the patterns are used to synthesize a sound field (Broyles, Volume Up and Volume Down, Fig.3; [0045]), yet fails to teach that decisions are made by one or more processors to synthesize the corresponding sound field associated with the separate patterns.
However, Swan discloses a method for the purpose of processing gesture input commands by an entertainment device (Abstract), wherein a gesture (moving the head) is interpreted as one of a plurality of separate patterns (gesture commands associated with motion artifacts, column 3 lines 15 – 25 and column 6 lines 51 – 59) and used by one or more processors (processing module, Fig.2, 50) to decide to synthesize a corresponding sound (adjust  volume) associated with the separate patterns (column 4 lines 44 – column 5 lines 29 and 55 – 67).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss, Black and Broyles in the same way that Swan’s invention has been improved to achieve the predictable results of the one or more processors further making decisions to synthesize the sound field associated with the separate patterns for the purpose of increasing user satisfaction by enabling users to control entertainment devices using gestures (Swan, column 1 lines 10 – 64).

For claim 10, the combination of Strauss, Black and Broyles fails to teach, wherein a voice command is used to enter a gesture control mode.
However, Swan discloses a method for the purpose of processing gesture input commands by an entertainment device (Abstract), wherein a gesture is interpreted as one of a plurality of separate patterns (gesture commands associated with motion artifacts, column 3 lines 15 – 25 and column 6 lines 51 – 59) in a gesture control mode which is entered using a voice command (column 4 lines 1 – 60).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss, Black and Broyles in the same way that Swan’s invention has been improved to achieve the predictable results of the gesture (moving the head) further being interpreted as one of a plurality of separate patterns in a gesture control mode which is entered using a voice command for the purpose of increasing user satisfaction by enabling users to control entertainment devices using gestures (Swan, column 1 lines 10 – 64).

For claim 23, the combination of Strauss, Black and Broyles fails to teach wherein the gestures are at least one among the following gestures: a clockwise hand movement, a counterclockwise hand movement, a hand rotation, a hand grasping, and a hand releasing.
However, Swan discloses a method for the purpose of processing gesture input commands by an entertainment device (Abstract), wherein a gesture is interpreted as one of a plurality of separate patterns (gesture commands associated with motion artifacts, column 3 lines 15 – 25 and column 6 lines 51 – 59), the gesture is mapped to a command (Swan, column 4 lines 48 – 60 and column 6 lines 29 – 59), and a command interpreter (gesture interpretation module 52) integrated into the one or more processors (column 2 lines 49 – 56 and column 3 lines 55 – 67) is configured to disable changes to a current sound field configuration, or enable changes to a current sound (increase/decrease volume as enabling changes to sound, column 4 lines 48 – column 5 lines 6, 10 – 26), wherein the gesture is a hand rotation (waving, column 3 lines 18 – 23).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss, Black and Broyles in the same way that Swan’s invention has been improved to achieve the predictable results of the gesture further comprising a hand rotation (Swan, waving, column 3 lines 18 – 23) for the purpose of increasing user satisfaction by enabling users to control entertainment devices using gestures (Swan, column 1 lines 10 – 64).

Claim 9 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Hardracker et al. (US 2009/0304205) (“Hardracker”).
For claim 9, the combination of Strauss, Black and Broyles fails to teach wherein the one or more processors are configured to control the array of loudspeakers to generate beams in different directions to support gesture control independently for different users located in different directions.
However, Hardracker discloses techniques for personalizing audio (Abstract), wherein an array of loudspeakers generates beams in different directions to support gesture control independently for different users located in different directions (Fig.1 and Fig.2; [0010] [0011] [0018] [0019] [0021]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Strauss, Black and Broyles with Hardracker’s teachings so that the one or more processors are configured to control the array of loudspeakers to generate beams in different directions to support gesture control independently for different users located in different directions for the purpose of further enhancing a user’s listening experience (Hardracker, [0001] [0009]).

Claim 13 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Dondi (“Gesture Recognition by Data Fusion of Time-of-Flight and Color Cameras”).
For claim 13, the combination of Strauss, Black and Broyles fails to teach an infrared light detector configured to capture the sequence of images based on a laser distance measurement.
However, Dondi discloses a method for recognizing gestures (Abstract), wherein an infrared light detector (time-of-flight camera) captures a sequence of images based on a laser distance measurement (time of flight) (II. Cameras Specifications and A. Gestures Recognition, pg. 1954, 1955 and 1957).
Moreover, the combination of Strauss, Black and Broyles further discloses wherein the gestures are recognized using a time of flight camera (Broyles, [0039]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Strauss, Black and Broyles in the same way that the invention disclosed by Dondi has been improved to achieve the predictable results of the time-of-flight camera further being an infrared light detector which is configured to capture the sequence of images based on a laser distance measurement for the purpose of improving the flexibility in providing input commands, gestures, to control sound production from entertainment devices using a ToF camera which is well suited to capture images due to providing a depth edge which is perfectly registered with a brightness image.

Claim 16 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Thissen (“A comparative study of optical depth sensors for user interaction”).
For claim 16, the combination of Strauss, Black and Broyles fails to teach a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user.
However, Thissen discloses a method for detecting images to perform gesture control (Abstract), wherein movement is recognized based on depth information generated by optical depth sensors (stereo, structured light, time-of-flight) (2. Optical Sensors, 2.1, Triangulation, 2.2 Time-Of Flight and 4.1 Finger based gesture control, pg. 5 – 15 and 51 – 57) and a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user (Swan, Fig.1, 20) (Figure 8, Structured light and 4.1 Finger based gesture control, pg. 10, 11, 51 – 57).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to apply the optical depth sensors disclosed by Thissen to the cameras disclosed by the combination of Strauss, Black and Broyles to achieve the predictable results of the movement recognized by the processors being further based on depth information using a projector configured to project a pattern of stripes, a pattern of dots, or both a pattern of stripes and dots onto a part of the user and estimate depths of surface points of the part of the user for the purpose of providing a cost-effective and enhanced method of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision-based gesture recognition.

Claim 17 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Thissen (“A comparative study of optical depth sensors for user interaction”) and further in view of Pearse (“Gestural Mappings-Towards the Creation of Three Dimensional Composition Environment”).
For claim 17, the combination of Strauss, Black, Broyles and Thissen further discloses that the Kinect which comprises the projector further comprises an image detector to capture an image of an illuminated part of the user (Thissen, IR camera/imaging unit, Figure 8 and Structured Light; pg. 9 and 10), yet fails to teach a laser to emit light and a diffraction grating to impose the pattern on the emitted light.
However, Pearse discloses a method for applying gesture control to signal processing (Abstract), wherein a Kinect comprises a laser to emit light and a diffraction grating to impose a pattern on the emitted light (5. Kinect, pg. 128).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify the teachings of Strauss, Black, Broyles and Thissen with Pearse’s teachings so that the Kinect further comprises a laser to emit light and a diffraction grating to impose the pattern on the emitted light for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision-based gesture recognition.

Claim 18 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Dahl et al. (US 2011/0254762) (“Dahl”).
For claim 18, the combination of Strauss, Black and Broyles fails to teach wherein the recognized movements of the user are based on an array of ultrasound transducers used to perform spatial imaging.
However, Dahl discloses a method for detecting movement of objects including hands and fingers (Abstract; [0002] [0028]) using an array of ultrasound transducers to perform spatial imaging ([0003] [0009] [0013] [0036] [0037]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to improve the invention disclosed by the combination of Strauss, Black and Broyles in the same way that Dahl’s invention has been improved to achieve the predictable results of further recognizing movements of the user based on an array of ultrasound transducers used to perform spatial imaging for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition.

Claims 19 and 20 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Keenan (US 2009/0102800) (“Keenan”).
For claims 19 and 20, the combination of Strauss, Black and Broyles fails to teach an on-screen display to provide feedback for a gesture command, wherein the feedback is a bar or dial to display a change in beam intensity, beam direction or dynamic range.
However, Keenan discloses an interactive input system controller and method (Abstract), wherein feedback for a gesture command includes a bar or dial to display a change in intensity on an on-screen display (Fig.2, 48, [0035]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Strauss, Black and Broyles with Keenan’s teachings so that the system further comprises an on-screen display to provide feedback for a gesture command, wherein the feedback is a bar or dial to display a change in beam intensity for the purpose of enabling and employing vision based gesture recognition.

Claim 21 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. ( US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Keenan (US 2009/0102800) and further in view of Hong et al. (WO 2011/059202) (“Hong”).
For claim 21, the combination of Strauss, Black, Broyles and Keenan fails to teach, wherein the on-screen display is configured to display an error indication of an invalid gesture command.
However, Hong discloses a method for controlling a display device (Abstract), wherein an on-screen display is configured to display an error indication of an invalid gesture (Fig.31, [0081 – 0083] [0150] [0151]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Strauss, Black, Broyles and Keenan in the same way that Hong’s invention has been improved to achieve the predictable results of the on-screen display being further configured to display an error indication of an invalid gesture command for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition.

Claims 22 and 24 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”) and further in view of Reville et al. (US 2011/0289455) (“Reville”).
For claim 22, the combination of Strauss, Black and Broyles fails to teach, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand and body gesture, and hand to ear gesture.
However, Reville discloses a method for operating a user interface with gestures (Abstract), wherein gestures include two hands (Fig.16A- Fig.17B, [0158 - 0160]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Strauss, Black and Broyles with Reville’s teachings so that the recognized movements of the user further comprise a gesture, wherein the gesture is at least one among the following gestures: two-hand gesture, hand-and-head gesture, hand and body gesture, and hand to ear gesture for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition.

For claim 24, the combination of Strauss, Black and Broyles fails to teach, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images, a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other.
However, Reville discloses a method for operating a user interface with gestures (Abstract), wherein gestures include hands moving away from each other (Fig.17A and Fig.17B; [0158 - 0160]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Strauss, Black and Broyles with Reville’s teachings so that the recognized movements of the user further comprise a gesture, wherein the gesture is at least one among the following gestures: a hand vertical movement, a hand movement toward a sensor used to capture the sequence of images, a hand movement away from the sensor used to capture the sequence of images, hands moving in a same direction, hands moving toward each other, and hands moving away from each other for the purpose of increasing the flexibility in providing input commands to control sound production from entertainment devices by enabling and employing vision based gesture recognition.

Claim 28 is rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”), and further in view of Keenan (US 2009/0102800) and further in view of Bennett (US 2013/0106686).
For claim 28, the combination of Strauss, Black and Broyles fails to teach, wherein the at least one recognized movement of the user represent a sequence of two or more gestures, wherein the sequence of two or more gestures are used to synthesize the modified sound field produced by the array of loudspeaker for menu navigation.
However, Bennett discloses a gesture processing framework (Abstract), wherein a sequence of two or more gestures are mapped to a control signal to change a sound volume ([0061] [0128]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify the combined teachings of Strauss, Black and Broyles with Bennett’s teachings so that the at least one recognized movement of the user represent a sequence of two or more gestures, wherein the sequence of two or more gestures are used to synthesize the modified sound field produced by the array of loudspeaker for menu navigation for the purpose of providing a user friendly experience for controlling and managing devices using detected gestures (Broyles, [0010]).

Claims 37 and 40 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Strauss et al. (US 2011/0103620) (“Strauss”) in view of Black et al. (US 5,774,591) (“Black”), and further in view of Broyles et al. (US 2012/0005632) (“Broyles”) and further in view of Clavin et al. (US 9,268,404) (“Clavin”).
For claim 37, the combination of Strauss, Black and Broyles fails to teach, wherein the command is context dependent includes that a command is produced only in response to representation of the gesture that are appropriate for a current context.
However, Clavin discloses a gesture based system (Abstract), wherein a command (control) is produced only in response to representation of a gesture that are appropriate for a given context (mode) ([0036 – 0039]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s invention to modify the combined teachings of Strauss, Black and Broyles with Clavin’s teachings so that the command is further produced only in response to representation of the gesture that are appropriate for a current context for the purpose of enhancing the efficiency of the gesture based system by organizing the gestures and related commands according to modes.

For claim 40, Clavin further discloses, wherein the representation of the gesture that are appropriate for the current context indicates whether the command is applied locally or globally (reserved gesture) (Clavin, [0038] [0039] [0046 – 0048]).

Response to Arguments
Applicant’s arguments regarding claims 1 – 30 have been considered but are moot in view of the new grounds of rejection.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SONIA L GAY/Primary Examiner, Art Unit 2657