DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/23/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Claim Objections
Claim 27 is objected to because of the following informalities: “a hand over at the portion” appears to be a misspelling of “a hand over at least the portion”. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Claims 27-30 in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
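The three-prong test of MPEP § 2181 and the presumption tied to the word “means” can be sketched as a small decision procedure. This is only an illustrative model, not an official USPTO procedure: the `Limitation` record, its field names, and both functions are hypothetical, and each flag stands for a judgment a reader makes from the claim language and the specification.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    """Hypothetical record of findings about one claim limitation."""
    uses_means_or_step: bool            # literal "means" or "step" appears
    uses_generic_placeholder: bool      # nonce term used in place of "means"
    modified_by_function: bool          # prong (B): functional language attached
    recites_sufficient_structure: bool  # if True, prong (C) is not met


def presumed_112f(lim: Limitation) -> bool:
    # Starting presumption: 112(f) treatment is presumed only when the
    # word "means" (or "step") is present; the presumption is rebuttable.
    return lim.uses_means_or_step


def invokes_112f(lim: Limitation) -> bool:
    # Prong (A): "means"/"step" or a generic placeholder is used.
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    # Prong (B): the term is modified by functional language.
    prong_b = lim.modified_by_function
    # Prong (C): the term is NOT modified by sufficient structure.
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c
```

As a usage illustration with assumed flag values, a limitation such as “means for detecting at least a portion of a hand” would be modeled as `Limitation(True, False, True, False)`, for which `invokes_112f` returns `True`.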

Claim Rejections - 35 USC § 102

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 4-5, 7-8, 10-12, 16-20, and 22-29 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by RYU et al. (US 2012/0179472).
Regarding claim 1, RYU et al. do teach a device to process an audio signal representing input sound (¶ 0014: “The device” (i.e., “device” “100” (a device)) “may further include a voice input unit which receives a voice input, and the control unit, if a first motion representing a voice recognition mode is recognized, may convert a motion recognition mode to the voice recognition mode and performs a control operation of the electronic device according to the voice input”), 
the device comprising:
a hand detector configured to generate a first indication responsive to detection of at least a portion of a hand over at least a portion of the device (¶ 0085 sentence 1 referring to Fig. 9: “analyzing photographing image data to recognize push motion” (a first indication generated corresponding to a portion of image of a hand as shown in Fig. 9); ¶ 0040 sentence 1: “the motion recognition unit 110 may recognize user motion with respect to the electronic device 100 such as a push motion in a direction of the electronic device 100” (“push motion” is determined by “the motion recognition unit 
an automatic speech recognition system configured to be activated, responsive to the first indication, to process the audio signal (¶ 0093 lines 3+: “if a user wishes to use a voice recognition mode” (speech recognition system activated ) “user may make a push motion” (in response to the first indication) “by raising his or her hand and stretching the hand frontward” (attributed to images of a portion of the user’s hand) “Accordingly speech recognition mode may be entered immediately” (to process any audio input)).

Regarding claim 2, RYU et al. do teach the device of claim 1, further comprising:
a screen, wherein the hand detector is configured to generate the first indication responsive to detection of at least a portion of the hand over the screen (¶ 0072: “the x-axis and the y-axis are disposed to form a horizontal surface with respect to a display screen” (a screen used by the “MOTION RECOGNITION UNIT” (the hand detector)) “In the push motion” (e.g., the motion responsible for the “voice recognition” activation above) “user’s palm” (e.g., the first indication corresponding to detection of the hand image) “moves in the direction of the electronic device 100 along the z-axis”); and
a microphone configured to be activated, responsive to the first indication, to generate the audio signal based on the input sound (¶ 0066 sentence 2: “if a motion for 

Regarding claim 4, RYU et al. do teach the device of claim 1, further comprising one or more sensors coupled to the hand detector and configured to provide sensor data to the hand detector (¶ 0060: “the motion recognition unit 110” (the hand detector) “includes a photographing unit” “the photographing unit may have a” “3D” “depth camera” (includes a “camera” (one or more sensors to provide hand images (sensor data)))).

Regarding claim 5, RYU et al. do teach the device of claim 4, wherein the one or more sensors include a camera configured to provide image data to the hand detector (¶ 0060: “the motion recognition unit 110” (the hand detector) “includes a photographing unit” “the photographing unit may have a” “3D” “depth camera” (includes a “camera” (to provide hand images (sensor data) such as the one in Fig. 9))).

Regarding claim 7, RYU et al. do teach the device of claim 5, wherein the hand detector includes a hand pattern detector configured to process the image data to determine whether the image data includes a hand pattern (¶ 0074 last sentence: “if a 

Regarding claim 8, RYU et al. do teach the device of claim 7, wherein the one or more sensors further include an infrared sensor (¶ 0060 lines 5-7: “The 3D depth camera irradiates infrared rays and measures the time for the infrared rays” (the one or more sensors comprise an infrared sensor) “to reach an object and return to the camera”).
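The time-of-flight principle in the passage quoted from ¶ 0060 can be illustrated numerically. The following is a minimal sketch, assuming the measured time is the full round trip and the infrared rays travel at the speed of light; the function names are hypothetical and are not taken from RYU et al., and the default window simply mirrors the 10 to 30 centimeter range recited in claim 3.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to the object in meters, given the round-trip travel time."""
    # The rays travel to the object and back, so halve the total path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def within_window(distance_m: float, low_m: float = 0.10, high_m: float = 0.30) -> bool:
    """True when the distance falls inside the inclusive [low_m, high_m] window."""
    return low_m <= distance_m <= high_m
```

Under these assumptions, a round trip of 1 nanosecond corresponds to roughly 15 centimeters, which falls inside the window, while a round trip of 4 nanoseconds corresponds to roughly 60 centimeters, which does not.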

Regarding claim 10, RYU et al. do teach the device of claim 1, further comprising activation circuitry coupled to the hand detector and that is configured to activate the automatic speech recognition system in response to receiving the first indication (¶ 0093 lines 5+: “user may make push motion” (upon recognition of the first indication) “a voice recognition mode may be entered immediately” (automatically activating the automatic speech recognition system)).

Regarding claim 11, RYU et al. do teach the device of claim 1, wherein the automatic speech recognition system includes a buffer and an automatic speech recognition engine, and wherein activating the automatic speech recognition system 

Regarding claim 12, RYU et al. do teach the device of claim 11, wherein the hand detector is further configured to generate a second indication in response to detection that the portion of the hand is no longer over the portion of the device, the second indication corresponding to an end-of-utterance signal that causes the automatic speech recognition engine to begin processing audio data from the buffer (¶ 0067 lines 4+: “control unit 120 may stop the motion recognition mode” (a second indication is generated that no longer requires any “motion” (hand detection over the device)) “automatically and perform voice control in the voice recognition mode” (that causes the automatic speech recognition engine to process audio from e.g. the “analyzing unit” (the buffer))).
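The sequence recited across claims 1, 11, and 12 (a first indication activates recognition and buffering, and a second indication acts as an end-of-utterance signal that hands the buffer to the engine) can be sketched as a simple state machine. This is an illustrative reading of the claim language only; it is neither the applicant's implementation nor the operation of RYU et al., and the class, method names, and the `engine` callable are hypothetical.

```python
class HandGatedRecognizer:
    """Buffers audio only while a hand is detected over the device."""

    def __init__(self, engine):
        self.engine = engine  # hypothetical callable: list of frames -> result
        self.buffer = []
        self.active = False

    def on_hand_detected(self):
        # First indication: activate the system and start a fresh buffer.
        self.active = True
        self.buffer = []

    def on_audio(self, frame):
        # Audio is buffered only while the system is active.
        if self.active:
            self.buffer.append(frame)

    def on_hand_removed(self):
        # Second indication: end of utterance; the engine processes the buffer.
        if not self.active:
            return None
        self.active = False
        result = self.engine(self.buffer)
        self.buffer = []
        return result
```

With an engine that merely joins frames, audio arriving before the first indication is ignored, and the frames received between the two indications are processed together when the hand is removed.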


the method comprising:
detecting, at a device, at least a portion of a hand over at least a portion of the device (¶ 0085 sentence 1 referring to Fig. 9: “analyzing photographing image data to recognize push motion” (a first indication generated corresponding to a portion of image of a hand as shown in Fig. 9); ¶ 0040 sentence 1: “the motion recognition unit 110 may recognize user motion with respect to the electronic device 100 such as a push motion in a direction of the electronic device 100” (“push motion” is determined by “the motion recognition unit 110” (a hand detector)); ¶ 0072 last sentence: “In the push motion” “user’s palm” “moves in the direction of the electronic device” ) ; and
responsive to detecting the portion of the hand over the portion of the device, activating an automatic speech recognition system to process the audio signal (¶ 0093 lines 3+: “if a user wishes to use a voice recognition mode” (automatic speech recognition system activated ) “user may make a push motion” (in response to the first indication corresponding to detection of the portion of the hand over the device) “by 

Regarding claim 17, RYU et al. do teach the method of claim 16, 
wherein the portion of the device includes a screen of the device (¶ 0072: “the x-axis and the y-axis are disposed to form a horizontal surface with respect to a display screen” (a screen included by the “MOTION RECOGNITION UNIT” (the hand detector) in the “device” (the device)) “In the push motion” (e.g., the motion responsible for the “voice recognition” activation above) “user’s palm” (e.g., the first indication corresponding to detection of the hand image) “moves in the direction of the electronic device 100 along the z-axis”); and
further comprising, responsive to detecting the portion of the hand over the screen, activating a microphone to generate the audio signal based on the input sound (¶ 0066 sentence 2: “if a motion for entering into a voice recognition mode is sensed, the control unit 120 activates” (activating in response to the motion which is detected via the screen) “the voice input unit” (i.e., a “microphone” (¶ 0053 sentence 2: “voice input unit 170 may include a microphone”))).



Regarding claim 19, RYU et al. do teach the method of claim 16, wherein activating the automatic speech recognition system includes initiating buffering of the audio signal (¶ 0053 lines 3-5: “voice input unit 170” “include[s]” “an analyzing unit” (a buffer) and according to ¶ 0093 there exists a “voice recognition mode” (an automatic speech recognition engine); ¶ 0054: “The analyzing unit” (the buffer associated with the “voice” input to “voice” (automatic speech) recognition system) “performs mathematical conversion processing” (for buffering) “with respect to a received voice input signal” (of the audio signal) “at a short period of every 20-30 ms”).



Regarding claim 22, RYU et al. do teach the method of claim 20, wherein detecting the portion of the hand over the portion of the device further includes processing infrared sensor data from an infrared sensor of the device (¶ 0060 lines 5-7: “The 3D depth camera irradiates infrared rays and measures the time for the infrared rays” (the one or more sensors comprise an infrared sensor) “to reach an object and return to the camera” (to detect e.g. the hand image in Fig. 9)).

Regarding claim 23, RYU et al. do teach a non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of a device (¶ 0096:  “Meanwhile, a program code for performing the above-mentioned controlling method may be stored in various types of recording media readable by a terminal”), 

the operations comprising:
detecting at least a portion of a hand over at least a portion of the device (¶ 0085 sentence 1 referring to Fig. 9: “analyzing photographing image data to recognize push motion” (a first indication generated corresponding to a portion of image of a hand as shown in Fig. 9); ¶ 0040 sentence 1: “the motion recognition unit 110 may recognize user motion with respect to the electronic device 100 such as a push motion in a direction of the electronic device 100” (“push motion” is determined by “the motion recognition unit 110” (a hand detector)); ¶ 0072 last sentence: “In the push motion” “user’s palm” “moves in the direction of the electronic device” ) ; and
responsive to detecting the portion of the hand over the portion of the device, activating an automatic speech recognition system to process the audio signal (¶ 0093 lines 3+: “if a user wishes to use a voice recognition mode” (automatic speech recognition system activated ) “user may make a push motion” (in response to the first indication corresponding to detection of the portion of the hand over the device) “by 


Regarding claim 24, RYU et al. do teach the non-transitory computer-readable medium of claim 23, 
wherein the portion of the device includes a screen of the device (¶ 0072: “the x-axis and the y-axis are disposed to form a horizontal surface with respect to a display screen” (a screen included by the “MOTION RECOGNITION UNIT” (the hand detector) in the “device” (the device)) “In the push motion” (e.g., the motion responsible for the “voice recognition” activation above) “user’s palm” (e.g., the first indication corresponding to detection of the hand image) “moves in the direction of the electronic device 100 along the z-axis”); and
the operations further comprising, responsive to detecting the portion of the hand over the screen, activating a microphone to generate the audio signal based on the input sound (¶ 0066 sentence 2: “if a motion for entering into a voice recognition mode is sensed, the control unit 120 activates” (activating in response to the motion which is detected via the screen) “the voice input unit” (i.e., a “microphone” (¶ 0053 sentence 2: “voice input unit 170 may include a microphone”))).

Regarding claim 25, RYU et al. do teach the non-transitory computer-readable medium of claim 23, the operations further comprising: detecting that the portion of the hand is no longer over the portion of the device; and responsive to detecting that the portion of the hand is no longer over the portion of the device, providing an end-of-utterance signal to the automatic speech recognition system (¶ 0067 lines 4+: “control unit 120 may stop the motion recognition mode” (a second indication is generated that no longer requires any “motion” (hand detection over the device)) “automatically and perform voice control in the voice recognition mode” (that causes the automatic speech recognition engine to process audio from e.g. the “analyzing unit”)).

Regarding claim 26, RYU et al. do teach the non-transitory computer-readable medium of claim 23, wherein detecting the portion of the hand over the portion of the device includes processing sensor data to detect a hand shape (¶ 0074 last sentence: “if a hand, that is, the object 11 touches an object, the size of a pixel group” (a pattern of the hand image or shape is detected) “corresponding to the object 11” (as shown in Fig. 9 by using the pixel information) “decreased”).

Regarding claim 27, RYU et al. do teach an apparatus to process an audio signal representing input sound (¶ 0014: “The device” (i.e., “device” “100” (a device or 
the apparatus comprising:
means for detecting at least a portion of a hand over at least a portion of a device (¶ 0085 sentence 1 referring to Fig. 9: “analyzing photographing image data to recognize push motion” (a first indication generated corresponding to a portion of image of a hand as shown in Fig. 9); ¶ 0040 sentence 1: “the motion recognition unit 110 may recognize user motion with respect to the electronic device 100 such as a push motion in a direction of the electronic device 100” (“push motion” is determined by “the motion recognition unit 110” (a hand detector (means for detecting))); ¶ 0072 last sentence: “In the push motion” “user’s palm” “moves in the direction of the electronic device” ) ; and
means for processing the audio signal, the means for processing configured to be activated responsive to detection of the portion of a hand over at least the portion of the device (¶ 0093 lines 3+: “if a user wishes to use a voice recognition mode” (automatic speech recognition system (means for processing) activated ) “user may make a push motion” (in response to the first indication corresponding to detection of the portion of the hand over the device) “by raising his or her hand and stretching the 

Regarding claim 28, RYU et al. do teach the apparatus of claim 27, further comprising: 
means for displaying information, wherein the means for detecting is configured to detect the portion of the hand over the means for displaying (¶ 0072: “the x-axis and the y-axis are disposed to form a horizontal surface with respect to a display screen” (a screen (means for displaying information) included by the “MOTION RECOGNITION UNIT” (the hand detector (means for detecting)) in the “device” (the device)) “In the push motion” (e.g., the motion responsible for the “voice recognition” activation above) “user’s palm” (e.g., the first indication corresponding to detection of the hand image) “moves in the direction of the electronic device 100 along the z-axis”); and
means for generating the audio signal based on the input sound, the means for generating configured to be activated responsive to the detection of the portion of the hand over the means for displaying (¶ 0066 sentence 2: “if a motion for entering into a voice recognition mode is sensed, the control unit 120 activates” (activating in response to the “motion” (¶ 0093 “hand” gesture) associated with the hand image which is detected via the screen (means for displaying)) “the voice input unit” (i.e., a

Regarding claim 29, RYU et al. do teach the apparatus of claim 27, further comprising means for generating image data, and wherein means for detecting is configured to determine whether the image data includes a hand pattern (¶ 0074 last sentence: “if a hand, that is, the object 11 touches an object, the size of a pixel group” (a pattern of the hand image is detected) “corresponding to the object 11” (as shown in Fig. 9 by using the “pixel” information (image data obtained using means for generating the image data)) “decreased”).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 6, 13, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over RYU et al. in view of HIROKI (US 2020/0104629).
Regarding claim 3, RYU et al. do teach the device of claim 2, wherein the hand detector is configured to generate the distance of a hand from the device (¶ 0060 last
RYU et al. do not specifically disclose:
first indication responsive to detection that the portion of the hand is at a distance of 10 centimeters to 30 centimeters from the screen.
HIROKI does teach:
first indication responsive to detection that the portion of the hand is at a distance of 10 centimeters to 30 centimeters from the screen (¶ 0006 lines 7+: “a second recognition unit configured to recognize a state of [vehicle occupant] hand including” “a shape of the hand” “based on the image captured by the imaging unit”; ¶ 0027 lines 2+: “imaging unit” “is provided at a position” “for example, a position near a roof on the front side of the vehicle” (in all conventional vehicles the distance between a vehicle driver and front roof of the vehicle is between 10-30 centimeters)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the function of the “DEVICE CONTROL APPARATUS” (Fig. 1) of HIROKI into the “device” of RYU et al., because doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable RYU et al. to implement their “device” in a “vehicle”.


Regarding claim 6, RYU et al. do not specifically disclose the device of claim 5, wherein the camera includes a low-power ambient light sensor configured to generate the image data.
HIROKI does teach the device of claim 5, wherein the camera includes a low-power ambient light sensor configured to generate the image data (¶ 0027 last sentence: “either a visible light camera” (using a low power ambient light sensor) “or an infrared camera can be used for the imaging unit 10” (to generate image data)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the “cameras” of HIROKI into the camera of RYU et al., because doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable RYU et al. to benefit from a “visible light camera” when there is sufficient light that does not require use of its “infrared” camera.

Regarding claim 13, RYU et al. do not specifically disclose the device of claim 1, wherein the hand detector and the automatic speech recognition system are integrated in a vehicle.
HIROKI does teach the device of claim 1, wherein the hand detector and the automatic speech recognition system are integrated in a vehicle (¶ 0045 lines 8+: “For example, in the example of FIG. 4, when the posture of the occupant is recognized as 
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the gestures of HIROKI, which associate images of an “index finger” with triggering a speech recognizer, into the gestures used in RYU et al. for initiating speech recognition in general, because doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable RYU et al. to use its device in a vehicle.

Regarding claim 21, RYU et al. do not specifically disclose the method of claim 20, wherein the image data is generated at a low-power ambient light sensor of the device.
HIROKI does teach the method of claim 20, wherein the image data is generated at a low-power ambient light sensor of the device (¶ 0027 last sentence:  “either a visible light camera” (using a low power ambient light sensor) “or an infrared camera can be used for the imaging unit 10” (to generate image data)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the .


Claims 9 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over RYU et al. in view of SHIKII et al. (US 2015/0105976).
Regarding claim 9, RYU et al. do not specifically disclose the device of claim 8, wherein the hand detector further includes a hand temperature detector configured to process infrared sensor data from the infrared sensor.
SHIKII et al. do teach:
a hand detector further includes a hand temperature detector configured to process infrared sensor data from the infrared sensor (¶ 0230 last sentence: “in order to measure temperature of hands” (to do a hand temperature detection) “the infrared array sensor” (using infrared sensor) “is provided” (to process infrared data); i.e., because according to ¶ 0097 lines 14+: “infrared sensors can measure the surface temperature of an object in a non-contact manner by detecting infrared radiation from the object”).


Regarding claim 30, RYU et al. do teach the apparatus of claim 29, further comprising at least one of:
means for detecting a distance of the portion of the hand from the device (¶ 0060 last sentence: “The image photographed by the depth camera” (e.g. the hand image in Fig. 9 obtained by the “depth camera” (means for detecting a distance)) “is output as a grey level, and coordinates of width, length and distance” (includes distance information of the hand from the screen) “for each pixel”).
RYU et al. do not specifically disclose:
means for detecting a temperature associated with the portion of the hand.
SHIKII et al. do teach:

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the infrared sensor of SHIKII et al. into the “3D” “infrared ray” “camera” of RYU et al. (¶ 0060), because doing so would enable the combined systems and associated methods to perform in combination as they do separately and would further enable RYU et al. to only activate “voice collection” when a human “body is detected” by analysis of “infrared rays” emitted from the body and thus “reduc[e]” “power consumption” as disclosed in GAO (US 2020/0202851) ¶ 0039 sentence 2.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over RYU et al. in view of SHIN et al. (US 2009/0253463).
Regarding claim 14, RYU et al. do not specifically disclose the device of claim 1, wherein the hand detector and the automatic speech recognition system are integrated in a portable communication device.
SHIN et al. do teach:
the device of claim 1, wherein the hand detector and the automatic speech recognition system are integrated in a portable communication device (For a “mobile terminal” (portable communication device (¶ 0094 lines 1-2)) according to ¶ 0094 last sentence: “The voice recognition function may also be activated by” “user’s” “hand gesture” (using a hand image a speech recognition engine in the portable communication device is activated)).
It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “hand gesture” of SHIN et al. into the gestures of RYU et al. associated with triggering its “voice recognition”, because doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable RYU et al. to utilize its device functions in a mobile communication terminal.

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over RYU et al. in view of Suomela et al. (US 2002/0077830).
Regarding claim 15, RYU et al. do not specifically disclose the device of claim 1, wherein the hand detector and the automatic speech recognition system are integrated in a virtual reality or augmented reality headset.

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “gestures” for “activating” a “speech recognition” engine of Suomela et al. into the “gestures” used for activating “voice recognition” of RYU et al., because doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable RYU et al. to implement its “voice recognition” capabilities in a “wearable computer having head-mounted display” as disclosed in Suomela et al. ¶ 0023 lines 10-12.

Conclusion


Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DANIEL C WASHBURN, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the 






/Farzad Kazeminezhad/
Art Unit 2657
September 9th 2021.