DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the Applicant’s RCE filed on July 29, 2021, wherein the Applicant has amended claims 1 and 12; claim 2 was previously canceled.
By virtue of this communication, claims 1 and 3-12 are currently pending in this Office Action.
With respect to the objection to claims 1 and 3-12 due to formality issues, as set forth in the previous Office Action, the Applicant’s amendment and argument (see paragraphs 2-3 of page 5 of the Remarks filed on July 29, 2021) have been fully considered, and the argument is persuasive. Therefore, the objection to claims 1 and 3-12 due to the formality issues, as set forth in the previous Office Action, has been withdrawn.
The Examiner appreciates the explanation of the amendment and the analysis of the prior art; however, although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993) and MPEP 2145.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-4, 6, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (US 20120062729 A1, hereinafter Hart) in view of Kuruba et al. (US 20160336913 A1, hereinafter Kuruba).
Claim 1: Hart teaches a speech input method (title and abstract, ln 1-12, and method steps in figs. 6-7 and a system in figs. 1-2), comprising:
detecting whether a user's face is in proximity to a speech input device including one or more microphones (determining whether a speaker is within a viewable range, i.e., proximity of a computing device 302 in figs. 3a-d, para [0039], e.g., a first person is closer to the device than the second person; one or more microphones pick up voice from the first person, para [0039]); and
performing correction processing on an audio signal obtained through a sound collection by the one or more microphones when it is detected that the user's face is in proximity to the speech input device (an optimal processing performed to focus on and capture the voice upon detection of the closest or primary person by further filtering out sound captured by the microphone from a particular direction but emitted from directions other than the relative position of the first person 308, para [0039]; including removing background noise at step 708 after the active user is determined at step 702 in fig. 7, para [0058]),
wherein the audio signal is obtained through the sound collection by the one or more microphones and has a single directivity (the collected audio signal has a direction focused on the active speaker or closest speaker by determining the active user from image data at 702 and further filtering, para [0039], para [0058]), and 

However, Hart does not explicitly teach wherein the correction processing includes a process of converting single directivity into omni-directional directivity to sense the user’s voice.
Kuruba teaches an analogous field of endeavor by disclosing a voice input method (title and abstract, ln 1-15 and method steps in fig. 5) wherein one or more microphones are disclosed (microphone elements 301-1 … 301-n on a boom 206 in figs. 2A/2B, 3), correction processing is disclosed (carried out by audio processing 230 in fig. 2C), and the correction processing includes a process of converting single directivity into omni-directional directivity to sense the user’s voice (while the ambient noise level is below the user’s voice level, switching to an omnidirectional beamforming pattern of the microphone signals in the quiet mode from another microphone signal pattern, such as a narrow beam toward a particular direction of a desired sound such as the wearer’s voice in the loud mode in fig. 4, para [0014], para [0062]; for increasing sound quality, para [0041]; the desired sound source is the user’s mouth, para [0057]), for the benefits of achieving an improved sound quality of the sound pickup device by flexibly configuring and minimizing noise in a variety of acoustic environments (para [0025]-[0026], para [0041]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the correction processing and wherein the correction processing includes the process of converting the single directivity into the omni-directional directivity to sense the user’s voice, as taught by Kuruba, to the correction processing in the method, as taught by Hart, for the benefits discussed above.
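For illustration only, and not as a characterization of the actual implementation in Kuruba or Hart, the mode-dependent selection of directivity discussed above can be sketched as follows. The function name, the pattern labels, and the use of decibel-valued inputs are assumptions introduced solely for this sketch.

```python
def select_directivity(ambient_noise_db: float, voice_level_db: float) -> str:
    """Pick a beam pattern for the current acoustic condition.

    Hypothetical helper: in a quiet condition (ambient noise below the
    voice level) an omnidirectional pattern is selected; otherwise a
    narrow beam toward the desired sound source is selected.
    """
    if ambient_noise_db < voice_level_db:
        return "omnidirectional"  # quiet mode
    return "narrow_beam"          # loud mode


# Example: a 40 dB room with a 65 dB voice selects the omnidirectional pattern.
pattern = select_directivity(ambient_noise_db=40.0, voice_level_db=65.0)
```

The sketch only shows the decision rule; how each pattern would be formed from the microphone signals is outside its scope.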
Claim 3: the combination of Hart and Kuruba further teaches, according to claim 1 above, wherein the correction processing includes a process of decreasing gain (Hart, altering at least one of a gain of audio capture element 354, 356 in fig. 3c, para [0047] and including filtering out the noise portion from the microphone signals, i.e., reducing the gain of the microphone signals, para [0029], para [0034]).
Claim 4: the combination of Hart and Kuruba further teaches, according to claim 1 above, wherein the correction processing includes a process of decreasing gain of a component at a predetermined frequency or lower (Hart, reducing noise from the microphone signals by using filtering, para [0029], and thus inherently filtering relative to the voice or speech signal frequency range; and Kuruba, setting up a frequency response of the one or more filters on the audio processing circuitry 230 in fig. 2c, para [0053]).
Claim 6: the combination of Hart and Kuruba further teaches, according to claim 1 above, wherein the speech input device includes a camera (Hart, image capture element 304 can be a camera in fig. 3b, para [0043]), and in the detecting whether the user’s face is in proximity to the speech input device, the detecting is performed according to a change in a size of the user’s face in an image captured by the camera (Hart, mouth movement and face size are captured by a camera, the face size at a near-end is larger than face size at a far-end in fig. 10).
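For illustration only, detecting proximity from a change in the size of the face in successive camera images, as recited in claim 6, can be sketched as follows. The bounding-box representation, the function names, and the growth ratio of 1.5 are assumptions introduced for this sketch and are not taken from Hart.

```python
from typing import Sequence, Tuple

# Hypothetical face bounding box: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def face_area(box: Box) -> int:
    """Area of a face bounding box in square pixels."""
    _, _, width, height = box
    return width * height


def face_approaching(boxes: Sequence[Box], growth_ratio: float = 1.5) -> bool:
    """Return True when the latest face area is at least growth_ratio
    times the area in the earliest frame, i.e., the face has grown."""
    if len(boxes) < 2:
        return False  # a change cannot be observed from a single frame
    first = face_area(boxes[0])
    last = face_area(boxes[-1])
    return first > 0 and last >= growth_ratio * first
```

A face detector would supply the boxes in practice; the sketch assumes they are already available.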

Claim 11 has been analyzed and rejected according to claim 1 above and the combination of Hart and Kuruba further teaches a non-transitory computer-readable recording medium (Hart, memory device 204 in fig. 2 and Kuruba, memory 226 in fig. 2C) for use in a computer (Hart, including processor 202 in fig. 2, and Kuruba, including CPU 222 in fig. 2C), the recording medium having a computer program recorded thereon (Hart, program instructions in the memory in fig. 2, para [0037], and Kuruba, software, para [0015], instructions executed by the CPU, para [0022]) for causing the computer to execute the speech input method according to claim 1 (the discussion in claim 1 above).
Claim 12 has been analyzed and rejected according to claims 1 and 11 above and the combination of Hart and Kuruba further teaches a speech input device (Hart, a computing device 302 in fig. 3a/b and Kuruba, microphones 301 in fig. 3) comprising a detector (Hart, including camera 304 in fig. 3a/3b) and a corrector (Hart, algorithms and software implemented by the processor 202 in fig. 2 and including filtering element, adjustment of directivity, etc., para [0029] and Kuruba, when the ambient noise level is below the user’s voice level, switching to an omnidirectional pattern from a loud mode in fig. 4, abstract, para [0062], para [0014]; for increasing sound quality, para [0041] and the discussion in claim 1 above).


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Hart (above) in view of Kuruba (above) and Burke et al. (US 20180358035 A1, hereinafter Burke).
Claim 5: the combination of Hart and Kuruba further teaches, according to claim 1 above, wherein the speech input device includes a triaxial accelerometer (Hart, accelerometer or gyroscope to detect tilting of the device, p.5, para 43), and in the detecting whether the user's face is in proximity to the speech input device, the detecting is performed (Hart, the computing device is moved, tilted, or adjusted in position or orientation such that the relative position is updated and detected in a 3D space in fig. 3c-d and p.8, para 59, and such tilting can be detected by an accelerometer, p.5, para 43).
However, the combination of Hart and Kuruba does not explicitly teach that it is based on a result obtained by comparing a pattern that indicates a temporal change in an output for detecting whether the user's face is in proximity to the speech input device.
Burke teaches an analogous field of endeavor by disclosing the speech input method (title and abstract, ln 1-9 and fig. 4) and wherein a 3D accelerometer is disclosed (213 in fig. 2 and measuring tilting angles in three directions in fig. 5A and relative position determined upon using a Bayesian network 800 in fig. 8A and p.1, para 8 and p.6, para 75) and wherein it is based on a result obtained by comparing a pattern that indicates a temporal change in an output of the triaxial accelerometer (including x_accel at t=i-1, at t=i, and at t=i+1 at three different states, and later vector data, including accelerometer vector data, is updated and trained based on the previous vector data in a Hidden Markov model 1000 in fig. 10 and p.6, para 75-78) for detecting whether the user's face is in proximity to the speech input device (different device poses indicating the relative position of the device to the user’s mouth and face in fig. 1) for 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the accelerometer output data and wherein it is based on the result obtained by comparing the pattern that indicates the temporal change in the output for detecting whether the user's face is in proximity to the speech input device, as taught by Burke, to the accelerometer output data for detecting whether the user’s face is in proximity to the speech input device, as taught by the combination of Hart and Kuruba, for the benefits discussed above.

Claims 7-10 are rejected under 35 U.S.C. 103 as being unpatentable over Hart (above) in view of Kuruba (above) and Iwamatsu (JP 2009-164747 A, IDS submitted on October 14, 2020).
Claim 7: the combination of Hart and Kuruba teaches all the elements of claim 7, according to claim 1 above, including detecting whether the user’s face is in proximity to the speech input device (Hart, determining whether a speaker is within a viewable range in figs. 3a-d, para [0039] and the discussion in claim 1 above), except for explicitly teaching that the detecting is performed according to a change in gain of the audio signal obtained through the sound collection.
Iwamatsu teaches an analogous field of endeavor by disclosing a speech input method (title and abstract, ln 1-10 and implementation of the system in fig. 1) wherein a change in gain of the audio signal obtained through the sound collection is disclosed (the distance is determined by the signal level detected by the microphone, para [0036]) and wherein whether the user’s face is in proximity to the speech input device is detected according to a change in gain of the audio signal obtained through the sound collection (distance is determined directly according to the relative level relationship, para [0039]), for the benefits of improving the performance of the voice pickup by reducing the proximity effect (para [0005]) and stabilizing the audio signal level regardless of the distance between the speaker and the sound pickup device (abstract).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the detecting according to a change in gain of the audio signal obtained through the sound collection, as taught by Iwamatsu, to the detection of whether the user’s face is in proximity to the speech input device in the method, as taught by the combination of Hart and Kuruba, for the benefits discussed above.
Claim 8: the combination of Hart, Kuruba, and Iwamatsu further teaches, according to claim 7 above, wherein in the detecting whether the user's face is in proximity to the speech input device, the detecting is performed according to a change observed between an average value of gains of the audio signal obtained through the sound collection in a first period, and an average value of gains of the audio signal obtained through the sound collection in a second period following the first period (Iwamatsu, signal level detection unit 20 sequentially detects the maximum amplitude of the audio signal Smic input of the audio input unit 10 and 
Claim 9: the combination of Hart, Kuruba, and Iwamatsu further teaches, according to claim 1 above, wherein in the detecting whether the user's face is in proximity to the speech input device, the detecting is performed according to a change in gain of a component at a predetermined frequency or lower of the audio signal obtained through the sound collection (Iwamatsu, the proximity effect is significant in the low frequency range, para [0003]-[0004], and a change in distance or proximity relies significantly on the signal level change in the low frequency band range, para [0023] and fig. 3).
Claim 10: the combination of Hart, Kuruba, and Iwamatsu further teaches, according to claim 9 above, wherein in the detecting, whether the user's face is in proximity to the speech input device is detected according to a change observed between an average value of gains of components at the predetermined frequency or lower of the audio signal obtained through the sound collection in a third period and an average value of gains of components at the predetermined frequency or lower of the audio signal obtained through the sound collection in a fourth period following the third period (Iwamatsu, sequentially detecting the sound level and the maximum of the sound level to average the sound level, para [0025], and at the low frequency band, para [0026]).
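For illustration only, the comparison of average low-frequency levels between two consecutive periods, as recited in claims 9 and 10, can be sketched as follows. The one-pole low-pass filter, the RMS level measure, the smoothing factor, and the 6 dB threshold are assumptions introduced for this sketch; they are not taken from Iwamatsu, which is cited above as detecting the maximum amplitude.

```python
import math
from typing import List, Sequence


def low_pass(samples: Sequence[float], alpha: float = 0.1) -> List[float]:
    """One-pole low-pass filter; alpha is a hypothetical smoothing factor."""
    out: List[float] = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def mean_level_db(samples: Sequence[float]) -> float:
    """Average level of a block as RMS in decibels (floored to avoid log of zero)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))


def proximity_detected(signal: Sequence[float], period_len: int,
                       threshold_db: float = 6.0) -> bool:
    """Compare the average low-frequency level of one period with that of
    the period that follows it; a rise of threshold_db or more is treated
    as an indication that the sound source has moved closer."""
    low = low_pass(signal)
    earlier = low[:period_len]
    later = low[period_len:2 * period_len]
    if len(earlier) < period_len or len(later) < period_len:
        return False  # not enough samples for two full periods
    return mean_level_db(later) - mean_level_db(earlier) >= threshold_db
```

The threshold and period length would need to be tuned to the microphone and sampling rate in any real use.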

Response to Arguments

Applicant's arguments filed on July 29, 2021 have been fully considered but are moot in view of the new ground(s) of rejection necessitated by the Applicant’s amendment. Although a new ground of rejection has been used to address additional limitations that have been added to claims 1 and 12, a response is considered necessary for several of the Applicant’s arguments since Hart and Kuruba continue to be used to meet several claimed limitations.
With respect to the prior art rejection of independent claim 1, and similarly claim 12, under 35 USC §103(a), as set forth in the Office Action, and regarding the claimed features “correction processing” and “a state” in which the “positional relationship between the … input device and the user’s face is variable”, the Applicant argued that Hart fails to disclose the above feature because “Nothing in Hart discloses or suggests that a portable speech input device is used in a state in which a positional relationship between the portable speech input device and the user’s face is variable in the detecting of whether a user’s face is in proximity to the portable speech input device”, but the Applicant indicated that Hart discloses “The image information captured by the image capture elements can then be analyzed, using any appropriate image analysis algorithm, to determine actions or movements performed by either person”, and Hart’s “computing device that detects the position of a person, e.g., the detection of a person close to the device and functions, e.g., directivity adjustment, gain adjustment, and filtering to perform correction processing on an audio signal obtained through a sound collection”, as asserted in paragraphs 5-7 of page 6 of the Remarks filed on July 29, 2021.
In response to the argument above, the Office respectfully disagrees because Hart’s detection of the proximity of the active user’s face to the audio input device (smartphone in fig. 1, 
The Applicant further challenged prior art Kuruba regarding the feature “correction processing includes a process of converting the single directivity into an omni-directional directivity to sense a user’s voice” and argued: “the idea of equipping a freely portable device with a unidirectional sound collection function is not disclosed by Kuruba” because Kuruba’s “omni-directivity is converted into single directivity to sense the voice of a user who is in proximity to the headset” which is “completely the inverse of the correction processing recited in claims 1 and 12”, and Kuruba’s “the user’s face and the position of a microphone are fixed by the headset”, as asserted in paragraphs 4-7 of page 7 of the Remarks filed on July 29, 2021.
In response to the argument above, the Office further disagrees because (1) the Applicant’s argument is self-contradictory in indicating that the user’s mouth is in proximity to the headset and then arguing that the relative distance between the user’s face and the headset is fixed; (2) as recognized by the Applicant, Kuruba clearly teaches that converting the omni-directivity into the single directivity to sense the voice of a user is based on the ambient sound level being high, and it is well known in the art that high ambient noise collected by Kuruba’s boom microphone does not lead to a situation in which the sound collection device is relatively closer to the user’s mouth; in fact, the reverse is true because low ambient noise is inherently present while the 
The Applicant further argued “hindsight” in relation to the feature “converting single directivity into omni-directivity to sense a user’s voice appears to be from the applicant’s own disclosure”, as asserted in paragraphs 1-2 of page 8 of the Remarks filed on July 29, 2021.
In response to the argument above, the Office further disagrees because, as discussed above and in the Response to Remarks section of the previous Office Action, Kuruba clearly discloses that an omnidirectional sound pickup function is obtained in the quiet mode, which is equivalent to a low ambient noise situation, i.e., the boom microphone is closer to the wearer’s mouth for picking up the wearer’s voice. Because the argued feature is taught by Kuruba, which predates the effective filing date of the claimed invention, no hindsight from the later-filed application’s disclosure or claims has been relied upon. See MPEP 2145 X(A), the Response to Remarks section of the previous Office Action, and In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).
On the basis of the above analysis and the evidence from the prior art, the prior art rejection of independent claims 1 and 12 under 35 USC §103(a), as set forth in the Office Action, is maintained. For at least similar reasons to those discussed above, the prior art rejection of dependent claims 3-11 is also maintained.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG whose telephone number is (571)270-5589.  The examiner can normally be reached on Monday-Friday 6:30am-4:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached on 571-272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESHUI ZHANG/