Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. CN201811296970.5, filed on 11/01/2018.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/27/2021 is being considered by the examiner.
Drawings
The drawing submitted on 10/30/2019 has been accepted by the examiner.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 03/24/2022 has been entered.



Response to Amendment
Claims 1-2, 6-8, and 13-25 are currently pending, and among them claims 1, 7, and 13 are independent claims. Claims 1, 6-7, and 12 have been amended and claims 14-25 have been added as new.
Response to Arguments
Applicant's arguments filed 07/05/2022 with respect to independent claims 1, 7, and 13 have been fully considered, but they are not persuasive for the following reasons:
Applicant Argument 1: To sum up, Rand fails to disclose any processes of performing a resampling processing on the first audio played by the mobile terminal to obtain a third audio, performing a dual channel to single channel processing on the third audio to obtain a fourth audio and eliminating, according to the fourth audio, the first audio played by the vehicle terminal in the recorded audio to obtain the second audio, i.e., Rand fails to disclose the above distinguishing features.
Examiner Response 1: The examiner respectfully disagrees with applicant's arguments, because support for applicant's arguments regarding the limitation above is nowhere to be found. For example, applicant argues that the resampling process according to applicant's invention is as follows: “Specifically, the target of the periodical sampling in Rand is aiming at a noise itself, while the target of the resampling as recited in amended claim 1 of the present application is aiming at the sampled noise data of the noise. The following is merely an example and does not limit the scope of the invention: there is a piece of noise A1-B1-C1-D1-E1, in Rand, a sampling rate of 44.1 KHz can be adopted for the noise, so as to obtain a noise audio data A2-B2-C2-D2-E2. On the contrary, in the solution as recited in amended claim 1 of the present application, a sampling rate of 32 KHz can be adopted to perform resampling on the sampled noise audio data A2-B2-C2-D2-E2, so as to obtain a resampled noise audio data A3-B3-C3-D3-E3.”
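The second step of applicant's example (resampling already-sampled data from 44.1 kHz to 32 kHz) can be pictured with the following minimal sketch. It is an illustration only: the function name `resample_linear`, the use of linear interpolation, and the sample values are assumptions, and neither Rand nor applicant's specification is relied upon as disclosing this particular algorithm.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a list of PCM samples from src_rate to dst_rate.

    Hypothetical helper: linear interpolation is assumed here only
    because it is simple; no particular method is stated in the record.
    """
    if not samples:
        return []
    # Number of output samples covering the same duration as the input.
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position on the source sample grid
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# 10 ms of already-sampled data at 44.1 kHz (the "A2-B2-..." data in the example)
sampled = [float(i) for i in range(441)]
# The same 10 ms expressed at 32 kHz (the "A3-B3-..." data in the example)
resampled = resample_linear(sampled, 44100, 32000)
```

In this sketch the input to the function is sampled data, not the acoustic noise itself, which is the distinction applicant's example attempts to draw.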
The examiner is not sure what applicant means by the argument “Specifically, the target of the periodical sampling in Rand is aiming at a noise itself, while the target of the resampling as recited in amended claim 1 of the present application is aiming at the sampled noise data of the noise.” The examiner finds no difference between the terms applicant uses to differentiate the prior art teaching, namely between recording sample audio to determine noise in Rand and “the sampled noise data of the noise” in applicant's argument. The examiner does not understand why Rand’s recording of a periodic sample of noise, obtained by closing the voice channel, ducking the voice channel, or delaying the voice channel from the recording of both the voice channel and the music channel, is not sampled noise data of the noise.
Further, there is no support in the specification for applicant's other example arguments. It is also not clear how the resampling described in the cited paragraphs [0071-0072] of the specification supports applicant's explanation and argument above ([0071]    specifically, "the first audio played by the mobile terminal" herein is the first audio played by the mobile terminal when the recorded audio of the current environment is obtained. [0072] The reason for performing the resampling processing on the first audio played by the mobile terminal is as follows: [0073]    due to the nature of a voice recognition module inside the mobile terminal, the voice recognition module may not be able to process the form of the first audio played by the mobile terminal. Therefore, the first audio played by the mobile terminal needs to be resampled to obtain the third audio. It can be understood that the third audio is an audio that matches the voice recognition module.).
The specification only recites performing a dual channel to single channel processing and does not further elaborate how the processing is done. The limitation is therefore broad and can be carried out in many different ways according to a specific inventive scope (e.g., echo cancellation, conversion from a stereo channel to a mono channel, filtering a certain channel frequency and closing one channel while leaving the other channel for processing, or superimposing two channels into one channel).
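Two of the alternatives listed above can be sketched as follows to show that "dual channel to single channel processing" admits more than one implementation. The function names, the interleaved sample layout, and the sample values are hypothetical and are chosen only for illustration.

```python
def stereo_to_mono(interleaved):
    """Superimpose two channels into one by averaging each L/R frame.

    Assumes interleaved samples [L0, R0, L1, R1, ...]; this layout is an
    assumption for the sketch, not something stated in the record.
    """
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even length")
    return [
        (interleaved[i] + interleaved[i + 1]) / 2.0
        for i in range(0, len(interleaved), 2)
    ]


def keep_left_channel(interleaved):
    """Close one channel and leave the other channel for processing."""
    return interleaved[0::2]


frames = [1.0, 3.0, -2.0, 2.0]          # two stereo frames: (1, 3) and (-2, 2)
mono_mixed = stereo_to_mono(frames)      # average of each frame
mono_left = keep_left_channel(frames)    # left samples only
```

Both functions take dual channel data and return single channel data, yet they produce different results, which is why the limitation reads broadly absent further detail in the claim.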
Further, applicant’s specification contains teaching that supports the examiner's interpretation of the dual channel to single channel processing, even though the specification does not explicitly recite it as such: [0087]   the voice recognition module subtracts the transmission delay duration from the first duration to obtain a second time, and determines an audio received by the voice recognition module at the second time as the reference audio corresponding to the recorded audio, and a first time is a time when the voice recognition module receives the recorded audio. [0088]    Where the reference audio corresponding to the recorded audio is the third audio or the fourth audio described above.
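The time relationship recited in paragraph [0087] can be read as a simple subtraction followed by a lookup, sketched below. The function name `select_reference`, the millisecond timestamps, the buffer of (receipt time, chunk) pairs, and the nearest-time selection rule are all assumptions made for illustration; the specification does not state how the audio "received at the second time" is located.

```python
def select_reference(received, first_time_ms, transmission_delay_ms):
    """Return the buffered audio chunk received at the 'second time'.

    received: list of (receipt_time_ms, chunk) pairs (hypothetical layout).
    The second time is the first time minus the transmission delay, as in
    [0087]; picking the nearest buffered chunk is an assumption.
    """
    second_time_ms = first_time_ms - transmission_delay_ms
    nearest = min(received, key=lambda item: abs(item[0] - second_time_ms))
    return nearest[1]


# Reference audio chunks as they arrived at the voice recognition module.
received = [(1000, "chunk_a"), (1020, "chunk_b"), (1040, "chunk_c")]
# Recorded audio arrives at t = 1060 ms with an assumed 40 ms transmission delay.
reference = select_reference(received, first_time_ms=1060, transmission_delay_ms=40)
```

Here the second time is 1020 ms, so the chunk received at that time is selected as the reference audio corresponding to the recorded audio.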
Therefore, applicant’s argument that the prior art teaching cited by the examiner, “Closing voice channel or delaying voice channel for recording background music played from car audio system of periodic sample,” is not “performing dual channel to single channel processing” is not persuasive, since applicant's disclosure is silent on any detailed dual channel to single channel process that differs from the examiner's interpretation based on the cited prior art teaching.
Further, the current amendment to independent claims 1 and 7 merely rolls the limitation of claim 5, which was rejected before, into independent claims 1 and 7. Therefore, all rejections of the pending claims remain the same.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: voice recognition module, in claims 14, 18-20, and 24-25.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
Claims 14 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 14 and 20 recite the limitation “audio is an audio that matches a voice recognition module in the mobile terminal”. Audio recognition based on matching a voice model or sample stored in a voice recognition module is well known; however, it is not clear from the specification how an audio can match a voice recognition module. Therefore, the claims are indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor regards as the invention. For examination purposes, the examiner will interpret the limitation as “resample audio captured as a noise at the microphone, i.e. noise is music coming through speakers and leaking into a microphone, the approximate noise signal can be predicted/estimated from the music signal that is being fed to the speakers”.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-2, 6-8, and 12-25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Rand (US 2016/0036962 A1).

Regarding Claims 1 and 7, Rand teaches: An audio processing method, wherein a mobile terminal and a vehicle terminal are in a connected state, the method is applied to the mobile terminal, and the method comprises: playing a first audio (phone playing music) synchronously with the vehicle terminal(outputting phone music through pairing or Bluetooth connection with car audio system or speakers), wherein an amplitude corresponding to the first audio when being played by the mobile terminal is 0 ([0100] Media cancellation: Because existing noise-cancellation mechanisms assume noise is ambient and unpredictable, prediction and cancellation can be improved by considering the source of the noise, which is often the same as the device that can cancel it (e.g. a phone playing music that comes through a car's speakers with which the phone is paired or connected). [0193] To illustrate this concept, imagine a smart phone is being used to play music through a car's speaker system via a Bluetooth connection (or a cord). Note: in a paired or Bluetooth or cord connection the output amplitude corresponding to the phone music through phone speaker is 0, since the music is outputted through car speakers.); obtaining a recorded audio of a current environment, wherein the recorded audio comprises the first audio played by the vehicle terminal (music) and a second audio (voice signals) for voice recognition ([0098] Auto Ducker: this provides a novel technique to mix VoIP and background music. [0099] Adjustable Speech Recognition Setting: would be used to improve voice detection and work in conjunction with the auto ducker. [0101] Voice & Audio Clip: Traditional audio clips are one of two things: a clip inserted from an existing audio file, or a recording of the microphone. 
Embodiments of the invention are designed to enable the simultaneous recording of background streams with voice over as detected through the microphone, in real time, splitting the audio streams into two pieces but recording them simultaneously and having an option to process each individually before superimposing into a single signal.); and eliminating, according to the first audio played by the mobile terminal, the first audio played by the vehicle terminal (music signal that is being fed to the car speakers ) in the recorded audio (recording voice while music playing through car speakers, i.e. recording audio signals from media and voice signals together) to obtain the second audio(voice); wherein the eliminating, according to the first audio played by the mobile terminal, the first audio played by the vehicle terminal in the recorded audio to obtain the second audio comprises: performing a resampling processing (periodically sample the background noise, i.e. the music signal that is being fed to the speakers by ducking voice frequency range) on the first audio played by the mobile terminal to obtain a third audio (sample background noise, i.e. microphone recording the background  music output by car audio system as periodically sample background noise); performing a dual channel to single channel (ducking or Closing voice channel or delaying voice channel for recording background music played from car audio system as periodic sample) processing on the third audio to obtain a fourth audio (The opposite signal based on the noise signal) ([0101] Voice & Audio Clip: Traditional audio clips are one of two things: a clip inserted from an existing audio file, or a recording of the microphone. 
Embodiments of the invention are designed to enable the simultaneous recording of background streams with voice over as detected through the microphone, in real time, splitting the audio streams into two pieces but recording them simultaneously and having an option to process each individually before superimposing into a single signal.); and eliminating, according to the fourth audio (The opposite signal based on the noise signal could then be superimposed at the microphone in order to cancel it), the first audio played by the vehicle terminal in the recorded audio to obtain the second audio ([0098] A further refinement would include the ability to duck a specific frequency range, generally the voice frequency range. [0100] Media cancellation: Because existing noise-cancellation mechanisms assume noise is ambient and unpredictable, prediction and cancellation can be improved by considering the source of the noise, which is often the same as the device that can cancel it (e.g. a phone playing music that comes through a car's speakers with which the phone is paired or connected). In one sense, media cancellation would be similar to echo cancellation, in the sense that a known signal is subtracted at a point where it is not desired, such as a microphone. The architecture of the code required to determine the signal to cancel, however, would be particular to the embodiment of the system comprising audio signals from media and voice signals together. [0186] Settings can be accessed to select the size (e.g. recording time) of the buffer and when it records (e.g. automatic when a channel is closed to incoming voice). [0192] For example, when the adjustable speech recognition setting is activated, a delay could be used on PTT messages such that the first 1000 ms (or another amount of time) would record only background noise, after which the user would be prompted to “Speak Now”. 
The subsequent recording period would record both the user's voice signal and the background noise, but the background noise could be more accurately reduced given the noise signal's approximate form as determined during the initial delay. [0193] For example, if the noise is music coming through speakers and leaking into a microphone, the approximate noise signal can be predicted/estimated from the music signal that is being fed to the speakers. The opposite signal could then be superimposed at the microphone in order to cancel it (the time offset could be compensated for by a pre-set delay or by comparing the predicted to the actual noise signal). To illustrate this concept, imagine a smart phone is being used to play music through a car's speaker system via a Bluetooth connection (or a cord). The phone is thus aware of the music signal being sent to the car's speakers. Now suppose that a Voice Over IP connection is also running simultaneously, such that the person driving the car can be listening to music and talking to whoever is connected at the same time. The music playing out loud in the car could be cancelled at the microphone with great precision because the approximate noise signal is already known to the phone in advance. This methodology would enable voice communication simultaneously with background music or other media (like movie & game sounds) even when loudspeakers are used. [0206] Likewise, automated methods could be used in order to periodically sample the background noise through the periodic activation and recording of the microphone. [00241] In cases of increased integration with the media delivery, the delivery of video and audio could both be delayed intentionally by the same amount, providing a time buffer for Media Cancellation processing.).
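The media cancellation relied upon from Rand [0193] (a known music signal subtracted at the microphone, with the time offset found by comparing the predicted signal to the actual signal) can be pictured with the following minimal sketch. It is not code disclosed by Rand: the function names `estimate_delay` and `cancel_reference`, the brute-force cross-correlation search, the unit gain, and the sample values are hypothetical.

```python
def estimate_delay(recorded, reference, max_delay):
    """Return the lag (in samples) where reference best matches recorded.

    Assumed approach: brute-force cross-correlation over lags 0..max_delay.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        score = sum(
            recorded[i + lag] * reference[i]
            for i in range(len(reference))
            if i + lag < len(recorded)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def cancel_reference(recorded, reference, delay, gain=1.0):
    """Superimpose the opposite of the delayed reference onto the recording."""
    out = list(recorded)
    for i, ref_sample in enumerate(reference):
        j = i + delay
        if j < len(out):
            out[j] -= gain * ref_sample
    return out


music = [0.0, 1.0, -1.0, 0.5, 0.0, -0.5]                 # signal fed to the speakers
voice = [0.0, 0.0, 0.0, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0]    # speech at the microphone
recorded = list(voice)
for i, m in enumerate(music):                            # music leaks in 2 samples late
    recorded[i + 2] += m

delay = estimate_delay(recorded, music, max_delay=3)
residual = cancel_reference(recorded, music, delay)      # approximately the voice
```

Because the phone already knows the music signal it sent to the car speakers, the recorded audio can be reduced to the voice once the offset is compensated.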

Regarding Claims 2 and 8, Rand teaches:  The method according to claim 1, wherein the method further comprises: caching the first audio locally before playing the first audio synchronously with the vehicle terminal ([0186] Recording Buffer: A buffer can store messages for playback. Settings can be accessed to select the size (e.g. recording time) of the buffer and when it records (e.g. automatic when a channel is closed to incoming voice). [00241] In cases of increased integration with the media delivery, the delivery of video and audio could both be delayed intentionally by the same amount, providing a time buffer for Media Cancellation processing.).

Regarding Claims 6 and 12, Rand teaches: The method according to claim 1, wherein the eliminating, according to the fourth audio, the first audio played by the vehicle terminal in the recorded audio to obtain the second audio comprises: eliminating the first audio played by the vehicle terminal in the recorded audio to obtain the second audio by taking the fourth audio as a reference (sample) audio (see rejection of claim 1).

Regarding Claim 13, Rand teaches: A nonvolatile memory, wherein the nonvolatile memory has stored thereon a program or an instruction, wherein the method of claim 1 is executed when the program or the instruction is operated on a computer (See rejection of claim 1 and [0035] Yet a further aspect of the present invention is a computer-readable medium comprising instructions in code which are stored in a memory of a mobile device and executable by a processor of the mobile device to cause the mobile device to execute an integrated conversation and music management application for selectively listening to music and conversing with one of a plurality of other users and to cause the mobile device to display a management screen for managing music (or other audio content such as TV, news, etc.) and conversations with the other users. ).

Regarding Claims 14 and 20, Rand teaches: The method according to claim 1, wherein the third audio is an audio that matches a voice recognition module in the mobile terminal, and the voice recognition module is configured to eliminate the first audio played by the vehicle terminal in the recorded audio (See rejection of claim 1, specifically: [0101] Voice & Audio Clip: Traditional audio clips are one of two things: a clip inserted from an existing audio file, or a recording of the microphone. Embodiments of the invention are designed to enable the simultaneous recording of background streams with voice over as detected through the microphone, in real time, splitting the audio streams into two pieces but recording them simultaneously and having an option to process each individually before superimposing into a single signal. [0193] For example, if the noise is music coming through speakers and leaking into a microphone, the approximate noise signal can be predicted/estimated from the music signal that is being fed to the speakers. The opposite signal could then be superimposed at the microphone in order to cancel it (the time offset could be compensated for by a pre-set delay or by comparing the predicted to the actual noise signal).).

Regarding Claims 15-17 and 21-23, Rand teaches: The method according to claim 1, wherein the third audio is dual channel data, and the fourth audio is single channel data (See rejection of claim 1, specifically [0101] Voice & Audio Clip: Traditional audio clips are one of two things: a clip inserted from an existing audio file, or a recording of the microphone. Embodiments of the invention are designed to enable the simultaneous recording of background streams with voice over as detected through the microphone, in real time, splitting the audio streams into two pieces but recording them simultaneously and having an option to process each individually before superimposing into a single signal.).

Regarding Claims 18-19 and 24-25, Rand teaches: The method according to claim 6, wherein the voice recognition module is configured to perform time calibration (time offset by pre-set delay) on the reference audio (sample) and the recorded audio (See rejection of claim 1, specifically [0193] For example, if the noise is music coming through speakers and leaking into a microphone, the approximate noise signal can be predicted/estimated from the music signal that is being fed to the speakers. The opposite signal could then be superimposed at the microphone in order to cancel it (the time offset could be compensated for by a pre-set delay or by comparing the predicted to the actual noise signal).).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record Layton et al. (US 2016/0098989 A1) teach: [0005] In order for speech recognition systems to achieve reasonable recognition rates the audio captured from the microphone may be processed to reduce noise and/or echo. For example, a speech recognition system operating on a mobile phone may utilize the mobile phone's built-in echo canceller/noise suppressor to process the audio captured by the microphone. In some configurations, the speech recognition system does not operate on the same device as the microphone. For example, a wireless headset may capture the audio, process the audio and then transmit the audio to a mobile phone that handles the speech recognition. In an alternative example, an automobile headunit may capture and process the audio and send the resulting audio to a mobile phone or a cloud based server for speech recognition. The audio captured from inside an automobile may be problematic to the speech recognition system because there may be many sources of audio to confuse the speech recognition system. An automobile may have many different audio sources including navigation prompts, music, chimes/gongs and text to speech output. Each of these audio sources may be captured in the microphone signal that is sent to the speech recognition system. There is a need for improved processing of audio captured in an automobile or other similar environments for use in voice recognition systems. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached on Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on 571-272-7453.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2656