DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In response to the Office Action mailed 1/29/2021, applicant has submitted an amendment filed 7/29/2021.
Claims 1-3, 6, 9, 11-14, 17-20, 22, 26-27, and 29-31 have been amended.  Claims 5, 7, 21, and 28 have been cancelled.  New claims 32-33 have been added.
Response to Arguments
Further search and consideration led to discovery of references that could be applied to reject what was previously indicated as allowable subject matter (see rejections for full detail).  This Office Action is, therefore, non-final, and includes new prior art rejections and 112 rejections.
Claim Interpretation
In claim 1, “the frames” in the 3rd to last line is interpreted as having implicit antecedent basis based on “the received signal is processed frame by frame” in the 4th to last line (a signal logically cannot be processed “frame by frame” if there are not multiple frames to be processed one at a time).
In claims 22, 32, and 33, “sharpening an amplitude envelope of each of the plurality of tone signals within the audio signal” is interpreted as where each tone signal 
Claim Objections
Applicant is advised that should claims 18 and 24 be found allowable, claims 24 and 18 (respectively) will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 13-20, 23-25, 27, and 30-31 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

As per Claim 13 (and similarly claims 30-31):
“the sequence of tones within the audio signal” at the end of claim 13 lacks antecedent basis.  Line 3 recites “encoding data into an audio signal using a sequence of tones”, which is not limited to generating an audio signal comprising a sequence of tones as the encoding of the data.  For example, the sequence of tones could be a data form of the tones (e.g. in frequency spectrum data) used to determine encoded data (derived from but not necessarily including the sequence of tones) which are then output in audio form.  Therefore, “encoding data into an audio signal using a sequence of tones” does not necessarily/inherently provide antecedent basis for “the sequence of tones within the audio signal” at the end of claim 13.

Claim 17 was amended to depend on claim 15 and, as a result, “the interference” in claim 17 is now ambiguous, because claim 15 recites “interference” in “predictions of interference” (in line 2) and claim 13 recites “environmental interference” (lines 4-5) and, as claimed, it is not clear which “interference” “the interference” in claim 17 is supposed to refer to (Applicant fairly clearly meant to refer to “interference” in claim 15, but “environmental interference” is still “interference” that can be referenced by “the interference”).

As per claim 19, “at least some of the tones of the sequence of tones” in line 3 is ambiguous.  Claim 13 recites “a sequence of tones” (line 3) which inherently includes tones, and claim 13 also recites “at least some tones of the sequence of tones” (2nd to last line) which are also “tones of the sequence of tones” which are not necessarily the same tones inherently included in “a sequence of tones”.  At a minimum, it is not clear if nd to last line of claim 13).

As per Claim 20, “the at least some of the frequencies” lacks antecedent basis.  Claim 19 recites “frequencies of at least some of the tones of the sequence of tones”.

Claim 25 recites “the step of acoustically transmitting the audio signal for receipt by a microphone” which lacks antecedent basis.

Claim 27 recites “a second device comprising a microphone for acoustically receiving the audio signal” where “the audio signal” has antecedent basis from “an audio signal” in “a first device comprising a speaker for acoustically transmitting an audio signal encoding the data”, and claim 27 also recites “process the received audio signal encoding the data to minimize environmental interference within the received signal”.  
As claimed, “the received audio signal encoding the data” and “the received signal” in “process the received audio signal encoding the data to minimize environmental interference within the received signal” are ambiguous, because the plain meaning of “the received audio signal encoding the data” is what was transmitted by the first device (i.e. the audio signal without interference), and because “process the received audio signal encoding the data to minimize environmental interference within with “interference”, where the first device only transmits the audio signal, and does not transmit the interference).
Applicant’s intended meaning is clear (process the “audio signal encoding the data”+”interference” signal captured by the second device’s microphone to minimize “interference”) but as claimed “the received audio signal encoding the data” and “the received signal” can refer to either:
1. “an audio signal encoding the data” in “a first device comprising a speaker for acoustically transmitting an audio signal encoding the data”, which provides explicit antecedent basis for “the received audio signal encoding the data” but which does not include “environmental interference” and does not need any processing to minimize environmental interference because it is the “noiseless” component of what is received by the second device’s microphone, and is what is acoustically transmitted by the first device (where the first device does not transmit interference in the acoustically transmitted audio signal)
or
2. a “received audio signal”/”received signal” which has inherent (not explicit) antecedent basis as a result of “a microphone… acoustically receiving the audio signal” which includes the acoustically transmitted audio signal and environmental interference, (such that there is environmental interference within the received signal to be minimized) and which does not have antecedent basis from “an audio signal encoding 
“the received signal” in the 4th to last line of claim 27 is similarly ambiguous.
For the purposes of applying art, the examiner has interpreted “the received audio signal encoding the data” and “the received signal” in lines 8-9 and the 4th to last line of claim 27 as referring to interpretation 2 (i.e. the signal received by the microphone of the second device which includes the acoustically transmitted audio signal encoding the data and environmental interference).

The dependent claims include the issues of their respective parent claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 13, 25, 30, and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Takara et al. (US 2013/0010979), hereafter Takara, in view of Werner et al. (US 2007/0144235), hereafter Werner.

As per Claim 13, Takara suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal…, wherein the audio signal is configured to minimize environmental interference… (paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
Paragraph 11 describes “convert[ing] various types of encoded information into at least one acoustic wave in the audible spectrum and carries out transmission thereof” [suggesting, among other things, that the acoustic wave is audible].  Paragraph 20 describes “messages, URLs, and/or various other types of information are sent as acoustic waves, with air serving as medium, from a transmitter to a receiver, the transmitter transmitting acoustic waves from a speaker, and the receiver receiving these acoustic waves by means of a microphone and carrying out decoding for recognition of the transmitted information”.  Paragraph 21 describes “carrier waves which take into consideration the ambient sound present at the location at which the transmitter and receiver are installed, i.e., the location where sound pressure oscillation information [hereinafter "sonic code"] in the form of acoustic waves is transmitted, are used to send sonic code”, and examples of ambient sound which are “sounds in the environs of the location at which the sonic code is transmitted”, and where music is another example of ambient sound.  Paragraphs 28-29 describe a hardware and software embodiment of a transmitter 10.  Paragraphs 30-36 and 40 and Figure 2 describe where ambient sound [which can be music or other sounds as per paragraph 21] is received [paragraph 31] and converted into the frequency domain [paragraph 32] where a peak frequency is detected [paragraph 33] where carrier waves are generated based on the peak 
These portions suggest “A method for encoding data for acoustic transmission, including encoding data into an audio signal…;” [converting/”encoding” message/URL/information “data” into a sonic code analog “audio signal” and then into audible acoustic waves that are acoustically transmitted via a speaker, where the conversion is based on ambient sound] and “wherein the audio signal is configured to 
Takara does not, but Werner suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by configuring the sequence of tones to insert space between at least some tones of the sequence of tones within the audio signal (paragraphs 18-20, 24-27;
Werner, like Takara, describes an audible signal which is used for outputting information [paragraph 27, see also paragraphs 18-20]. 
Werner more specifically describes where an audible signal which is for outputting information can comprise a sequence of tones [paragraphs 24-27], and 
Werner thus suggests “A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by configuring the sequence of tones to insert space between at least some tones of the sequence of tones within the audio signal”: where the sonic code in Takara is more specifically a sequence of tones including pauses between at least some of the tones [where, to generate a tone sequence including pauses between tones, it is at least suggested that pauses/”spaces” are placed/”inserted” between tones, because a pause between tones logically cannot be a pause between tones unless the tones bordering the pause are determined to exist])
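For illustration only, the kind of signal discussed above (a sequence of tones with pauses placed between at least some of the tones) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Takara, Werner, or the application: the function name, the symbol-to-frequency mapping, and all parameter values are hypothetical.

```python
import math

def tone_sequence_with_gaps(symbols, sample_rate=8000, tone_s=0.05, gap_s=0.02,
                            base_hz=1000.0, step_hz=100.0):
    """Map each symbol to one tone and insert silence between consecutive tones.

    All names and default values are hypothetical, chosen only for illustration.
    """
    tone_n = round(sample_rate * tone_s)   # samples per tone
    gap_n = round(sample_rate * gap_s)     # samples of silence per pause
    samples = []
    for i, sym in enumerate(symbols):
        freq = base_hz + step_hz * sym     # hypothetical symbol-to-frequency rule
        samples.extend(math.sin(2 * math.pi * freq * n / sample_rate)
                       for n in range(tone_n))
        if i < len(symbols) - 1:           # pause only between tones, not after the last
            samples.extend([0.0] * gap_n)
    return samples
```

With the defaults, three symbols yield three 400-sample tones separated by two 160-sample pauses.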
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of audio signal for outputting information with another because the prior art teaches the claimed invention except for the substitution of an audio signal for outputting information which is not necessarily a sequence of tones which includes pauses between tones with an audio signal for outputting information which is.  Werner teaches that an audio signal for outputting information which is a sequence of tones which includes pauses between tones was known in the art.  One of ordinary skill in the art could have substituted one type of audio signal for outputting information with another to obtain the predictable results of a device which converts message/URL/information into a sonic code and then into audible acoustic waves that are acoustically transmitted via a speaker (as per 

As per Claims 30-31, they are directed to apparatus and computer readable medium equivalents of claim 13 and so are rejected under similar rationale (see paragraph 23 which describes an embodiment of the transmitter which is a personal computer, and where the personal computer includes a processor and a hard drive for storing programs.)

As per Claim 25, Takara suggests acoustically transmitting the audio signal for receipt by a microphone (paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
Paragraph 40 describes where the sonic code analog “audio signal” is transmitted as acoustic waves from a speaker, and paragraphs 23-24 describe where the speaker transmits sound constituting sonic code, and where a microphone picks up the sonic code transmitted from the speaker.)

Claims 1, 2, 3, 6, 12, 26, 27, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Takara et al. (US 2013/0010979), hereafter Takara, in view of Lee (US 2010/0267340), Nieto et al. (US 9,118,401), hereafter Nieto, and Otani et al. (US 2007/0232257), hereafter Otani.

As per Claim 1, Takara suggests A method for receiving data transmitted acoustically, including: receiving an acoustically transmitted signal encoding the data; processing the received signal to minimize environmental interference within the received signal; and decoding the processed signal to extract the data, wherein the data is encoded within the transmitted signal using a sequence of tones… (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
To provide context for the processes performed by the “decoder” in Takara: 
Paragraph 11 describes “convert[ing] various types of encoded information into at least one acoustic wave in the audible spectrum and carries out transmission thereof” [suggesting, among other things, that the acoustic wave is audible].  Paragraph 20 describes “messages, URLs, and/or various other types of information are sent as acoustic waves, with air serving as medium, from a transmitter to a receiver, the transmitter transmitting acoustic waves from a speaker, and the receiver receiving these acoustic waves by means of a microphone and carrying out decoding for recognition of the transmitted information”.  Paragraph 21 describes “carrier waves which take into consideration the ambient sound present at the location at which the transmitter and receiver are installed, i.e., the location where sound pressure oscillation information [hereinafter "sonic code"] in the form of acoustic waves is transmitted, are used to send sonic code”, and examples of ambient sound which are “sounds in the environs of the location at which the sonic code is transmitted”, and where music is another example of ambient sound.  Paragraphs 28-29 describe a hardware and software embodiment of a transmitter 10.  Paragraphs 30-36 and 40 and Figure 2 describe where ambient sound 
These portions suggest “A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones;” 
For Claim 1, Figure 5 and Paragraphs 61-68 describe a “decoder” [corresponding to the “encoder” of Figure 2] that receives the sonic code 150 [generated by Figure 2] superimposed on ambient sound [see paragraphs 61, 63, 66-67], captures ambient sound as it exists when no sonic code is present and subtracts, from the ambient sound on which the sonic code has been superimposed, the captured ambient sound to extract the sonic code [paragraphs 62-63, 66-67] and then determines the baseband signal [baseband signal/data 110 that was encoded into a sonic code in Figure 2] from the 
The portions discussed in the previous paragraph [given the context of the “encoder” discussed above] suggest: “A method for receiving data transmitted acoustically” [receiving sonic code audio signal data that was transmitted acoustically from the transmitter via the speaker at the transmitter] 
“including: receiving an acoustically transmitted signal encoding the data;” [receiving sound of an audio signal including the sonic code superimposed on ambient sound, where the sound of the audio signal including the sonic code superimposed on ambient sound can be interpreted as “an acoustically transmitted signal” in the sense that it is a physical sound “signal” that was “transmitted acoustically” by sound sources 
“processing the received signal to minimize environmental interference within the received signal;”: subtracting the captured ambient sound from the “received”/data form of the sonic code+ambient sound “acoustically transmitted signal encoding data” to remove/”minimize” the ambient sound that is “within the” “received”/data form of the sonic code+ambient sound “signal”, where, based on paragraph 21, the ambient sound can be interpreted as “environmental interference” in the background/environmental noise sense, and in the sense that the ambient sound is sound in the “environs” that “interferes” with the system receiving only the transmitted sonic code
“and decoding the processed signal to extract the data,”: [“decoding” the sonic code signal by obtaining the carrier waves from the sonic code signal and obtaining, based on the carrier waves, the baseband message/URL/information “data” that was “encoded” in the sonic code signal, where the sonic code signal is the result of “processing” that removes the ambient sound from the “received”/data-form of the ambient sound+sonic code “signal”] 
“wherein the data is encoded within the transmitted signal using a sequence of tones…”: the message/URL/information baseband “data” is added-to/incorporated-into/“encoded within” “the transmitted” ambient sound+sonic code “signal” by outputting, via the speaker, a sonic code sound into the environment, where the 
Takara does not, but Lee suggests the received signal is processed frame by frame (“transmitter signal, or a speech source signal, is input through the microphone… A/D converter… transmitter signal may include a speech signal and/or a noise signal”, paragraph 39; “divides the A/D-converted transmitter signal into units of a predetermined frame. The first FFT unit 104 transforms the transmitter signal divided on a frame-by-frame basis into a frequency-domain signal by performing FFT. The first FFT unit 104 outputs the frequency-domain signal transformed from the transmitter signal to the transmitter noise estimator 106 frame by frame”, paragraph 40; “determines if each frame of the transmitter signal is a frame with a speech signal, or a frame with a noise signal”, paragraph 41; Figure 1;
	As discussed in the portion of this rejection of claim 1 based on Takara, Takara suggests where the sonic code+ambient sound is received by a microphone and 
	Lee similarly describes where a multi-component audio signal [speech+noise] is input through a microphone, converted into a digital signal via an A/D converter, and an FFT is performed afterwards, and audio processing is performed on the FFT output [paragraphs 39-41, Figure 1] and more specifically describes where the digital signal is divided into frames and where the FFT is performed on a frame-by-frame basis and the audio processing is performed on each frame.
	Lee thus suggests where Takara’s “decoder” processes the ambient-sound+sonic code audio “signal” in its “received”/data form frame-by-frame, for example by adding the framer between the A/D converter and the FFT of Figure 5 in Takara and changing the FFT into a frame-by-frame FFT and the remaining processes of Figure 5 in Takara to frame-by-frame processes [“wherein the received signal is processed frame by frame”])
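	For illustration only, the frame-by-frame arrangement discussed above (a framer placed ahead of the transform, with the transform and later processing applied to one frame at a time) can be sketched as follows. This is a minimal sketch and not the implementation of Takara or Lee: the function names are hypothetical, the frames are assumed to be non-overlapping, and a naive discrete Fourier transform stands in for the FFT (an FFT yields the same values with less computation).

```python
import cmath

def split_into_frames(samples, frame_len):
    """Divide a digitized signal into consecutive, non-overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def dft(frame):
    """Naive discrete Fourier transform, used here in place of an FFT."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def process_frame_by_frame(samples, frame_len):
    """Transform each frame separately and return per-bin magnitudes per frame."""
    return [[abs(c) for c in dft(frame)]
            for frame in split_into_frames(samples, frame_len)]
```

Any later per-frame step (for example noise estimation or subtraction) would operate on one inner list at a time.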
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of audio processing that uses an A/D converter and an FFT with another because the prior art teaches the claimed invention except for the substitution of audio processing that uses an A/D converter and an FFT which does not necessarily process audio frame-by-frame with audio processing that uses an A/D converter and an FFT which does.  Lee teaches that audio processing that uses an A/D converter and an FFT which processes audio frame-
Takara, in view of Lee, do not, but Nieto suggests and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a… value of the corresponding bin… (col. 5, lines 52-62; col. 7, line 15 - col. 8, line 20; col. 13, line 51 – col. 15, line 13;
The combination [thus far] is as discussed in the portion of this rejection of claim 1 based on Lee, including where Takara’s “decoder” processes the ambient-sound+sonic code audio “signal” in its “received”/data form frame-by-frame, for example by adding the framer between the A/D converter and the FFT of Figure 5 in Takara and changing the FFT into a frame-by-frame FFT and the remaining processes of Figure 5 in Takara to frame-by-frame processes [“wherein the received signal is processed frame by frame”]
In Nieto, col. 7, lines 15-30 and col. 7, line 54 – col. 8, line 20 suggests calculating an instantaneous magnitude of each frequency bin of an FFT.  Col. 7, lines 15-30 teaches where an average magnitude value is determined for each frequency-
Nieto suggests “and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a… value of the corresponding bin…”: where one of the frame-by-frame processes [performed by the Takara/Lee combination] includes calculating an instantaneous magnitude for each bin of a frame’s FFT, determining an average magnitude for each bin of the frame’s FFT, calculating a mask value for each bin of the frame’s FFT based on a respective bin’s average magnitude value, and adjusting the instantaneous magnitude for each bin of the frame’s FFT based on a respective bin’s calculated mask value.)
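	For illustration only, the per-bin sequence described above (instantaneous magnitude, average magnitude, mask value derived from the average, adjusted magnitude) can be sketched as follows. The mask rule used here is a hypothetical stand-in chosen for simplicity; the discussion above does not state the formula Nieto uses, and the function name and the `floor` parameter are likewise assumptions.

```python
def apply_bin_masks(instantaneous, average, floor=0.1):
    """Adjust each bin's instantaneous magnitude by a mask based on that bin's average.

    Hypothetical mask rule: mask = max(floor, 1 - average / instantaneous).
    Bins that barely exceed their average are attenuated toward `floor`.
    """
    adjusted = []
    for inst, avg in zip(instantaneous, average):
        if inst > 0.0:
            mask = max(floor, 1.0 - avg / inst)
        else:
            mask = floor                  # avoid division by zero for empty bins
        adjusted.append(inst * mask)      # the bin's magnitude is modified by its mask
    return adjusted
```

Under this assumed rule, a bin at four times its average keeps three quarters of its magnitude, while a bin equal to its average is scaled down to the floor.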
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to combine prior art elements according to known methods because the prior art included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference (Takara and Lee suggest receiving an audio signal, performing A/D conversion on the audio signal, performs frame-by-frame FFT, and performs frame-by-
	Takara, in view of Lee and Nieto, do not, but Otani suggests and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a magnitude value of the corresponding bin in a preceding frame (paragraph 12;
	The combination [thus far] is as discussed in the portion of this rejection of claim 1 based on Nieto, including where one of the frame-by-frame processes [performed by the Takara/Lee combination] includes calculating an instantaneous magnitude for each bin of a frame’s FFT, determining an average magnitude for each bin of the frame’s FFT, calculating a mask value for each bin of the frame’s FFT based on a respective bin’s average magnitude value, and adjusting the instantaneous magnitude for each bin of the frame’s FFT based on a respective bin’s calculated mask value.
	Nieto’s average is not specifically based on a preceding frame’s magnitude value.

	Otani thus suggests “and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a magnitude value of the corresponding bin in a preceding frame”: where the average magnitude determined for each bin of a frame’s FFT in the Takara/Lee/Nieto combination is calculated by averaging instantaneous magnitudes of a respective bin, where the instantaneous magnitudes of a respective bin include instantaneous magnitudes of past frames [such that, for each bin, the instantaneous magnitude of the bin is adjusted/”modified” based on a mask value of a respective bin that is based on an average magnitude of a respective bin that is based on “a magnitude value of the corresponding bin in a preceding frame”])
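	For illustration only, a per-bin average that includes the magnitudes of preceding frames can be sketched as follows. This is a minimal sketch and is not Otani's method: the class name, the fixed-length history window, and the use of an unweighted mean are all assumptions made for the example.

```python
from collections import deque

class PerBinHistoryAverage:
    """Keep a short history of magnitudes for every FFT bin (hypothetical helper)."""

    def __init__(self, num_bins, history=4):
        # One bounded history per bin; the oldest frames fall out automatically.
        self._history = [deque(maxlen=history) for _ in range(num_bins)]

    def update(self, magnitudes):
        """Add the current frame and return each bin's mean over the stored frames."""
        averages = []
        for past, mag in zip(self._history, magnitudes):
            past.append(mag)
            averages.append(sum(past) / len(past))
        return averages
```

From the second frame onward, each returned average depends on the corresponding bin's magnitude in at least one preceding frame, so any mask derived from it does as well.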
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of averaging with another because the prior art teaches the claimed invention except for the substitution of averaging which does not necessarily average magnitudes of preceding frames with averaging which does.  Otani teaches that averaging magnitudes of preceding frames was known in the art.  One of ordinary skill in the art could have substituted one type of averaging with another to obtain the predictable results of a device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic 

As per Claim 2, Takara suggests wherein the acoustically transmitted signal is human-audible (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
As discussed in the rejection of claim 1, the sound of the audio signal including the sonic code superimposed on ambient sound is interpreted as an “acoustically transmitted signal encoding data”
Paragraph 11 describes where the transmitted acoustic wave is “in the audible spectrum”.  Paragraph 45 describes that sonic code composed of “carrier waves is superimposed on ambient sound, such that masking due to ambient sound makes this a sound which is not easily perceived by the human ear” [suggesting that it can be perceived audibly, just not easily].  Paragraph 46 describes where ambient sound is music and superimposing sonic code thereon merely changes the timbre and causes almost no sense of discomfort from the perspective of someone listening to the music 
These portions thus suggest where “the” ambient sound+sonic code “acoustically transmitted signal encoding data” is “human-audible” [where the ambient sound is something that a person can actually hear like music, and where the sonic code is also audible, but not easily perceptible])

As per Claim 3, Takara suggests wherein the environmental interference is caused by and/or during transmission of the acoustically transmitted signal (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
As discussed in the rejection of claim 1, the sound of the audio signal including the sonic code superimposed on ambient sound is interpreted as an “acoustically transmitted signal encoding data” and the ambient sound component of the “received”/data form of the ambient sound+sonic code “signal" is interpreted as “the environmental interference”.
Paragraph 21 describes where ambient sound “environmental interference” can be various sounds produced by different things, and paragraph 40 describes where the sonic code is transmitted from the speaker.
The presence of ambient sound [“environmental interference”] in the “received”/data form of the ambient sound+sonic code “signal" is thus “caused by and/or during transmission of the” “acoustically transmitted signal encoding data” in the sense that the ambient sound component of the “received”/data form of the ambient 

As per Claim 6, Takara does not, but Lee suggests wherein each frame of the received signal is processed to generate a Fast-Fourier Transform (FFT) (paragraphs 39-41; Figure 1;
	Same combination as the one applied to reject claim 1, where Lee, paragraph 40, describes where FFT transformation is performed on a frame-by-frame basis and outputs a frequency-domain signal [at least suggested to be a “Fast-Fourier Transform” of each frame] frame by frame, which suggests where the frame-by-frame FFT of the combination applied to reject claim 1 processes “each frame of the received” data form of the ambient sound+sonic code “signal” to “generate a Fast-Fourier Transform” [frequency domain frame produced by FFT] of each frame)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of audio processing that uses an A/D converter and an FFT with another because the prior art teaches the claimed invention except for the substitution of audio processing that uses an A/D converter and an FFT which does not necessarily process audio frame-by-frame with audio processing that uses an A/D converter and an FFT which does.  Lee teaches that audio processing that uses an A/D converter and an FFT which processes audio frame-by-frame was known in the art.  One of ordinary skill in the art could have substituted 

	As per Claim 12, Takara suggests wherein the acoustically transmitted signal is received via a microphone (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
As discussed in the rejection of claim 1, the sound of the audio signal including the sonic code superimposed on ambient sound is interpreted as an “acoustically transmitted signal encoding data”.
Paragraphs 66-68 describe where sonic code+ambient sound is input [paragraph 67] and where sound is picked up by microphone [paragraph 66])	

	As per Claims 26 and 29, they are directed to apparatus and computer readable medium equivalents of claim 1 and so are rejected under similar rationale (see paragraph 24 which describes where the receiving device that picks up the sonic code transmitted from a speaker is a mobile phone, and where the mobile phone includes a processor and memory storing programs)

As per Claim 27, Takara suggests A system for receiving data transmitted acoustically, including: a first device comprising a speaker for acoustically transmitting an audio signal and one or more processors; and a second device comprising a microphone for acoustically receiving the audio signal and one or more processors configured to: process the received audio signal encoding the data to minimize environmental interference within the received signal; and decode the processed signal to extract the data, wherein the data is encoded within the transmitted signal using a sequence of tones… (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
Figure 1 depicts the mobile telephone 20 and transmitter personal computer 10 [collectively a “system”] and paragraphs 23-24 describe where the transmitter personal computer includes a speaker for transmitting sonic code sound and a processor, and where the mobile telephone includes a microphone that picks up sonic code that has been transmitted from the speaker and a processor.
	Takara thus suggests “A system for receiving data transmitted acoustically,” [transmitter personal computer 10 and mobile telephone 20, together, where the system, among other things, “receives” sonic code “data transmitted acoustically” in the mobile telephone 20]
“including: a first device comprising a speaker for acoustically transmitting an audio signal encoding the data and one or more processors;” [transmitter personal computer 10 which includes a speaker that acoustically transmits the sonic code “audio signal” and which includes a processor] 

To provide context for the processes performed by the “decoder” in Takara:
Paragraph 11 describes “convert[ing] various types of encoded information into at least one acoustic wave in the audible spectrum and carries out transmission thereof” [suggesting, among other things, that the acoustic wave is audible].  Paragraph 20 describes “messages, URLs, and/or various other types of information are sent as acoustic waves, with air serving as medium, from a transmitter to a receiver, the transmitter transmitting acoustic waves from a speaker, and the receiver receiving these acoustic waves by means of a microphone and carrying out decoding for recognition of the transmitted information”.  Paragraph 21 describes “carrier waves which take into consideration the ambient sound present at the location at which the transmitter and receiver are installed, i.e., the location where sound pressure oscillation information [hereinafter "sonic code"] in the form of acoustic waves is transmitted, are used to send sonic code, and examples of ambient sound which are “sounds in the environs of the location at which the sonic code is transmitted”, and where music is another example of ambient sound.  Paragraphs 28-29 describe a hardware and software embodiment of a transmitter 10.  Paragraphs 30-36 and 40 and Figure 2 describe where ambient sound [which can be music or other sounds as per paragraph 21] is received [paragraph 31] and converted into the frequency domain [paragraph 32] where a peak frequency is detected [paragraph 33] where carrier waves are generated based on the peak 
These portions suggest “A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones;” [converting/”encoding” message/URL/information “data” into a sonic code analog “audio signal” and then into audible acoustic waves that are acoustically transmitted via a speaker, where the conversion is based on ambient sound which is at least suggested 
For Claim 27, Figure 5 and Paragraphs 61-68 describe a “decoder” [corresponding to the “encoder” of Figure 2] that receives the sonic code 150 [generated by Figure 2] superimposed on ambient sound [see paragraph 61, 63, 66-67] captures ambient sound as it exists when no sonic code is present and subtracts, from the ambient sound on which the sonic code has been superimposed, the captured ambient sound to extract the sonic code [paragraphs 62-63, 66-67] and then determines the baseband signal [baseband signal/data 110 that was encoded into a sonic code in Figure 2] from the carrier waves obtained from the sonic code [paragraphs 63-65, 68, and Figure 5, where it is at least suggested that the input to element 215 of Figure 5 is the sonic code obtained after subtracting the captured ambient sound, and the output of 
The portions discussed in the previous paragraph [given the context of the “encoder” discussed above] suggest where the “one or more processors” of the “second device” are “configured to:” 
“process the received audio signal encoding the data to minimize environmental interference within the received signal;”: subtracting captured ambient sound from a “received”/data form of a sonic code+ambient sound “audio signal” to remove/”minimize” the ambient sound that is “within the” “received”/data form of the sonic code+ambient sound “signal”, where, based on paragraph 21, the ambient sound can be interpreted as “environmental interference” in the background/environmental noise sense, and in the sense that the ambient sound is sound in the “environs” that “interferes” with the system receiving only the transmitted sonic code.  The “received”/data form of the sonic code+ambient sound “audio signal” can also be interpreted as a “received audio signal 
“and decode the processed signal to extract the data,”: “decoding” the sonic code signal by obtaining the carrier waves from the sonic code signal and obtaining, based on the carrier waves, the baseband message/URL/information “data” that was “encoded” in the sonic code signal, where the sonic code signal is the result of “processing” that removes the ambient sound from the “received”/data-form of the ambient sound+sonic code “signal”
“wherein the data is encoded within the transmitted signal using a sequence of tones…”: the message/URL/information baseband “data” is added-to/incorporated-into/“encoded within” “the transmitted” ambient sound+sonic code “signal” by outputting, via the speaker, a sonic code sound into the environment, where the message/URL/information baseband “data” would not be “encoded within” the received ambient sound+sonic code signal if the sonic code was not output via the speaker, and the actual sounds making up the sonic code are suggested to be, in at least some embodiments, a “sequence of tones”, because, in order for the sonic code to be relatively imperceptible in the presence of an ambient sound, see e.g. paragraphs 45-46, the sonic code logically needs to be very similar to the ambient sound, and paragraph 21, particularly the music example, suggests where the ambient sound is, in at least some embodiments, made of a “sequence of tones” such that the sonic code is suggested, in at least some embodiments, to be a very similar “sequence of tones” 
Takara does not, but Lee suggests the received signal is processed frame by frame (“transmitter signal, or a speech source signal, is input through the microphone… A/D converter… transmitter signal may include a speech signal and/or a noise signal”, paragraph 39; “divides the A/D-converted transmitter signal into units of a predetermined frame. The first FFT unit 104 transforms the transmitter signal divided on a frame-by-frame basis into a frequency-domain signal by performing FFT. The first FFT unit 104 outputs the frequency-domain signal transformed from the transmitter signal to the transmitter noise estimator 106 frame by frame”, paragraph 40; “determines if each frame of the transmitter signal is a frame with a speech signal, or a frame with a noise signal”, paragraph 41; Figure 1;
	As discussed in the portion of this rejection of claim 27 based on Takara, Takara suggests where the sonic code+ambient sound is received by a microphone and processed by subtracting the ambient sound component and “decoding” the baseband data from the sonic code [paragraphs 66-67].  Figure 5 and paragraphs 66-67 describe where sound is picked up by a microphone, sent to an A/D converter and then an FFT is performed on the A/D output.
	Lee similarly describes where a multi-component audio signal [speech+noise] is input through a microphone, converted into a digital signal via an A/D converter, and an FFT is performed afterwards, and audio processing is performed on the FFT output [paragraphs 39-41, Figure 1] and more specifically describes where the digital signal is 
	Lee thus suggests where Takara’s “decoder” processes the ambient-sound+sonic code audio “signal” in its “received”/data form frame-by-frame, for example by adding the framer between the A/D converter and the FFT of Figure 5 in Takara and changing the FFT into a frame-by-frame FFT and the remaining processes of Figure 5 in Takara to frame-by-frame processes [“wherein the received signal is processed frame by frame”])
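For illustration only, the frame-by-frame arrangement discussed above (a framer placed between the A/D converter and the FFT) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Takara or Lee: the function names, the frame length parameter, the non-overlapping framing, and the use of a naive DFT in place of an FFT (it produces the same bin values, only more slowly) are all hypothetical choices made for the example.

```python
import cmath


def split_into_frames(samples, frame_len):
    """Divide digitized samples into consecutive, non-overlapping frames.

    frame_len is an assumed parameter; an incomplete trailing frame is dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def dft(frame):
    """Naive discrete Fourier transform, standing in for an FFT."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def process_frame_by_frame(samples, frame_len):
    """Yield the frequency-domain bins of each frame, one frame at a time."""
    for frame in split_into_frames(samples, frame_len):
        yield dft(frame)
```

Any later processing step (such as ambient sound subtraction) would then operate on each yielded set of bins in turn rather than on the whole recording at once.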
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of audio processing that uses an A/D converter and an FFT with another because the prior art teaches the claimed invention except for the substitution of audio processing that uses an A/D converter and an FFT which does not necessarily process audio frame-by-frame with audio processing that uses an A/D converter and an FFT which does.  Lee teaches that audio processing that uses an A/D converter and an FFT which processes audio frame-by-frame was known in the art.  One of ordinary skill in the art could have substituted one type of audio processing that uses an A/D converter and an FFT with another to obtain the predictable results of a system including a first device that acoustically transmits a sonic code and a second device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic code (as per Takara) where 
Takara, in view of Lee, do not, but Nieto suggests and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a… value of the corresponding bin… (col. 7, line 15 - col. 8, line 20; col. 13, line 51 – col. 15, line 13;
The combination [thus far] is as discussed in the portion of this rejection of claim 27 based on Lee, including where Takara’s “decoder” processes the ambient-sound+sonic code audio “signal” in its “received”/data form frame-by-frame, for example by adding the framer between the A/D converter and the FFT of Figure 5 in Takara and changing the FFT into a frame-by-frame FFT and the remaining processes of Figure 5 in Takara to frame-by-frame processes [“wherein the received signal is processed frame by frame”]
In Nieto, col. 7, lines 15-30 and col. 7, line 54 – col. 8, line 20 suggest calculating an instantaneous magnitude of each frequency bin of an FFT.  Col. 7, lines 15-30 teaches where an average magnitude value is determined for each frequency-domain bin.  Col. 7, lines 53-67 describes obtaining an average magnitude value for each frequency bin based on an instantaneous magnitude value of a respective bin.  Col. 13, line 51 – col. 15, line 13 describes where a mask value for each bin is calculated and is used to adjust the instantaneous bin magnitudes for each bin in order to excise/notch-out an interfering signal.  
Nieto suggests “and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with 
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to combine prior art elements according to known methods because the prior art included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference (Takara and Lee suggest receiving an audio signal, performing A/D conversion on the audio signal, performing a frame-by-frame FFT, and performing frame-by-frame processing, and Nieto describes where processing performed on the FFT includes calculating an instantaneous magnitude for each bin of the FFT, determining an average magnitude for each bin of the FFT, calculating a mask value for each bin of the FFT based on a respective average magnitude of a respective bin of the FFT, and adjusting the calculated instantaneous magnitude for each bin of the FFT based on the calculated mask value).  One of ordinary skill in the art could have combined the elements as claimed by known methods (by adding the calculating of an instantaneous magnitude for each bin of the FFT, the determining of an average magnitude for each bin of the FFT, the calculating of a mask value for each bin of the FFT based on a 
and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a magnitude value of the corresponding bin in a preceding frame (paragraph 12;
	The combination [thus far] is as discussed in the portion of this rejection of claim 27 based on Nieto, including where one of the frame-by-frame processes [performed by the Takara/Lee combination] includes calculating an instantaneous magnitude for each bin of a frame’s FFT, determining an average magnitude for each bin of the frame’s FFT, calculating a mask value for each bin of the frame’s FFT based on a respective bin’s average magnitude value, and adjusting the instantaneous magnitude for each bin of the frame’s FFT based on a respective bin’s calculated mask value.
	Nieto’s average is not specifically based on a preceding frame’s magnitude value.
	Otani [paragraph 12] describes where determining an average can be performed by “averaging the amplitude components of input signals in past frames that do not include the voice of a speaker”.
	Otani thus suggests “and a Fast-Fourier Transform (FFT) for at least some of the frames is processed to modify a magnitude in each bin of the FFT in accordance with a magnitude value of the corresponding bin in a preceding frame”: where the average magnitude determined for each bin of a frame’s FFT in the Takara/Lee/Nieto combination is calculated by averaging instantaneous magnitudes of a respective bin, where the instantaneous magnitudes of a respective bin include instantaneous magnitudes of past frames [such that, for each bin, the instantaneous magnitude of the 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of averaging with another because the prior art teaches the claimed invention except for the substitution of averaging which does not necessarily average magnitudes of preceding frames with averaging which does.  Otani teaches that averaging magnitudes of preceding frames was known in the art.  One of ordinary skill in the art could have substituted one type of averaging with another to obtain the predictable results of a system including a first device that acoustically transmits a sonic code and a second device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic code (as per Takara) where the audio processing also divides the A/D output into frames and provides the frames to the FFT, and where the audio processing is frame by frame processing (as per Lee) where the frame by frame audio processing includes calculating an instantaneous magnitude for each bin of the FFT, determining an average magnitude for each bin of the FFT, calculating a mask value for each bin of the FFT based on a respective average magnitude of a respective bin of the FFT, and adjusting the calculated instantaneous magnitude for each bin of the FFT based on the calculated mask value (as per Nieto) where each average magnitude is determined by averaging magnitudes of past frames (as per Otani).
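For illustration only, the per-bin processing described in the combination above (instantaneous magnitude, average magnitude over past frames, mask value, adjusted magnitude) can be sketched as follows. This is a minimal sketch and not the method of Nieto or Otani: the particular mask rule (scale a bin down to its past-frame average when it exceeds `ratio` times that average), the `ratio` parameter, and the function name are hypothetical assumptions chosen only to make the data flow concrete.

```python
def modify_magnitudes(frames_mag, ratio=2.0):
    """Adjust per-bin magnitudes using the average of preceding frames.

    frames_mag: list of frames, each a list of instantaneous bin magnitudes.
    Hypothetical mask rule: if a bin exceeds ratio * (average of that bin over
    all preceding frames), scale it down to that average; otherwise pass it.
    The first frame has no preceding frame and passes unchanged.
    """
    n_bins = len(frames_mag[0]) if frames_mag else 0
    running_sums = [0.0] * n_bins  # per-bin sum over preceding frames
    adjusted_frames = []
    for idx, frame in enumerate(frames_mag):
        adjusted = []
        for b, inst in enumerate(frame):
            mask = 1.0
            if idx > 0:
                avg = running_sums[b] / idx
                if inst > 0 and inst > ratio * avg:
                    mask = avg / inst
            adjusted.append(inst * mask)
        adjusted_frames.append(adjusted)
        # Only after adjusting does the current frame join the history.
        for b, inst in enumerate(frame):
            running_sums[b] += inst
    return adjusted_frames
```

In this sketch the magnitude of each bin in a given frame is modified in accordance with the magnitude values of the corresponding bin in preceding frames, because the average that drives the mask is built only from frames that came before.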

Claim 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takara, in view of Lee, Nieto, and Otani, as applied to claim 1, above, and further in view of Layton et al. (US 2016/0098989), hereafter Layton.

As per Claim 4, Takara, in view of Lee, Nieto, and Otani, do not, but Layton suggests wherein the environmental interference is reverberation (“The acoustic space 110 may be, for example, a room or an automobile interior. The acoustic space 110 may contain one or more acoustic sound sources including, for example, the known audio signal reproduced using the audio transducer 106 referred to as a reproduced known audio signal 108, a noise signal 120 associated with a noise source 118, and voice signals 116 associated with a user 114. The acoustic space 110 may contain one or more known audio signals 108, one or more noise signals 120 and one or more voice signals 116. The noise signal 120 may include, for example, air conditioner noise, road noise, wind noise, background noise, echoes and reverberation. In one alternative, a voice signal 116 may be considered to be a noise signal 120”, paragraph 17;
Takara [paragraph 21] describes many types of ambient sound including wind noise, car sounds, people’s voices, but not reverberation.
Layton teaches that a noise signal, which can be wind noise or road noise, can also be reverberation [paragraph 17].
Layton thus suggests where the ambient sound “environmental interference within the received”/data form of the ambient sound+sonic code “signal” is, alternatively, reverberation)

Claims 8-9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takara, in view of Lee, Nieto, and Otani, as applied to claim 1 above, and further in view of Haulick et al. (US 2008/0027722), hereafter Haulick.

As per Claims 8-9, Takara, in view of Lee, Nieto, and Otani, do not, but Haulick suggests wherein an impulse response of an acoustic environment is calculated, wherein the impulse response is calculated via measurements of an acoustic space (Figures 1-3; paragraphs 21-26, 29-30;
Takara [paragraphs 21 and 25] describes where ambient sound is “sounds in the environs of the location at which the sonic code is transmitted” and where a transmitter may be installed in, among other things, various types of facilities that a user visits [suggesting that the ambient sound is sound in a room of a facility], and as discussed in the rejection of claim 1, Takara suggests subtracting captured ambient sound from ambient sound+sonic code in order to extract sonic code [see paragraphs 61-68].
In Haulick, paragraph 21 describes where a “hands-free set 110 having a microphone” detects a “structure-borne noise component” [via an acoustic emission sensor] and where a background noise reduction system improves the quality of a speech signal detected by a microphone, where the microphone generates a “microphone output signal… representing the speaker’s utterance along with the structure-borne noise component” [where the structure-borne noise component is analogous to the captured ambient sound in Takara and where the microphone output signal is analogous to the ambient sound+sonic code signal in Takara].  Paragraphs 22-23 describe where the microphone output signal [in digitized form] and the structure-borne noise reference signal [in digitized form] are received by a noise compensation filter circuit.  Paragraphs 24-26 describe where filter coefficients corresponding to a noise compensation filter circuit are adapted using a process and may be calculated 
Haulick thus suggests where a noise signal subtracted from a speech+noise signal to generate a noise compensated speech signal is provided based on a calculated transfer function/impulse response of acoustic room 248 which is calculated based on determining/measuring/calculating acoustic properties of acoustic room 248 
	Haulick suggests where the ambient sound subtraction in Takara is, instead, performed by calculating a transfer function or impulse response of the room in which the mobile device and the transmitter are located based on “measurements” of the room [the data representation of the room obtained from the input of the room or additionally/alternatively, acoustic properties calculated by analyzing the data representation of the room obtained from the input of the room], determining, based on the calculated transfer function/impulse response, the ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal”, and then subtracting the determined ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal”.

For claim 9, determining the impulse response of the room based on the “measurements” [the data representation of the room obtained from the input of the room or additionally/alternatively, acoustic properties calculated by analyzing the data representation of the room obtained from the input of the room] can be interpreted as “wherein the impulse response is calculated via measurements of an acoustic space”)
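For illustration only, subtraction of an ambient sound estimate derived from a room impulse response can be sketched as follows. This is a minimal sketch and not the circuit of Haulick or Takara: it assumes the impulse response has already been obtained from measurements of the room, assumes a separate noise reference signal is available, and uses hypothetical function names and direct-form convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution, truncated to the length of signal."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                out[n] += h * signal[n - k]
    return out


def subtract_estimated_ambient(received, reference, impulse_response):
    """Estimate the ambient component and subtract it from the received signal.

    received: digitized ambient sound + sonic code (assumed time-aligned).
    reference: digitized noise reference (an assumption of this sketch).
    impulse_response: assumed to have been measured for the room beforehand.
    """
    ambient_estimate = convolve(reference, impulse_response)
    return [r - a for r, a in zip(received, ambient_estimate)]
```

If the impulse response matches the room exactly, the output equals the sonic code alone; in practice the estimate would be imperfect and some residual ambient sound would remain.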
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of environmental sound subtraction with another because the prior art teaches the claimed invention except for the substitution of environmental sound subtraction that does not necessarily determine sound to be used for environmental sound subtraction by calculating an impulse response of an acoustic environment via measurements of acoustic space with environmental sound subtraction that does.  Haulick suggests that environmental sound subtraction that determines sound to be used for environmental sound subtraction by calculating an impulse response of an acoustic environment via measurements of acoustic space was known in the art.  One of ordinary skill in the art could have substituted one type of environmental sound subtraction with another to obtain the predictable results of a device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic code (as per Takara) where the audio .

Claims 10-11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takara, in view of Lee, Nieto, Otani, and Haulick, as applied to claim 8 above, and further in view of Yermeche et al. (US 2013/0034243), hereafter Yermeche.

	As per Claim 10, Takara, in view of Lee, Nieto, Otani, and Haulick, do not, but Yermeche suggests wherein the impulse response is processed to generate a transfer function (Figure 1; paragraphs 41, 47;
	The combination [thus far] is as discussed in the portion of this rejection of claims 8-9 based on Takara, Lee, Nieto, Otani, and Haulick, above, including where Haulick suggests calculating either the impulse response or the transfer function of the room, 
	Paragraph 41 describes where noise environments can be a car cabin or an office.  Paragraph 47 describes where a transfer function is obtained by applying an FFT to an impulse response.  Figure 1 depicts where the FFT in paragraph 47 is used in the context of removing estimated noise from a microphone signal [similar to Haulick]
Yermeche suggests where, in the embodiment of the Takara/Lee/Nieto/Otani/Haulick combination which determines the ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal” based on a transfer function of the room, an FFT is applied to a calculated impulse response of the room to calculate the transfer function of the room)
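For illustration only, obtaining a transfer function by applying a Fourier transform to an impulse response can be sketched as follows. This is a minimal sketch and not Yermeche's implementation: the function name, the zero-padding to `n_bins`, and the use of a naive DFT in place of an FFT (same values, slower) are assumptions of the example.

```python
import cmath


def transfer_function(impulse_response, n_bins):
    """Return the n_bins-point DFT of a zero-padded impulse response.

    A naive DFT stands in for the FFT; n_bins is an assumed parameter and
    must be at least the length of the impulse response.
    """
    if n_bins < len(impulse_response):
        raise ValueError("n_bins must be >= len(impulse_response)")
    h = list(impulse_response) + [0.0] * (n_bins - len(impulse_response))
    return [sum(h[t] * cmath.exp(-2j * cmath.pi * k * t / n_bins)
                for t in range(n_bins))
            for k in range(n_bins)]
```

For example, a one-sample delay (impulse response `[0.0, 1.0]`) yields a transfer function of unit magnitude in every bin with a phase that rotates with frequency, which is the expected frequency-domain picture of a pure delay.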
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of transfer function with another because the prior art teaches the claimed invention except for the substitution of a transfer function which is not necessarily generated from an impulse response with a transfer function which is.  Yermeche teaches that a transfer function which is generated from an impulse response was known in the art.  One of ordinary skill in the art could have substituted one type of transfer function with another to obtain the predictable results of a device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic code (as per Takara) where the 

As per Claim 11, Takara suggests wherein the received signal is processed… (Figure 5; paragraphs 61-68; paragraphs 11, 20, 21, 23-25, 28-36, 40, 45-46, 48, 51-52; Figures 1-2;
See portion of the rejection of claim 1 directed to “processing the received signal to minimize environmental interference within the received signal”, where the “received” data form of the ambient sound+sonic code “signal” is “processed” by subtracting the ambient sound from the “received” data form of the ambient sound+sonic code “signal”)
wherein the received signal is processed using… transfer function (Figures 1-3; paragraphs 21-26, 29-30;
As discussed in the rejection of claims 8-9, above:
	Haulick suggests where the ambient sound subtraction in Takara is, instead, performed by calculating a transfer function or impulse response of the room in which the mobile device and the transmitter are located based on “measurements” of the room [the data representation of the room obtained from the input of the room or additionally/alternatively, acoustic properties calculated by analyzing the data representation of the room obtained from the input of the room], determining, based on the calculated transfer function/impulse response, the ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal”, and then subtracting the determined ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal”.
For claim 8, calculating an impulse response of the room in which the mobile device and the transmitter are located can be interpreted as “wherein an impulse response of an acoustic environment is calculated”.
For claim 11, the embodiment where the ambient sound to be subtracted is determined based on the calculated transfer function also reads on “wherein the received signal is processed using… transfer function”)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of environmental sound subtraction with another because the prior art teaches the claimed invention 
Takara, in view of Lee, Nieto, Otani, and Haulick, do not, but Yermeche suggests wherein the received signal is processed using the transfer function (Figure 1; paragraphs 41, 47;
	Same combination applied to reject claim 10, where Yermeche suggests where, in the embodiment of the Takara/Lee/Nieto/Otani/Haulick combination which determines the ambient sound to be subtracted from the “received” data form of the ambient sound+sonic code “signal” based on a transfer function of the room, an FFT is applied to the calculated impulse response of the room to calculate the transfer function of the room)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of transfer function with another because the prior art teaches the claimed invention except for the substitution of a transfer function which is not necessarily generated from an impulse response with a transfer function which is.  Yermeche teaches that a transfer function which is generated from an impulse response was known in the art.  One of ordinary skill in the art could have substituted one type of transfer function with another to obtain the predictable results of a device which receives an audio signal including ambient sound and a sonic code, and performs audio processing that performs A/D conversion on the audio signal, performs an FFT, subtracts the ambient sound from the audio signal, and determines baseband data from the sonic code (as per Takara) where the audio processing also divides the A/D output into frames and provides the frames to the .

Claims 13-20, 23-25, 30-31 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jeong et al. (US 2015/0088495), hereafter Jeong, in view of Werner et al. (US 2007/0144235), hereafter Werner.

As per Claim 13, Jeong suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, wherein the audio signal is configured to minimize environmental interference… (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119; [all paragraphs and Figures are cited for each limitation with “key” paragraphs and Figures pertaining to each limitation identified 
Paragraph 33 describes “generating a sound code corresponding to information, by generating a multiple number of partial information corresponding to information, determining a frequency band corresponding to each of the of the multiple number of the generated partial information from an audible sound wave frequency band and a non-audible sound wave frequency band, determining a frequency corresponding to each of the multiple number of the generated partial information within the determined frequency band, generating a multiple number of sound signals corresponding to the multiple number of the determined frequencies, and combining the multiple number of the generated sound signals with one another depending on a preset time interval. In this case, the sound code may be represented by a sound QR code and generated based on frequencies within an audible sound wave frequency band or frequencies within a non-audible sound wave frequency band”.  Paragraph 34 describes outputting the generated sound code through a sound wave output device.  Paragraph 36 describes “When identical information (or particle information) continues, the encoding apparatus 10 may determine different frequencies corresponding to the two (2) or more respective continued partial information. Accordingly, the encoding apparatus 10 is capable of eliminating noise and errors resulting from a reverberant component generated when an identical frequency continues” [“particle” seems to be intended to be “partial”].  Paragraph 39 describes where the decoding apparatus, among other things, “divides the sound code depending on a preset time interval” [“preset time interval” is the same phrase used in paragraph 33 which describes “combining the multiple number 
	These portions suggest “A method for encoding data for acoustic transmission,”: encoding information/”data” [e.g., “01A”] into a sound code to be output/”transmitted” by the sound wave output device to a decoder [paragraph 33-34 describes generating a sound code corresponding to information and outputting the generated sound code through a sound wave output device, see also paragraphs 50, 52-57 and 59 which describe where characters are “encoded” into a sound code including sounds with frequencies corresponding to the characters, paragraphs 76-77 at least suggest where “decoded” information corresponding to a sound code can be a sequence of characters, i.e. “data”, which suggests that the information which is “encoded” as a sound code by the process described in paragraph 33 is a sequence of characters, particularly because Figures 11-12 depict encoding and decoding processes that are at least suggested to be “mirrors”/”reverses”/”inverses” of each other]
“including encoding data into an audio signal using a sequence of tones,”: generating a sound code “audio signal” that encodes a sequence of characters/”data” [paragraphs 33-34; 50, 52-57, 59, 76-77; Figures 11-12, as discussed in the previous paragraph] where the sound code “audio signal” is a time “sequence” of consecutive sound frames [suggested by paragraphs 70-71 which describes where a sound code is a sound wave that lasts 10 seconds and is divided into 10 1-second frames, which suggests that the 10 frames are in consecutive time order/”sequence”] where the consecutive sound frames each have a corresponding frequency “tone” such that the sound code “audio signal” made of the “sequence” of sound frames is a “sequence of 
“wherein the audio signal is configured to minimize environmental interference…” [paragraphs 36 and 53 describe where different frequencies are determined to correspond to identical partial information and accordingly the encoding apparatus is capable of eliminating noise and errors resulting from a reverberant component generated when an identical frequency continues, paragraphs 66 and 85 describe where a reverberation’s frequency component can seriously affect a “next partial information” “when identical partial information continues”, paragraphs 66 and 85 describe where “reverberations may exist depending on an interior structure in an indoor recognition environment” and paragraph 111 describes “environmental reverberation”, which at least suggests where the sound code “audio signal” is configured to have different frequencies/”tones” for identical partial information, such that “interference” due to “environmental” reverberation is eliminated/”minimized”])
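For illustration only, the idea of assigning different frequencies to consecutive identical partial information (so that the same frequency never continues from one frame to the next) can be sketched as follows. This is a minimal sketch and not Jeong's actual mapping: the alphabet, the base frequency, the step size, and the rule of alternating between a primary and an offset frequency are all hypothetical assumptions.

```python
def symbols_to_frequencies(symbols, alphabet="01A",
                           base_hz=18000.0, step_hz=200.0):
    """Map each symbol to a tone frequency, avoiding repeated frequencies.

    Hypothetical scheme: each symbol has a primary frequency; when a symbol
    repeats the previous one, the encoder alternates to an offset frequency
    (primary + step_hz / 2) so no frequency is used twice in a row.
    """
    freqs = []
    prev_symbol = None
    prev_used_alt = False
    for s in symbols:
        primary = base_hz + alphabet.index(s) * step_hz
        use_alt = (s == prev_symbol) and not prev_used_alt
        freqs.append(primary + step_hz / 2 if use_alt else primary)
        prev_symbol, prev_used_alt = s, use_alt
    return freqs
```

With this scheme a run such as "000" produces the primary, offset, and primary frequencies in turn, so a reverberant tail from one frame does not fall on the frequency used by the next frame.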
Jeong does not, but Werner suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by configuring the sequence of tones to insert space between at least some tones of the sequence of tones within the audio signal (paragraphs 18-20, 24-27;

Werner describes where an audible signal which is for outputting information can comprise a sequence of tones [paragraphs 24-27], and where an audio signal including a sequence of tones can have pauses [paragraphs 24, 27, i.e. sound “spaces” between tones in the sequence of tones].  
Werner thus suggests “A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by configuring the sequence of tones to insert space between at least some tones of the sequence of tones within the audio signal”: where the sound code in Jeong is more specifically a sequence of tones including pauses between at least some of the tones [where, to generate a tone sequence including pauses between tones, it is at least suggested that pauses/“spaces” are placed/“inserted” between tones, because a pause between tones logically cannot be a pause between tones unless the tones bordering the pause are determined to exist])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of audio signal for outputting information with another because the prior art teaches the claimed invention except for the substitution of an audio signal for outputting information which is not necessarily a sequence of tones which includes pauses between tones with an audio signal for outputting information which is.  Werner teaches that an audio signal for 

As per Claims 30-31, they are directed to apparatus and computer readable medium equivalents of claim 13 and so are rejected under similar rationale (in Jeong, see paragraphs 29 and 33-34 for hardware and/or software and apparatus embodiment [apparatus of claim 30, where the hardware embodiment can be interpreted as including hardware element/elements that perform the processes of the apparatus, such that the hardware element/elements can each be interpreted as a “processor”] and see paragraph 119 for computer readable medium embodiments [computer readable medium of claim 31])

	As per Claim 14, Jeong suggests wherein characteristics of at least some of the tones of the sequence of tones are modified to minimize the environmental interference (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
	Paragraphs 52-53 describe where respective characters correspond to respective frequencies, and [as discussed in the last paragraph of the rejection of claim 
	These portions suggest where, for at least two of three or more identical partial information, a frequency “characteristic” identifying the identical partial information’s corresponding “tone” in the “sequence of tones” sound code is determined to be something different from what the frequency would have been based on character-to-frequency mapping, such that the frequency “characteristic” is a “modified” frequency “characteristic”, where the “tones” corresponding to the at least two of the three or more identical partial information are “at least some of” “the tones and/or sequence of tones”, where the frequencies of the at least two of the three or more identical partial information are collectively multiple “characteristics” of the “at least some of the tones and/or sequence of tones”, and where the “modification” is to eliminate/“minimize” reverberation “environmental interference”])

	As per Claim 15, Jeong suggests wherein the characteristics are modified based upon predictions of interference caused to the sequence of tones when received by a receiver (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
Jeong suggests “wherein the characteristics are modified” for the reasons discussed in the rejection of claim 14.


	As per Claim 16, Jeong suggests wherein the predictions relate to interference generated by acoustic transmission of the sequence of tones (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
Paragraphs 36, 53, 66, 85, 111, where the determining of different frequencies for two or more identical/continued partial information is to eliminate a reverberant component that “has a great effect on next partial information”.  Jeong thus suggests where “the predictions” [a prediction that one identical partial information’s tone’s 

As per Claims 17-18 and 24, Jeong suggests wherein the interference generated is non-direct acoustic energy and wherein the environmental interference is reverberation (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
As discussed in the rejection of claim 13 [last paragraph], paragraphs 36 and 53 describe where different frequencies are determined to correspond to identical partial information and accordingly the encoding apparatus is capable of eliminating noise and errors resulting from a reverberant component generated when an identical frequency continues, paragraphs 66 and 85 describe where a reverberation’s frequency component can seriously affect a “next partial information” “when identical partial information continues”, paragraphs 66 and 85 describe where “reverberations may exist depending on an interior structure in an indoor recognition environment” and paragraph 111 describes “environmental reverberation”, which at least suggests where the sound code “audio signal” is configured to have different frequencies/“tones” for identical 
Jeong thus suggests “wherein the environmental interference is reverberation” [claims 18 and 24].
Applicant’s Specification, page 8, lines 4-5 describes where reverberation is an example of “non-direct acoustic energies”, and therefore by suggesting “wherein the environmental interference is reverberation”, Jeong also suggests “wherein the interference generated is non-direct acoustic energy” [claim 17])

As per Claims 19-20, Jeong suggests wherein the audio signal is configured by configuring the sequence of tones such that frequencies of at least some of the tones of the sequence of tones are arranged from high to low, wherein the at least some of the frequencies correspond to a plurality of tones at the beginning of the signal (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
Paragraphs 36, 53, 66, 85, 111, where the determining of different frequencies for two or more identical/continued partial information is to eliminate a reverberant component that “has a great effect on next partial information”.  
Jeong thus suggests where two consecutive frequencies differ in value, and thus suggests an embodiment where an earlier frequency of two consecutive frequencies is a higher frequency than a later frequency of the two consecutive frequencies [because if the values differ, that suggests both embodiments where the “first” value is greater than the “second” value, and where the “second” value is greater than the “first” value].  
Additionally, paragraphs 76-77 describe an example where a sound code consists of 3 frames/characters, which suggests an embodiment where the two identical characters/partial information that are assigned a higher frequency and then a lower frequency [suggested as discussed in the previous paragraph] correspond to the first two frames/frequencies/tones of a 3-frame/frequency/tone sound code [since two consecutive frames corresponding to the identical partial information are logically either the first two or the last two frames of a 3-frame sound code].  Jeong thus suggests “wherein the at least some of the frequencies correspond to a plurality of tones at the beginning of the signal” [the tones/frequencies corresponding to the two identical partial information are, and thus correspond to, the first two tones that form the “beginning of the” sound code “signal” relative to the last tone])

As per Claim 23, Jeong suggests wherein the audio signal is configured by configuring the sequence of tones to avoid repeating same or similar frequency tones one after the other (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;

Paragraph 36 also describes where different frequencies corresponding to two “or more” respective continued partial information are determined.  Paragraphs 52-53 also describe where characters are mapped to respective frequencies and where identical partial information may be mapped to different frequencies.
These portions suggest where reverberation errors/noise are eliminated by ensuring that [instead of outputting the same tone for two consecutive identical characters based on a mapping of the character to a particular frequency] different tones are produced for two consecutive instances of the same partial information/character, and where, based on this reverberation-eliminating technique, the sound code does not have any consecutive tones that are the same [since tones corresponding to different characters are different and tones corresponding to two 

As per Claim 25, Jeong suggests the step of acoustically transmitting the audio signal for receipt by a microphone (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119;
Figures 1-2 and 5, and paragraphs 33-34 and 39, describe where an encoding apparatus generates a sound code corresponding to information [paragraph 33], outputs the generated sound code through a sound wave output device which can be a speaker device [paragraph 34, Figures 1-2, i.e. “acoustically transmits” the sound code, where the sound code is “the audio signal” as discussed in the rejection of claim 13, above, based on Jeong] and where a decoding apparatus receives the sound code output from the encoding apparatus through a sound wave reception device/unit which can be a microphone [paragraphs 39, 69, Figures 1 and 5].
Jeong thus suggests “the step of acoustically transmitting the audio signal for receipt by a microphone”: “acoustically” outputting/“transmitting”, by the encoding apparatus via a speaker, the sound code “audio signal” so that it can be received by a microphone of the decoding apparatus).

Claims 22 and 32-33 are rejected under 35 U.S.C. 103 as being unpatentable over Jeong et al. (US 2015/0088495), hereafter Jeong, in view of Zinser, Jr. et al. (US 2003/0195745), hereafter Zinser.

As per Claim 22, Jeong suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, the audio signal including a plurality of tone signals, wherein the audio signal is configured to minimize environmental interference… (Figures 1-2, 5, 11-12; paragraphs 29, 33, 34, 36, 39, 50, 52-57, 59, 66, 68-69, 70-71, 76-77, 80, 85, 111, 119; [all paragraphs and Figures are cited for each limitation with “key” paragraphs and Figures pertaining to each limitation identified below, i.e. all other paragraphs and Figures not specifically referenced for any particular limitation are eligible to provide context and additional support]
Paragraph 33 describes “generating a sound code corresponding to information, by generating a multiple number of partial information corresponding to information, determining a frequency band corresponding to each of the of the multiple number of the generated partial information from an audible sound wave frequency band and a non-audible sound wave frequency band, determining a frequency corresponding to each of the multiple number of the generated partial information within the determined frequency band, generating a multiple number of sound signals corresponding to the multiple number of the determined frequencies, and combining the multiple number of the generated sound signals with one another depending on a preset time interval. In this case, the sound code may be represented by a sound QR code and generated based on frequencies within an audible sound wave frequency band or frequencies within a non-audible sound wave frequency band”.  Paragraph 34 describes outputting the generated sound code through a sound wave output device.  Paragraph 36 
	These portions suggest “A method for encoding data for acoustic transmission,”: encoding information/“data” [e.g., “01A”] into a sound code to be output/“transmitted” by the sound wave output device to a decoder [paragraphs 33-34 describe generating a sound code corresponding to information and outputting the generated sound code through a sound wave output device; see also paragraphs 50, 52-57 and 59, which describe where characters are “encoded” into a sound code including sounds with frequencies corresponding to the characters; paragraphs 76-77 at least suggest where “decoded” information corresponding to a sound code can be a sequence of characters, i.e. “data”, which suggests that the information which is “encoded” as a sound code by the process described in paragraph 33 is a sequence of characters, particularly because Figures 11-12 depict encoding and decoding processes that are at least suggested to be “mirrors”/“reverses”/“inverses” of each other]

“wherein the audio signal is configured to minimize environmental interference…” [paragraphs 36 and 53 describe where different frequencies are determined to correspond to identical partial information and accordingly the encoding apparatus is capable of eliminating noise and errors resulting from a reverberant component generated when an identical frequency continues, paragraphs 66 and 85 describe where a reverberation’s frequency component can seriously affect a “next partial information” “when identical partial information continues”, paragraphs 66 and 85 describe where “reverberations may exist depending on an interior structure in an 
Jeong does not, but Zinser suggests A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, the audio signal including a plurality of tone signals, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by sharpening an amplitude envelope of each of the plurality of tone signals within the audio signal (paragraph 61;
Zinser describes formant enhancement which sharpens spectral peaks and depresses valleys to produce a crisper sound that is more intelligible, and where formant enhancement can be used to produce a “better sounding speech output” [paragraph 61].  
Tones have been described in the prior art as signals having spectral peaks with narrow bandwidth [see e.g. Manjunath et al., US 2007/0174052, “Systems, methods, and apparatus for the detection of signals having spectral peaks with narrow bandwidth (also called "tonal components" or " tones") are described herein”, paragraph 38; see e.g. Steele, US 2010/0290641, “The delayed input signal is then passed to filter 236 which is designed to output a signal which when subtracted at 238 from the input signal 110 removes spectral peaks, or tones, from the input signal 110, to produce whitened input signal 133”, paragraph 42; see e.g. Lee, US 2013/0275126, “A formant is an amplitude peak in a frequency spectrum of a sound or tone”, paragraph 6]

Zinser thus suggests “A method for encoding data for acoustic transmission, including encoding data into an audio signal using a sequence of tones, the audio signal including a plurality of tone signals, wherein the audio signal is configured to minimize environmental interference and wherein the audio signal is configured by sharpening an amplitude envelope of each of the plurality of tone signals within the audio signal”: where a sharpening function [as described by Zinser] is added to Jeong’s device, where the sharpening is performed on the sound code “audio signal” that is made of the “sequence” of sound frames and that is a “sequence of tones” and “includ[es] a plurality of tone signals”, where each of the sound code’s “tones” [one per frame in the sequence of frames] is sharpened by sharpening, for each frame of the sound code, a respective spectral peak.  Sharpening a spectral peak is suggested to be “sharpening an amplitude envelope” of a tone signal because, in order to be sharpened, a single tone of a frame logically needs to have a shape that can be sharpened [i.e. if the spectral peak is a vertical line, then it cannot be made any sharper], and the envelope of a single tone/peak signal is logically the shape of the single tone/peak signal [because a single tone signal is at least suggested to include only a single frequency of the single tone])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to combine prior art elements according to known methods because the prior art included each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the 

As per Claims 32-33, they are directed to apparatus and computer readable medium equivalents of claim 22 and so are rejected under similar rationale (in Jeong, see paragraphs 29 and 33-34 for hardware and/or software and apparatus embodiment [apparatus of claim 32, where the hardware embodiment can be interpreted as including hardware element/elements that perform the processes of the apparatus, such that the hardware element/elements can each be interpreted as a “processor”] and see paragraph 119 for computer readable medium embodiments [computer readable medium of claim 33])

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
2004/0148166 teaches “Next an estimation of the spectrum relating to the input noise-contaminated speech signal is performed in a signal-plus-noise spectrum estimation module 22. The signal-plus-noise spectrum estimation module 22 first averages the magnitude or power spectrum S over three to five calculation frames of the input noise-contaminated speech signal, then calculates the estimation of the spectrum Sc relating to the input noise-contaminated speech signal using equation (1). Firstly: $D(i) = \sum_{j=1}^{k} S(i,j)$; $k = 3 \ldots 5$, $i = 0, \ldots, N$;” (paragraph 46)
2013/0064392 teaches “Average noise information: Information obtained by averaging magnitude (or power) of the same frequency component of a plurality of spectra derived by Fourier transform for the whole of known noise (over a plurality of frames). That is, so-called an average spectrum averaged along the time axis” (paragraph 67)
2015/0349841 teaches “The example RENC 119 filters out the cancellation part by modifying the spectral amplitudes of each frequency bins |E.sub.k(l)| in equation (4) by applying the gain estimates G.sub.k(l) as below” (paragraph 51) but does not appear to describe where the modifying of the spectral amplitudes of each frequency bin is in accordance with a magnitude value of the corresponding bin in a preceding frame.
6766300 teaches “The magnitude in each bin is then compared with the magnitude in the preceding frame at the same frequency bin, and a sum over all FFT 
2006/0253209 teaches “After the phase change has been reduced in range and offset by the appropriate amount, it is then used by block 1016 to calculate the desired phase angle of the output bin by adding the phase change to the phase angle of the previous FFT frame”.
2014/0074469 teaches “In the proposed method, to generate a compact signature of acoustic signal one should perform the following consecutive steps: [0022] (1) Firstly, the digitized sound signal shall be divided into (overlapped) frames. [0023] (2) Then (optionally) for each frame the smoothing window function (e.g. Hann window) shall be applied [0024] (3) After that, the Fourier transform (FT) for the current frame shall be computed and the output samples shall be squared. [0025] (4) Then, from each squared FT output value for the current frame the corresponding value for the previous frame shall be subtracted as D(n,k)=X(n,k)-X(n-1,k) where X(n,k) is a squared output of k-th Fourier transform bin for n-th frame. [0026] (5) After that, the differences D(n,k) shall be divided into M groups (m=1,2, . . . ,M) with l subgroups in each group; where each subgroup consists of fixed number (P.sub.m) of difference samples D(n,k). [0027] (6) Values of D(n,k), corresponding to each subgroup shall be accumulated, such that for each group one obtains a set of accumulated values S(n,m,i)” (paragraphs 21-31).  This reference similarly does not appear to alter the magnitudes of the current frame based on the magnitudes of the previous frame (a difference is calculated but the difference appears to be used in an audio signature comparison and search, and does 
10090003 teaches “An example in which an input audio signal is a broadband audio signal sampled at 16 kHz and the input audio signal uses 20 ms as a frame is used, former fast Fourier transform (FFT) of 256 points and latter FFT of 256 points are performed on a current audio frame of every 20 ms, two FFT windows are overlapped by 50%, and frequency spectrums (energy spectrums) of two subframes of the current audio frame are obtained, and are respectively marked as C.sup.0(i) and C.sup.1(i), i=0, 1, . . . , 127, where C.sup.x(i) denotes a frequency spectrum of an x.sup.th subframe. Data of a second subframe of a previous frame needs to be used for FFT of a first subframe of the current audio frame, where C.sup.x(i)=rel.sup.2(i)+img.sup.2(i), where rel(i) and img(i) denote a real part and an imaginary part of an FFT coefficient of the i.sup.th frequency bin respectively. The frequency spectrum C(i) of the current audio frame is obtained by averaging the frequency spectrums of the two subframes, where C(i)=1/2(C.sup.0(i)+C.sup.1(i)).”  This reference appears to calculate a frequency spectrum by averaging spectrums of subframes, not modifying the magnitude of a bin of an FFT based on a previous frame magnitude.
9270811 teaches “In addition, child options can include actionable indicia 430 that, in response to actuation, can instruct or otherwise direct the digital-to-audio option unit 260 to cause the communication device 200 to generate and/or transmit a sequence of tones corresponding to the option 320 and the information conveyed in the options 410 and 440. It should be appreciated that the sequence of tones also can include pauses and/or any trailing tones, such as the tone corresponding to a "#."”
4101885 teaches “Another object of this invention is to produce a melody card having contact areas sized and spaced to provide the length, sequence and pauses between the tones being played”
5048074 teaches “Typically, the features of a VMS are controlled by executing a sequence of Dual Tone Multifrequency (hereinafter referred to as DTMF) signals from a "Touch tone" type telephone. Each VMS is designed by its manufacturer to be controlled by varying sequences of DTMF signals from the user. For example, a sequence of tones and pauses might start playing a message, skip back 10 seconds, skip back to the beginning, skip forward 10 seconds, skip back to the beginning, skip forward 10 seconds, skip forward to the end of the message, or stop playing a message. Each manufacturer assigns its own communications standards which relate which touch tone signals or DTMF signals correspond to which functions of the VMS. Subscribers are provided with the proper dictionary of signals to function. VMS are thus a very useful means of conveying and retaining oral information. Moreover, much of the intent and focus of VMS is to remove the need for the generation of documents for the purpose of communication with others”
2002/0184010 teaches “FIG. 5 is a diagram illustrating the power transfer function of a noise-suppressing filter. It is noted that it has peaks at approximately the same frequencies as the spectrum in FIG. 4. The effect of applying this filter to the spectrum in FIG. 4 is to sharpen the peaks and to lower the valleys, as illustrated by FIG. 6, which is a diagram comparing the power transfer function of the original synthesis filter to the true and approximate noise suppressed filters” (paragraph 31, Figures 4-6).  This reference describes sharpening each of a plurality of peaks of a decoder end whereas claims 22 and 32-33 are directed to sharpening an audio signal’s tone envelopes at an encoder end.
10186251 teaches “The purpose of matching standard deviation in addition to mean squared error is to lessen the effect of "over-smoothing", which is a problem well known to those skilled in the art. Over-smoothing produces a muffling sound effect in the generated speech. If the training criterion is based on the MSE alone, the DNN parameters are trained to produce highly averaged features. This is one major cause of the over-smoothing problem. This muffling effect is evident in the FIG. 2B, which shows very wide formant peaks. The relatively wide peaks can be contrasted with the sharp and prominent peaks present in actual speech from a real person, as illustrated in FIG. 2A. Inclusion of the standard deviation in the training criterion reduces this averaging and produces more human-like spectral features. This is evident in FIG. 2C where the formant peaks in the frequency domain are more prominent. In some embodiments, the voice conversion system is configured to further sharpen the formants peaks using a high-pass filter in the log-spectrum domain, for example. High-pass filtering may be performed as a post-processing step to produce the waveform spectrogram illustrated in FIG. 2D”.  This reference describes sharpening multiple formant peaks.
6711538 teaches “The peak sharpening section 32 detects a peak value exceeding a predetermined threshold value, of the interpolated adaptive signal exc.sub.PW of the wide-band speech signal, forms the peak value to a more sharpened waveform by suppressing the sample values before and after the detected peak value, and outputs it to the adder 14 at a subsequent stage. As a result, higher-frequency components occur in the adaptive signal exc.sub.PW of the band-widened speech signal”.  This reference appears to be directed to sharpening a single peak.
4045616 teaches “In the disclosed vocoder system, the input speech signal (or other signal) is divided into frames of equal duration. A Laplace transform is taken on each frame, and the energy associated with each complex conjugate pole-pair is determined from the residue and damping rate. (The terms poles and pole-pairs are used interchangeably in the application. As may be seen from the model of the speech waveform each pole is in fact a pole-pair in the S-plane.) In one embodiment, the pole-pairs are ranked by energy, and the frequency, damping rate, magnitude and phase angle (and also the delay) for a number of pole-pairs, representing the highest energy, is transmitted. In another embodiment, the pole-pairs for transmission are selected by a thresholding means, after the input speech energy level is normalized. In the thresholding means, those poles whose energy content are above a predetermined level are selected for transmission. In the presently preferred embodiment, the Laplace transform is performed by " sharpening" the peaks of the Fourier transform representation of each frame of data. In this manner, interaction between the "skirts" of the peaks is minimized, allowing the frequencies (along the axis) of the peaks to be 
Tarr, E. W. (2013). Processing perceptually important temporal and spectral characteristics of speech (Order No. 3673304). Available from ProQuest Dissertations and Theses Professional. (1647737151). Retrieved from https://dialog.proquest.com/professional/docview/1647737151?accountid=131444 teaches “There are several reasons to consider processing the GSE as a method to enhance the speech signal for perception. First, listeners with sensorineural hearing loss can have broadened auditory filters. This was concluded because listeners with hearing loss show poorer frequency discrimination at high frequencies where sensitivity thresholds were diminished, but also at low frequencies where sensitivity thresholds were normal [Turner and Nelson(1982)]. Sharpening the spectral envelope of speech could compensate for the decreased spectral resolution for these listeners.  Second, when children produce speech, their vocal tract resonances are not as sharp as adult speakers. Processing the spectral envelope of children’s speech would be one method to make it resemble adult’s speech more closely. Thirdly, when children perceive speech, they weight the dynamic spectral structure of speech as being significant. It was found that children perceptually weight the spectral skeletons in their native language more than the temporal amplitude envelope [Nittrouer, Lowestein, and Packer(2009)]. Sharpening the spectral envelope may help children perceptually attend to the characteristics of a speech that are most important. Finally, when noise interferes with a speech signal, it is more difficult for listeners to perceive the speech signal. Wide-band noise makes it more difficult to perceive speech in part because

be a way to help listeners perceive speech in the presence of noise” (Section 3.2) and “A pitch synchronous framework for representing acoustic characteristics of speech has been presented. The primary application of this signal processing it to investigate the perceptual importance of the acoustic characteristics of speech signals. There may also be opportunities for enhancing the perception of speech, with one example being processing the spectral envelope of speech to sharpen formant peaks” (Section 7.1).  This reference does describe sharpening multiple formant peaks (at least suggested to be formant peaks of a spectral envelope).  This reference is directed to enhancing perception of speech, and it is not clear if this concept is applicable to machine “listening” where machines may not have the same listening limitations as humans.
C. Beaugeant and H. Taddei, "Quality and computation load reduction achieved by applying smart transcoding between CELP speech codecs," 2007 15th European Signal Processing Conference, 2007, pp. 1372-1376.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249.  The examiner can normally be reached on M-F 12:00PM-8:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL can be reached on (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.







/ERIC YEN/           Primary Examiner, Art Unit 2658