Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 2, 4-7, 9-12, 14-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hetherington (20150302862) in view of Scheirer et al (20150237454).

As per claim 1, Hetherington (20150302862) teaches an electronic apparatus (as adaptive equalization system improving the intelligibility of speech – para 0014) comprising: an inputter (as audio signal source input – fig. 1); and a processor (para 0016, processor) configured to,
based on receiving an audio signal through the inputter, obtain a speech intelligibility for the audio signal (as measuring the speech intelligibility – fig. 2, subblock 208, based on the input audio signal, fig. 2, input to subblock 202; para 0023), and modify the audio signal so that the speech intelligibility becomes a target intelligibility (as using weighted long-term speech curves – para 0030, which are based on LTASS templates, based on the user environment where the speech is captured – para 0029), wherein the target intelligibility is set based on 
	As per claim 1, as discussed above, Hetherington (20150302862) teaches modifying the signal according to a speech intelligibility measure in the environment in which the speech was captured, but does not explicitly teach “scene information regarding a type of audio”. Scheirer et al (20150237454) analyzes audio signals to determine the content of the material and adjusts the audio accordingly (abstract; para 0002, 0004; para 0005, showing metadata identifying closed captioning, type of data, program data; para 0006 – episode content; and then adjusting the audio based on speech, music – para 0007). Therefore, it would have been obvious to one of ordinary skill in the art of audio adjustments to expand the intelligibility measures of Hetherington (20150302862) beyond speech to include music as well as other audio content, as taught by Scheirer et al (20150237454), because it would advantageously improve the delivery of the audio signal in the room, without having to separately adjust the audio (Scheirer et al (20150237454), para 0002).

As per claim 2, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 1, wherein the processor is further configured to calculate the speech intelligibility based on a speech signal and a non-speech signal other than the speech signal, included in the audio signal (as, operating on non-speech sounds as taught in Scheirer et al (20150237454) – para 0007; in combination with Hetherington’s intelligibility measuring system).

As per claim 4, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 2, wherein the speech intelligibility is one of a signal to noise ratio (SNR) of the speech signal and the non-speech signal included in the audio signal (Hetherington, as calculating subband SNR – para 0026) and a speech intelligibility index (SII) based on the speech signal and the non-speech signal (Hetherington, with the speech intelligibility measurement – para 0026).

As per claim 5, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 4, wherein: the speech intelligibility is the SNR (Hetherington, para 0026, wherein the SNR on the subband is used for the speech intelligibility measurement); and the processor is further configured to adjust a gain of the speech signal by as much as a difference value between the target intelligibility and the obtained speech intelligibility to modify the audio signal (Hetherington, as calculating the equalization coefficients operating on the subband, to increase the intelligibility – para 0039; wherein the recalculated equalization coefficient is based on an equalization gain – para 0034, see equations, especially $G_{n,k}$, which is related to $e_{n,k}/|X_{n,k}|^2$ scaled by a factor indexed by $n,k$, wherein $e_{n,k}$ is based on a differential between the background noise and the scaled input power, $|B_{n,k}|^2 - G_{n-1,k}|X_{n,k}|^2$).
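
By way of illustration only, the gain adjustment recited in claim 5 can be sketched as follows. This is a minimal broadband sketch and not the subband equalization of Hetherington; it assumes the speech and non-speech components are already available as separate sample lists, and the names `mean_power`, `snr_db`, and `adjust_speech_gain` are hypothetical and appear in neither the claims nor the cited references.

```python
import math


def mean_power(samples):
    """Average power (mean of squared samples) of a non-empty signal."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, non_speech):
    """SNR in dB between a speech component and a non-speech component."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(non_speech))


def adjust_speech_gain(speech, non_speech, target_snr_db):
    """Scale the speech component by (target SNR - measured SNR), in dB.

    Assumption: the gain is applied to the whole speech signal at once,
    not per subband.
    """
    difference_db = target_snr_db - snr_db(speech, non_speech)
    scale = 10.0 ** (difference_db / 20.0)  # dB difference to amplitude factor
    return [s * scale for s in speech]
```

Under these assumptions, re-measuring the SNR after the adjustment returns the target value, since the non-speech component is left unchanged.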

As per claim 6, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 4, wherein: the speech intelligibility is the SII (Hetherington – using the Speech Intelligibility Index, para 0029, 0030); the processor is further configured to calculate a gain adjustment value and adjust a gain of the speech signal by as much as the calculated gain adjustment value to modify the audio signal; the gain adjustment value is calculated according to: 
gain adjustment value = $\alpha \cdot (\mathrm{SII}_{\mathrm{target}} - \mathrm{SII}_{\mathrm{measurement}}) + \beta$; and $\mathrm{SII}_{\mathrm{target}}$ denotes the target intelligibility, $\mathrm{SII}_{\mathrm{measurement}}$ denotes the obtained speech intelligibility, and $\alpha$ and $\beta$ denote constant values experimentally calculated through a change in a number of the SII over a change in the gain of the speech signal (see Hetherington, para 0039, calculating equalization coefficients $G_{n,k}$; reflecting back on para 0033 defining the error signal using previous coefficients measured against the power spectrum of the speech signal compared to a weighted long-term speech curve; and then matching the above equations – looking at Hetherington para 0034, the claimed “alpha” matches to the scaling factor ‘gamma’, and the claimed “beta” matches to the last two terms in the error function in para 0034; examiner notes that the error function is based on the speech intelligibility measurement, the subband background noise, and the processed version of the input signal (sub_target) – see para 0033).
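
For illustration only, the linear relation recited in claim 6 can be sketched as below. The claim states that α and β are determined experimentally but does not say how; the ordinary least-squares fit shown here is an assumption, as are the function names `sii_gain_adjustment` and `fit_alpha_beta` and any numeric values passed to them.

```python
def sii_gain_adjustment(sii_target, sii_measurement, alpha, beta):
    """Gain adjustment value = alpha * (SII_target - SII_measurement) + beta."""
    return alpha * (sii_target - sii_measurement) + beta


def fit_alpha_beta(sii_changes, gain_changes):
    """Fit alpha (slope) and beta (intercept) by ordinary least squares.

    Assumption: each pair is an observed (SII change, gain change) from an
    experiment; at least two distinct SII changes are required.
    """
    n = len(sii_changes)
    mean_x = sum(sii_changes) / n
    mean_y = sum(gain_changes) / n
    sxx = sum((x - mean_x) ** 2 for x in sii_changes)
    if sxx == 0.0:
        raise ValueError("SII changes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sii_changes, gain_changes))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    return alpha, beta
```

A caller would first fit the constants from measured pairs and then pass them to `sii_gain_adjustment` together with the target and obtained SII values.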

As per claim 7, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 1, wherein the processor is further configured to obtain at least one audio feature with respect to the audio signal and obtain the scene information based on the obtained at least one audio feature (Scheirer et al (20150237454), as determining content from the audio signal, namely determining the dynamic range of the audio signal in order to determine whether the programming is a commercial or the actual show being watched – para 0023).
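
As a hedged illustration of obtaining scene information from an audio feature, the sketch below computes a dynamic range and uses it to label a segment. Scheirer et al is cited above only for using dynamic range to distinguish a commercial from a show; the frame-based definition of dynamic range, the 6 dB threshold, the direction of the comparison (a narrow range treated as a commercial), and all function names are assumptions made for this example.

```python
import math


def frame_levels_db(samples, frame_size):
    """RMS level in dB of each full, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / frame_size)
        levels.append(20.0 * math.log10(max(rms, 1e-12)))  # floor avoids log(0)
    return levels


def dynamic_range_db(samples, frame_size=4):
    """Difference between the loudest and quietest frame levels, in dB."""
    levels = frame_levels_db(samples, frame_size)
    if not levels:
        raise ValueError("signal is shorter than one frame")
    return max(levels) - min(levels)


def classify_scene(samples, threshold_db=6.0, frame_size=4):
    """Label a segment 'commercial' when its dynamic range is narrow (assumed rule)."""
    if dynamic_range_db(samples, frame_size) < threshold_db:
        return "commercial"
    return "program"
```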

As per claim 9, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 1, wherein the target intelligibility is set differently with respect to different audio types (Hetherington teaching the intelligibility based on speech in noise or clean speech, based on the environment – para 0029; and then modified by Scheirer et al (20150237454), to expand to music, speech, audio, from the differing types of content – para 0004, 0005, 0016).

As per claim 10, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 1, wherein, based on the audio type being the sound effect, the target intelligibility is set to be higher than a case in which the audio type is the shouting (as modifying the intelligibility based on loudness/volume – see Hetherington, para 0030 – clean quiet versus loud/noisy; and then modified by the teachings of Scheirer et al (20150237454), dealing with loud commercial sounds – para 0023; examiner notes that one of ordinary skill in the art of audio loudness in mixed signals of programming content and commercials would recognize that the volume level of commercials is comparable to shouting, as opposed to normal listening levels).
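
The relationship between audio type and target intelligibility discussed for claims 9 and 10 can be pictured as a simple lookup, sketched below. The numeric targets, the 0-to-1 scale, the default value, and the names `TARGET_INTELLIGIBILITY` and `target_for` are all hypothetical; neither the claims nor the cited references supply them. The only property taken from claim 10 is that the sound-effect target exceeds the shouting target.

```python
# Hypothetical targets on an assumed 0-to-1 intelligibility scale.
TARGET_INTELLIGIBILITY = {
    "sound_effect": 0.8,  # set higher than "shouting", per claim 10
    "shouting": 0.5,
    "music": 0.6,
    "speech": 0.7,
}

# Assumed fallback for audio types that are not listed above.
DEFAULT_TARGET = 0.6


def target_for(audio_type):
    """Return the target intelligibility for a given audio type."""
    return TARGET_INTELLIGIBILITY.get(audio_type, DEFAULT_TARGET)
```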


Claims 11,12,14-17,19 are method claims whose steps are performed by the apparatus claims 1,2,4-7,9,10 and as such, claims 11,12,14-17,19 are similar in scope and content to the apparatus claims 1,2,4-7,9,10; therefore, claims 11,12,14-17,19 are rejected under similar rationale as presented against claims 1,2,4-7,9,10.

Claim 20 is an electronic apparatus claim whose elements are common with the claim elements found in claims 1,2,4-7,9,10 and as such, claim 20 is similar in scope and content to the apparatus claims 1,2,4-7,9,10; therefore, claim 20 is rejected under similar rationale as presented against claims 1,2,4-7,9,10.

Claim(s) 3, 8, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) in further view of Crow et al (20190166435).

As per claim 3, the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) teaches the electronic apparatus of claim 2, wherein the processor is further configured to extract the speech signal included in the audio signal. The combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) does not explicitly teach the use of an artificial intelligence model (as defined in applicant's specification as machine learning or CNN, RNN, neural networks); Crow et al (20190166435) teaches the use of machine learning/neural network algorithms to separate audio sources based on audio artifacts as well as context parameters (para 0014, para 0030, para 0072). Therefore, it would have been obvious to one of ordinary skill in the art of audio source analysis to modify the audio signal processing as taught by the combination of Hetherington (20150302862) in view of Scheirer et al (20150237454) with machine learning/neural network based audio source analysis, as taught by Crow et al (20190166435), because it would advantageously leverage the processing power of the auxiliary device performing this signal analysis (Crow et al (20190166435), para 0014).
Similarly, as to claim 8, the machine learning/neural network artificial intelligence models taught by Crow et al (20190166435) operate on the derived audio sources in Scheirer et al (20150237454), which are from ‘scene information’ – see Scheirer et al (20150237454), analyzing audio signals to determine the content of the material and adjusting the audio accordingly (abstract; para 0002, 0004; para 0005, showing metadata identifying closed captioning, type of data, program data; para 0006 – episode content; and then adjusting the audio based on speech, music – para 0007).
	Claims 13,18 are method claims whose steps are performed by the apparatus claims 3,8 and as such, claims 13,18 are similar in scope and content to the apparatus claims 3,8; therefore, claims 13,18 are rejected under similar rationale as presented against claims 3,8 above.


Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
The following references were found, towards the improvement of speech intelligibility:

Bruhn (20090018825) teaches speech intelligibility testing and improvements (para 0003, 0005, 0007), using models such as GMMs (para 0017).

Nash et al (9031838) teaches continuously measuring voice clarity and speech intelligibility by evaluating a plurality of telecommunications channels in real time. Voice clarity and speech intelligibility measurements may be formed from chained, configurable DSPs that can be added, subtracted, reordered, or configured to target specific audio features. Voice clarity and speech intelligibility may be enhanced by altering the media in one or more of the plurality of telecommunications channels. Analytics describing the measurements and enhancements may be displayed in reports, or in real time via a dashboard.
 
Freedman (20040249650) teaches altering the speech intelligibility based on noise analysis and removal – para 0050. 


Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
06/17/2022