Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Guo: 20150243284 in view of Wang: 20050228651.
Regarding claims 1, 14, 20
Guo teaches:
An apparatus, method and means for: 
receiving a digital representation of an audio signal (Guo: ¶ 65, 169, 184; Fig 3, 12, 22: a received noisy speech signal such as by microphone 2249 converted to digital representation such as at audio codec 2251, the means for receiving comprising a microphone and digital interface considered substantially similar to the microphone input and encoder disclosed in the instant specification ¶ 39; Fig 2);
 identifying, based at least in part on receiving the digital representation of the audio signal, a database that is pre-encoded and that comprises a quantity of digital representations of other audio signals (Guo: ¶ 64-69, Fig 3, 12: a recognized speaker provides indicia thereof to dictionary selection module which is used to obtain dictionaries generated by offline learning module 348 and provided at speaker dictionary database 350; as compared to coding manager 625, 710 of the instant specification ¶ 73, 84-86),

wherein the quantity of digital representations of other audio signals satisfies a set of thresholds (Guo: ¶ 13, 51, 68-73, 91-98, 114-125, etc.: dictionary representing user specific speech signals selected, the speech signals used for the dictionary representation thereof corresponding to a particular threshold),
the set of thresholds comprising one or more of a power level threshold, a sampling frequency threshold, or a bit depth threshold (Guo: ¶ 13, 51, 68-73, 91-98, 114-125: speech signal corresponding to a determined threshold used to populate a particular speech dictionary based on relation of input signal to a power level threshold in the form of a signal to noise ratio threshold by which signal power is related to noise power);

encoding the digital representation of the audio signal using a machine learning scheme and information from the database pre-encoded according to the coding standard (Guo: ¶ 8-12, 162-165; Fig 3, 12, 22: reconstructed encoding of the input speech signal generated by machine learning at least by training, converging, etc. a filter based on a provided dictionary and/or minimizing a reconstruction error based on a provided dictionary; further machine learning operates to generate the dictionaries in the dictionary database, each of which recursively applies machine learning to encoding and reconstructing of the input audio signal; see instant specification: ¶ 50; machine learning component trained to improve encoding efficiency);
generating a bitstream of the digital representation that is compatible with the coding standard based at least in part on encoding the digital representation of the audio signal (Guo: ¶ 73; an output module selects an appropriate signal for output based on reconstruction error; instant specification: ¶ 49 Fig 3: a codec operative to output an audio signal compatible with parameters of the input audio signal); and outputting a representation of the bitstream (Guo: ¶ 73; Fig 3: selected signal provided at output).

Guo identifies a database comprising digital representations of audio signals as claimed and strongly suggests that the identified database is pre-encoded according to a coding standard. That is, broadly and reasonably, the pre encoded databases of Guo participate in a coding standard as they utilize a codec and are pre encoded based on the codec but such coding is more implicit than explicitly discussed.

In a related field of endeavor Wang teaches a speech codec (Wang: Abstract; ¶ 13) suitable to provide a digital representation of input speech using an encoder (Wang: ¶ 65-67; Fig 2: speech is coded into a buffer in concert with a speech encoder to generate a packetized set of frames) wherein selection of a particular codebook is determined by a frame rate change, format, etc. (Wang: ¶ 143). To perform coding of a uniform quality the Wang codec monitors a relationship between a current frame quality and a threshold (Wang: ¶ 141) and thereby adjusts parameters of a rate controller (Wang: ¶ 69, 70, 141; Fig 2: controller 220) to change a rate, quality or loss resiliency of a coding process (Wang: ¶ 6, 70, 141; Table 1; Fig 2), the recited rate and quality comprising a sampling rate and a bit depth (Wang: ¶ 6, 7, 9-15; Table 1). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Wang taught identification using a coding standard within the Guo system and method. It would have been further obvious to one of ordinary skill in the art before the effective filing date of the instant application to include the Guo taught power thresholding in combination with the Wang taught thresholding relative to sample rate and bit depth to adjust the coding efficiency of the Guo in view of Wang pre-encoded database. The average skilled practitioner would have been motivated to do so for the purpose of adapting a frame rate for quality with respect to a signal to noise ratio, matching an input rate with an output rate, or indeed to benefit from any of the well-known utilities of codec standards comprising the recited thresholds without altering the functionality of the taught devices in full expectation of predictable results.

Regarding claims 2, 15
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: 
pre-encoding the database according to the coding standard prior to receiving the digital representation of the audio signal (Guo: ¶ 168, 169: speaker dictionary encoded in concert with input/output parameters of a codec 2251; input speech selects among pre-encoded dictionaries); (Wang: ¶ 10, 13, 143: codec manages codebook compatibility with input/output signal); and 
selecting the pre-encoded database based at least in part on a criterion, wherein identifying the database pre-encoded according to the coding standard is based at least in part on the selecting (Wang: ¶ 10, 13, 143: selection of a particular codebook is determined by input and/or output properties of an audio stream; i.e. a frame rate change, format, etc. thereby generating a packetized set of frames following a particular protocol). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.
Regarding claim 3
Guo in view of Wang teaches or suggests:
An apparatus, method and means, wherein the criterion comprises one or more of a format of the audio signal, a transmission rate associated with a transmission of the audio signal, or a network associated with the transmission of the audio signal (Wang: ¶ 10, 13, 143: selection of a particular codebook is determined by input and/or output properties of an audio stream; i.e. a frame rate change, format, etc. thereby generating a packetized set of frames following a particular protocol). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.


Regarding claims 4, 16
Guo in view of Wang teaches or suggests:
An apparatus, method and means, wherein pre-encoding the database according to the coding standard comprises: 
encoding a set of packets according to the coding standard (Guo: ¶ 64-69, 169; Fig 3, 12, 22: input samples encoded by a codec and used to populate a dictionary database); (Wang: ¶ 65-67; Fig 2: speech is coded into a buffer in concert with a speech encoder to generate a packetized set of frames following a particular protocol), wherein one or more packets of the set of packets correspond to a database frame in the database (Guo: ¶ 128, etc.: frames of the input speech modelled based on a coding standard and used to determine a dictionary, codebook, etc. and resolve entries with the dictionary corresponding to the frame); (Wang: ¶ 11, 79-85; Fig 3: system operates to resolve and buffer frames at a mux wherein the buffered frames correspond to frame information in determined codebooks and the packets assembled based on classification, codebook and other predictive parameters); and inserting a set of reset frames between one or more packets of the encoded set of packets (Wang: ¶ 97-102; Fig 2, 5, 6: mux operates to buffer frames and insert a reset frame based on a variety of determined criteria, said reset or intra frame interposed between the output P packets wherein the set comprises a set of one or multiple reset frame(s)). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Wang taught framing and intra frame assignments based on determinations made in concert with the Wang taught classifier, codebooks, etc. within the Guo system and method. The average skilled practitioner would have been motivated to do so for the purpose of providing an enhanced speech output while minimizing coding complexity, output error, etc. and would have expected predictable results therefrom.

Regarding claims 5, 17
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: 
determining a set of reference points associated with the database based at least in part on the set of packets (Guo: ¶ 90: system determines a pitch peak and a sample count separating pitch peaks); (Wang: ¶ 19, 110-114; Table 3: system determines reset points, frame and sub frame boundaries based on parameters of the selected coding standard, thereby parsing the incoming audio samples into frames, sub-frames, etc.); and 
assigning the set of reference points in the database based at least in part on a parameter comprising a distance between reset frames of the set of reset frames, wherein inserting the set of reset frames is based at least in part on the assigning (Wang: ¶ 19, 110-114, Table 3: system maintains a distance between inter frame directive of intra frame insertions and manages the insertion of intra frames based thereon). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.


Regarding claim 6
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: selecting a value of the distance from a range of distance values, wherein assigning the set of reference points in the database based at least in part on the selecting (Wang: ¶ 19, 110-114, Table 3: system maintains a distance between inter frame directive of intra frame insertions and manages the insertion of intra frames based thereon wherein the assignment is based on selecting a distance with respect to a packet loss rate). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.
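The distance-based insertion discussed for claims 5 and 6 can be pictured with a minimal sketch. It is illustrative only and is not taken from Wang, Guo, or the instant specification: the names `RESET`, `select_distance` and `insert_reset_frames`, the range of 2 to 16 packets, and the linear mapping from packet loss rate to distance are all assumptions made for the example.

```python
RESET = "RESET"  # hypothetical marker standing in for a reset (intra) frame


def select_distance(loss_rate, min_distance=2, max_distance=16):
    """Select a reset-frame distance from the range [min_distance, max_distance].

    Assumption for illustration: a higher packet loss rate maps linearly
    to a shorter distance between reset frames.
    """
    loss_rate = min(max(loss_rate, 0.0), 1.0)  # clamp to [0, 1]
    span = max_distance - min_distance
    return max_distance - round(span * loss_rate)


def insert_reset_frames(packets, distance):
    """Return a new list with a reset marker inserted after every `distance` packets."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    out = []
    for i, packet in enumerate(packets):
        if i > 0 and i % distance == 0:
            out.append(RESET)  # reset frame lands between two packets
        out.append(packet)
    return out
```

With a distance of 2, five packets p0 through p4 come out as p0, p1, RESET, p2, p3, RESET, p4, so the spacing parameter alone determines where the reset frames fall.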

Regarding claims 7, 18
Guo in view of Wang teaches or suggests:
An apparatus, method and means, wherein encoding the digital representation of the audio signal comprises: 
ignoring, based at least in part on the set of reset frames, one or more dependencies of a packet of the encoded set of packets with respect to one or more other packets of the encoded set of packets (Guo: ¶ 160: unvoiced frames are output with reconstructed speech modelling removed, ignored, discarded); (Wang: ¶ 102-109, Claim 1, 67: system operates to determine reset frames, intra frames, etc. and ignore long term prediction results by inclusion of a specially coded start vector thereby removing dependence on previous frame samples; silence and unvoiced frames ignore both long term prediction and intra frame coding); and 
encoding a current input frame of the audio signal based at least in part on the ignoring (Wang: ¶ 102-109: the specially coded start vector encoded with the output signal directs the decoder to ignore frames encoded therewith). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.

Regarding claim 8
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: determining a set of continuous packets of the encoded set of packets, wherein inserting the set of reset frames between the one or more packets of the encoded set of packets comprises: inserting a first reset frame prior to a first packet of the set of continuous packets of the encoded set of packets; and inserting a second reset frame after a last packet of the set of continuous packets of the encoded set of packets (Wang: ¶ 19, 110-114, Table 3: system maintains a distance between inter frame directive of intra frame insertions and manages the insertion of intra frames based thereon wherein the assignment is based on selecting a distance with respect to a packet loss rate). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.
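The recitation of claim 8, a first reset frame before the first packet and a second reset frame after the last packet of a set of continuous packets, can be sketched as follows. This is a hedged illustration of the claim language only, not of the Wang disclosure: packets are represented by hypothetical integer sequence numbers, "continuous" is assumed to mean consecutive sequence numbers, and `RESET` and `bracket_continuous_runs` are names invented for the example.

```python
RESET = "RESET"  # hypothetical marker standing in for a reset frame


def bracket_continuous_runs(seq_numbers):
    """Wrap each run of consecutive sequence numbers in a pair of reset markers.

    Assumption for illustration: a "set of continuous packets" is a maximal
    run of packets whose sequence numbers increase by exactly one.
    """
    out = []
    run = []
    for n in seq_numbers:
        if run and n != run[-1] + 1:
            # The run ended: reset before its first packet and after its last.
            out.extend([RESET, *run, RESET])
            run = []
        run.append(n)
    if run:
        out.extend([RESET, *run, RESET])
    return out
```

For sequence numbers 1, 2, 3, 7, 8 the sketch yields two bracketed runs, so no packet in one run depends on a packet in the other.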
 
Regarding claim 9
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: determining one or more of a coding mode or a pitch gain associated with the coding standard, wherein pre-encoding the database is based at least in part on one or more of the coding mode or the pitch gain associated with the coding standard (Wang: ¶ 50, 81, etc.: system comprises a codec switchable among multiple coding modes as well as pitch/gain values to encode data, said encoded data is shared with and used to derive and employ a pre encoded codebook). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.

Regarding claim 10
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: estimating a scan result associated with the digital representation of the audio signal and the database, wherein encoding the digital representation of the audio signal is based at least in part on the scan result (Guo: ¶ 75-79: signal reconstructed based on matching pursuit scanning of the determined dictionary for appropriate pitch/envelope values); (Wang: ¶ 97-104; Fig 2, 6: encoder scans upcoming frames to determine information suitable to justify insertion of reset frames). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.

Regarding claim 11
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising: training the machine learning scheme to match one or more scanning approach decisions for one or more digital representations of one or more audio signals with respect to the database, wherein estimating the scan result is based at least in part on the training. Examiner has taken official notice which Applicant has failed to timely and explicitly traverse and it is thus accepted as Admitted Prior Art (APA: please see MPEP 2144.03) that recognition of database scanning approaches on the part of individual algorithms would have comprised an obvious inclusion. The average skilled practitioner would have been motivated to do so for the purpose of leveraging machine learning to reduce computational complexity and would have expected predictable results therefrom. 

Regarding claim 12
Guo in view of Wang teaches or suggests:
An apparatus, method and means, wherein encoding the digital representation of the audio signal comprises: encoding the digital representation jointly according to the coding standard and an additional coding standard different from the coding standard (Guo: ¶ 12, 13, 64-73; Fig 3: system encodes digital representation based on a first and second codebook and determines output of either the first or second representation based on a threshold, a first portion of an audio output above the threshold and a second portion of an audio output below the threshold are based on different codebooks). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.

Regarding claims 13, 19
Guo in view of Wang teaches or suggests:
An apparatus, method and means, further comprising:
receiving a digital representation of a second audio signal (Guo: ¶ 65, 169, 184; Fig 3, 12, 22: a received noisy speech signal such as by microphone 2249 converted to digital representation such as at audio codec 2251; system operates to receive a first user speech and remains operable to receive subsequent, second, etc. user speech); 
identifying, based at least in part on the receiving of the digital representation of the second audio signal, a set of weighting coefficients of the machine learning scheme, wherein the set of weighting coefficients are associated with an additional coding standard different from the coding standard (Guo: ¶ 12, 13: a first input audio and/or second subsequent input audio determines first and second pre encoded dictionaries said dictionaries derived based on a first and second coding standard); 
encoding the digital representation of the second audio signal using the machine learning scheme based at least in part on one or more weighting coefficients of the set of weighting coefficients (Guo: ¶ 12, 13, 64-73, 93-98; Fig 3: first and/or second subsequent input audio encoded based on converged filter coefficients and/or matching pursuit determined pitch/envelope values, action coefficients thereof); 
generating a second bitstream of the digital representation of the second audio signal that is compatible with the additional coding standard based at least in part on the encoding of the digital representation of the second audio signal (Guo: ¶ 12, 13, 64-73, 93-98; Fig 3: system generates a residual noise suppressed signal in concert with the first dictionary and a reconstructed speech signal in concert with the second dictionary); and 
outputting a representation of the second bitstream (Guo: ¶ 12, 13, 64-73, 93-98; Fig 3: a reconstruction error value below a threshold directs output of a second representation of the second, subsequent bitstream, a reconstruction error value above a threshold directs output of a first representation of the second, subsequent bitstream). The claim is considered obvious over Guo as modified by Wang as addressed in the base claims as it would have been obvious to apply the further teachings of Guo and/or Wang to the modified device, method, etc. of Guo and Wang.

Response to Arguments
Applicant's arguments filed 4/8/22 have been fully considered but they are not persuasive. Applicant argues that the rejection over Guo and Wang has not shown how a database comprising a generic speaker dictionary is the same as "a database … that comprises a quantity of digital representations of other audio signals, wherein the quantity of digital representations of other audio signals satisfies a set of thresholds," as recited in independent claim 1.
Examiner respectfully disagrees. The amended claims are addressed in the rejection of the independent claims presented supra. In the rejection it is shown that Guo teaches the claimed subject matter including a database comprising independent speaker dictionaries, that the dictionaries are encoded based on a threshold, and that the threshold comprises a power threshold (please see the rejection of independent claims 1, 14, 20 supra; Guo: ¶ 13, 51-53, 68-73, 91-98, 114-125). That is, Guo discusses a database comprising dictionaries for speech encoding representative of digital signals according to a coding standard, as well as the amended subject matter. The dictionary representations comprise a quantity of digital representations of other audio signals and they satisfy thresholds. Guo teaches that speech dictionaries generally, as well as a dictionary specific to a user, are generated using clean speech, representations thereof, etc. and that said clean speech is selected based on a signal to noise ratio exceeding a threshold (Guo: ¶ 51-53, 114, 125, etc.). As the instant specification as filed 10/2/19 is not specific with regard to the composition of the recited “power threshold”, the signal to noise ratio threshold, which relates signal power to noise power, is considered a power threshold. As such Guo clearly teaches the amended subject matter. Further, Wang amplifies the recited thresholds to additionally include sampling frequency and/or bit depth thresholds inasmuch as Wang teaches adapting a codec, that is, a coding standard by which the Guo in view of Wang database is pre-encoded. The Wang codec is adapted with respect to a signal quality threshold and the signal quality threshold operates as an equivalent to a sampling frequency and/or bit depth threshold inasmuch as these are the parameters which comprise quality and are additionally the parameters which are adjusted when the threshold is not met (please see the rejection of independent claims 1, 14, 20 supra; Wang: ¶ 6, 7, 9-15, 65-70, 141-143; Table 1). As such Applicant's arguments regarding the teachings of the Guo in view of Wang taught system, method and means are not considered persuasive.
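The relationship relied on above, that a signal to noise ratio threshold relates signal power to noise power, can be shown with a short worked sketch. It is offered only as an illustration under stated assumptions and is not code from Guo: power is taken as mean squared amplitude, the ratio is expressed in decibels as 10·log10(signal power / noise power), and `mean_power`, `snr_db` and `select_clean` are hypothetical names.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal to noise ratio in dB: 10 * log10(signal power / noise power)."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def select_clean(candidates, snr_threshold_db):
    """Keep the signals whose SNR meets or exceeds the threshold.

    `candidates` is a list of (signal, noise) pairs; only signals passing the
    threshold would be used to populate a dictionary in this sketch.
    """
    return [sig for sig, noise in candidates
            if snr_db(sig, noise) >= snr_threshold_db]
```

A signal of unit power against noise of power 0.01 has an SNR of about 20 dB and passes a 15 dB threshold, while the same signal against noise of power 0.25 (about 6 dB) does not; in both cases the decision turns on a comparison of power levels.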
Applicant additionally argues that the Office Action does not explain how the proposed modification of Guo would provide "a database that is pre-encoded according to a coding standard," and that the Office Action does not establish that the asserted combination of Guo and Wang is the same as the features recited in independent claim 1. Applicant alleges that the Office Action has not shown that a person having ordinary skill in the art would arrive at all of the features of independent claim 1 by any combination of Guo and Wang.
Examiner respectfully disagrees. Examiner has shown supra that Guo teaches the recitations of claim 1, strongly suggesting but lacking explicit disclosure of a database pre encoded according to a coding standard. In the rejection Examiner held that “the pre encoded databases of Guo participate in a coding standard as they utilize a codec and are pre encoded based on the codec but such coding is more implicit than explicitly discussed.” Next Examiner cited Wang, which teaches a speech codec (Wang: Title; Abstract; etc.), held that the Wang teachings would have been obvious to combine, and provided sufficient motivation, concluding with a reasonable expectation of success as no unexpected results would arise therefrom. As such Examiner has provided the appropriate findings to establish a prima facie case of obviousness in keeping with at least MPEP 2143 G. Accordingly, Applicant’s arguments are not persuasive and no claims currently stand allowable.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
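The timing rule stated in the preceding paragraph can be worked through with a small date calculation. The sketch below is a reading aid only: it encodes just the conditions written above, treats "within TWO MONTHS" as inclusive of the two-month date, clamps month arithmetic to the last day of a shorter month, and does not account for weekends, holidays, or extension fees. The function names and the example dates are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Same day-of-month `months` later, clamped to the last day of that month."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_deadlines(mailing, first_reply=None, advisory_mailed=None):
    """Return (end of shortened statutory period, six-month outer limit)."""
    two_months = add_months(mailing, 2)
    period_end = add_months(mailing, 3)
    limit = add_months(mailing, 6)
    # A first reply within two months, with the advisory action mailed after
    # the three-month date, moves the period end to the advisory mailing date,
    # but never past the six-month limit.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= two_months and advisory_mailed > period_end):
        period_end = min(advisory_mailed, limit)
    return period_end, limit
```

For a hypothetical mailing date of June 1, 2022, a first reply on July 15, 2022 and an advisory action mailed September 20, 2022 would give a period ending September 20, 2022, with the outer limit at December 1, 2022; without the early reply the period would end September 1, 2022.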

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 7:30-6:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VIVIAN CHIN, can be reached at 571-272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PAUL C MCCORD/ Primary Examiner, Art Unit 2654