DETAILED ACTION

Notice of AIA Status
The present application, filed on or after November 11, 2020, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 11/11/2020 have been considered by the examiner.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.




Claims 1, 2, 3, 6, 8, 9, 10, 13, 15, 16 and 17 are rejected under 35 U.S.C. 102 as being anticipated by Nyayate et al. (US 20210360349 A1), hereinafter referenced as Nyayate.

Regarding Claim 1, Nyayate teaches a system comprising: a processor (Para. [0088], Fig. 8, item 802); and one or more memory units (Para. [0089], Fig. 8, item 804);
a first digital audio file stored in the one or more memory units, the first digital audio file comprising a plurality of spoken instructions (Para. [0048], lines 9-13, the captured speech or utterances can produce the first digital audio file);
an audio quality enhancement module stored in the one or more memory units, the audio quality enhancement module executed by the processor and configured to: access the first digital audio file (Para. [0051], lines 1-4, Fig. 1, item 114, a digital audio signal is provided as input to an audio denoiser pipeline);
convert the first digital audio file to a first spectrogram image (Para. [0051], lines 4-8, the output of the feature extractor (Fig. 1, item 116) is an audio spectrum);
apply a filter to determine whether an image quality of the first spectrogram image is below a predetermined image quality (Para. [0053], lines 7-16, Para. [0054], lines 1-5, Fig. 1, items 118 and 120, the noise model and the post-processing model determine the image quality of the spectrogram);
in response to determining that the image quality of the first spectrogram image is below the predetermined image quality, generate a second spectrogram image from the first spectrogram image using a training model (Para. [0055], lines 7-9, the spectrogram of the input noisy audio is multiplied by a noise mask to obtain a second spectrogram of clean speech),
the second spectrogram image having a higher image quality than the image quality of the first spectrogram image; and convert the second spectrogram image to a second digital audio file (Para. [0055], lines 9-11, the second, clean spectrogram is converted to time-domain audio containing clean speech);
and a requirements clustering module stored in the one or more memory units, the requirements clustering module executed by the processor and configured to: convert, using an encoder, the second digital audio file into a plurality of vectors, each vector corresponding to a particular one of the plurality of spoken instructions (Para. [0056]: “an encoder with a two dimensional (2D) convolution stack 204 ... outputs of these paths can then be concatenated into a single large array, which acts as a set of all feature vectors”);
identify a plurality of related vectors from the plurality of vectors (Para. [0057]: “a first GRU layer tries to learn important patterns from this concatenated array, and following layers can attempt to reconstruct desired patterns”);
concatenate the plurality of related vectors together in order to create a plurality of concatenated vectors (Para. [0057]: “a first GRU layer tries to learn important patterns from this concatenated array, and following layers can attempt to reconstruct desired patterns”);
generate, using a decoder on the plurality of concatenated vectors, a third digital audio file, the third digital audio file comprising concatenated spoken instructions from the first digital audio file (Para. [0227]: instruction decoder 2028; Para. [0228]: "micro-instructions" or "micro-operations" (also called "micro ops" or "uops"), corresponding to the third digital audio file);
and store the third digital audio file in the one or more computer-readable non-transitory storage media (Para. [0228], lines 9-10: “stored within microcode ROM 2032”).

Regarding Claim 2, Nyayate teaches the system of claim 1. Nyayate further discloses wherein the first digital audio file is converted to the first spectrogram image using short-time Fourier transform (STFT) (Para. [0055], lines 6-7).

Regarding Claim 3, Nyayate teaches the system of claim 1. Nyayate further discloses wherein the second spectrogram image is converted to the second digital audio file using inverse short-time Fourier transform (ISTFT) (Para. [0055], lines 9-11).
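
The STFT and ISTFT conversions recited in claims 2 and 3 can be sketched as follows. This is a minimal illustration only, not code taken from Nyayate or from the application; the function names, the Hann window, and the frame and hop sizes are assumptions chosen for the example, and a naive DFT stands in for a fast transform.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window; with hop = n // 2 the overlapping windows sum to 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(frame):
    # Naive discrete Fourier transform of one frame (O(n^2), fine for a sketch).
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    # Inverse DFT; the real part is kept because the input frames were real.
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def stft(signal, frame_len=8, hop=4):
    # Slice the signal into overlapping windowed frames and transform each one.
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft(frame))
    return frames

def magnitude_spectrogram(frames):
    # Magnitudes of the non-negative frequency bins: the "spectrogram image".
    return [[abs(c) for c in frame[:len(frame) // 2 + 1]] for frame in frames]

def istft(frames, frame_len=8, hop=4):
    # Overlap-add of the inverse-transformed frames back into a time signal.
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for idx, spectrum in enumerate(frames):
        frame = idft(spectrum)
        for i in range(frame_len):
            out[idx * hop + i] += frame[i]
    return out
```

With these window and hop choices, samples covered by two overlapping frames are reconstructed to within floating-point error; the first and last `hop` samples are covered by a single frame and remain attenuated by the window.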

Regarding Claim 6, Nyayate teaches the system of claim 1. Nyayate further discloses wherein converting, using the encoder, the second digital audio file into the plurality of vectors comprises using Tensorflow (Para. [0080], line 10).

Claims 8, 9, 10 and 13 are method claims reciting the steps performed by the system of claims 1, 2, 3 and 6 above. As such, claims 8, 9, 10 and 13 are similar in scope and content to claims 1, 2, 3 and 6 and are rejected under the same rationale as presented against claims 1, 2, 3 and 6 above.

Claims 15, 16 and 17 are non-transitory storage media claims (Para. [0411]) reciting the steps performed by the system of claims 1, 2 and 3 above. As such, claims 15, 16 and 17 are similar in scope and content to claims 1, 2 and 3 and are rejected under the same rationale as presented against claims 1, 2 and 3 above.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 4, 5, 7, 11, 12, 14, 18, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Nyayate as stated above, in view of Sharma (US 20190332840 A1), hereinafter referenced as Sharma.

Regarding Claim 4, Nyayate teaches the system of claim 1 and further discloses generating the second spectrogram image from the first spectrogram image using the training model (Para. [0055]). Nyayate fails to teach that the generating comprises: increasing a pixel density of the first spectrogram image using bicubic interpolation; and whitening out noise in the first spectrogram image.

However, Sharma explicitly teaches increasing a pixel density of the first spectrogram image using bicubic interpolation (Para. [0612], line 7, bi-cubic interpolation; Para. [0613], processing pixels);
and whitening out noise in the first spectrogram image (Para. [0348], Gaussian white noise).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Sharma’s teaching of bi-cubic interpolation and whitening out noise with the method and system of removing noise and enhancing a digital audio signal as taught by Nyayate, because doing so offers a good tradeoff between computational speed and accuracy on modest processors and allows robustness to be assessed (Sharma, Para. [0612], [0348]).

Claims 11 and 18 are similar in scope and content to claim 4 and are therefore rejected under the same rationale.
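
The bicubic interpolation step recited in claim 4 (increasing the pixel density of a spectrogram image) can be illustrated with the sketch below. It is not drawn from Sharma or from the application: the function names are hypothetical, the cubic convolution kernel with a = -0.5 is one common choice assumed here, and borders are handled by clamping. The noise-whitening step is not shown.

```python
import math

def cubic_kernel(x, a=-0.5):
    # Cubic convolution kernel; a = -0.5 is an assumed, commonly used value.
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def sample_bicubic(image, y, x):
    # Weighted sum over the 4x4 neighbourhood around the fractional point (y, x).
    rows, cols = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for j in range(y0 - 1, y0 + 3):
        wy = cubic_kernel(y - j)
        jj = min(max(j, 0), rows - 1)      # clamp row index at the border
        for i in range(x0 - 1, x0 + 3):
            wx = cubic_kernel(x - i)
            ii = min(max(i, 0), cols - 1)  # clamp column index at the border
            total += wy * wx * image[jj][ii]
    return total

def upscale_bicubic(image, factor):
    # Resample a 2D grid of pixel values at `factor` times the original density.
    rows, cols = len(image), len(image[0])
    return [[sample_bicubic(image, r / factor, c / factor)
             for c in range(cols * factor)]
            for r in range(rows * factor)]
```

Output pixels that coincide with original pixel positions keep their original values; the remaining pixels are interpolated from their neighbours.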

Regarding Claim 5, Nyayate teaches the system of claim 1 and further discloses applying the filter to determine whether the image quality of the first spectrogram image is below the predetermined image quality (Para. [0055]). Nyayate fails to teach that the applying comprises: determining whether a dots-per-inch (DPI) of the first spectrogram image is less than a predetermined DPI amount; or determining whether a signal-to-noise ratio of the first spectrogram image is less than a predetermined noise ratio amount.

However, Sharma explicitly teaches determining whether a dots-per-inch (DPI) of the first spectrogram image is less than a predetermined DPI amount (Para. [0454]-[0456], detector DPI comparison);
or determining whether a signal-to-noise ratio of the first spectrogram image is less than a predetermined noise ratio amount (Para. [0236]).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Sharma’s teaching of dots per inch and signal-to-noise ratio with the method and system of removing noise and enhancing a digital audio signal as taught by Nyayate, because this would enable the user to measure the resolution in order to improve the quality of an image (Sharma, Para. [0455]).

Claims 12 and 19 are similar in scope and content to claim 5 and are therefore rejected under the same rationale.
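
The threshold test recited in claim 5 (DPI below a predetermined amount, or signal-to-noise ratio below a predetermined amount) can be sketched as follows. This is a hypothetical illustration, not Sharma's or the applicant's method: the function names and threshold values are assumptions, and a separate noise estimate with non-zero power is assumed to be available.

```python
import math

def mean_power(samples):
    # Average squared amplitude of a sequence of samples.
    return sum(s * s for s in samples) / len(samples)

def snr_db(signal, noise):
    # Signal-to-noise ratio in decibels; assumes the noise power is non-zero.
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))

def below_predetermined_quality(dpi, snr, min_dpi, min_snr_db):
    # True when either measure falls under its predetermined threshold.
    return dpi < min_dpi or snr < min_snr_db
```

For example, an image at 300 DPI with an estimated SNR of 20 dB would be flagged against assumed thresholds of 150 DPI and 25 dB, because the SNR test alone is sufficient.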

Regarding Claim 7, Nyayate teaches the system of claim 1 and further discloses identifying the plurality of related vectors from the plurality of vectors (Para. [0057]). Nyayate fails to teach that the identifying comprises: calculating a standard deviation of each of the plurality of vectors; and comparing the standard deviations of each of the plurality of vectors in order to identify the plurality of related vectors.

However, Sharma explicitly teaches calculating a standard deviation of each of the plurality of vectors; and comparing the standard deviations of each of the plurality of vectors in order to identify the plurality of related vectors (Para. [0479], [0480], comparison of standard deviation to evaluate image).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Sharma’s teaching of standard deviation with the method and system of removing noise and enhancing a digital audio signal as taught by Nyayate, because this would enable the user to measure the average value and/or the standard deviation, which is then used to evaluate the image to determine whether it is encoded with Protocol A, B or C (Sharma, Para. [0479]).

Claims 14 and 20 are similar in scope and content to claim 7 and are therefore rejected under the same rationale.
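
The limitation of claim 7 (calculating a standard deviation for each vector and comparing the standard deviations to identify related vectors) can be illustrated as below. The grouping rule is an assumption made for the example, since neither the claim language quoted above nor the cited paragraphs specify one: vectors are treated as related when their standard deviations lie within a hypothetical `tolerance` of the smallest value in the group.

```python
from statistics import pstdev

def group_related(vectors, tolerance):
    # Returns groups of vector indices whose standard deviations are close.
    if not vectors:
        return []
    spreads = [pstdev(v) for v in vectors]
    # Visit vectors in order of increasing standard deviation.
    order = sorted(range(len(vectors)), key=spreads.__getitem__)
    groups, current = [], [order[0]]
    for idx in order[1:]:
        if spreads[idx] - spreads[current[0]] <= tolerance:
            current.append(idx)             # close enough to the group's smallest spread
        else:
            groups.append(sorted(current))  # close the group and start a new one
            current = [idx]
    groups.append(sorted(current))
    return groups
```

The index groups returned here could then select which vectors are concatenated in the subsequent step of claim 1.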



Conclusion
The prior art made of record and not relied upon, listed below, is considered pertinent to applicant's disclosure.
Visser et al. (US 20160284346 A1): Disclosed is a feature extraction and classification methodology wherein audio data is gathered in a target environment under varying conditions. From this collected data, corresponding features are extracted, labeled with appropriate filters (e.g., audio event descriptions), and used for training deep neural networks (DNNs) to extract underlying target audio events from unlabeled training data. Once trained, these DNNs are used to predict underlying events in noisy audio to extract therefrom features that enable the separation of the underlying audio events from the noisy components thereof. [Abstract]
Weber et al. (US 20210110841 A1): A computer implemented method and system of transforming an audio signal into a haptic data to fit into a haptic perceptual bandwidth of an electronic device having at least one actuator is disclosed. The method and system receives the audio signal; filters the audio signal into one or more frequency bands with each frequency band having a center frequency and time-amplitude values; authors the one or more frequency bands by modifying the time-amplitude values by changing, appending or deleting one or more time amplitude values to create an authored audio descriptor data. [Abstract]
Casado et al. (US 20150235637 A1): The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user. [Abstract]
Wingate et al. (US 20170178664 A1): Use of spoken input for user devices, e.g. smartphones, can be challenging due to presence of other sound sources. Blind source separation (BSS) techniques aim to separate a sound generated by a particular source of interest from a mixture of different sounds. Various BSS techniques disclosed herein are based on recognition that providing additional information that is considered within iterations of a nonnegative tensor factorization (NTF) model improves accuracy and efficiency of source separation. [Abstract]

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NADIRA SULTANA, whose telephone number is (571)-272-4048. The examiner can normally be reached Monday through Friday, 7:30 AM to 5:00 PM (EST). If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571)-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/N.S./Examiner, Art Unit 2658     

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658