DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Regarding U.S. Provisional Patent Application No. 62/988,337, Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) is acknowledged. 
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 03/10/2021 has been considered by the examiner.
However, the information disclosure statement filed on 09/07/2022 fails to comply with 37 CFR 1.98(a)(1), which requires the following: (1) a list of all patents, publications, applications, or other information submitted for consideration by the Office; (2) U.S. patents and U.S. patent application publications listed in a section separately from citations of other documents; (3) the application number of the application in which the information disclosure statement is being submitted on each page of the list; (4) a column that provides a blank space next to each document to be considered, for the examiner’s initials; and (5) a heading that clearly indicates that the list is an information disclosure statement.  The information disclosure statement has been placed in the application file, but the information referred to therein has not been considered.
	For clarity, the references provided in the 09/07/2022 IDS on form PTO/SB/08a (01-22), including 3 U.S. patents, 31 U.S. Patent Application Publications, 2 foreign patent documents, and 23 non-patent literature documents have been considered by the examiner.
	However, the table on page 2 of the “transmittal letter of information disclosure statement” provided on 09/07/2022, has not been considered.  Per 37 CFR 1.98(a)(1), the table on page 2 of the transmittal letter does not identify “U.S. patents and U.S. patent application publications … in a section separately from citations of other documents” and further does not provide a column for the examiner’s initials.  Further, per 37 CFR 1.98(a)(2), applicant has not provided a legible copy of each foreign patent and each cited pending unpublished U.S. application.  
The examiner further notes the following from MPEP 2004:
13. It is desirable to avoid the submission of long lists of documents if it can be avoided. Eliminate clearly irrelevant and marginally pertinent cumulative information. If a long list is submitted, highlight those documents which have been specifically brought to applicant’s attention and/or are known to be of most significance. See Penn Yan Boats, Inc. v. Sea Lark Boats, Inc., 359 F. Supp. 948, 175 USPQ 260 (S.D. Fla. 1972), aff’d, 479 F.2d 1338, 178 USPQ 577 (5th Cir. 1973), cert. denied, 414 U.S. 874 (1974). 

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because in Fig. 1, reference character “72” has been used to designate both a user (see para. 0038) and a machine learning model (see para. 0111).  
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because in Figs. 9 and 15, reference character “900” has been used to designate both a step for receiving feature-based voice data (Fig. 9, para. 0121) and an error minimization process (Fig. 15, para. 0182).
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: Fig. 1, reference 74 (speech processing system).
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
Para. 0011, “and an speech” should read “and a speech”
Para. 0032, “client applications 22, 24, 26, 28, 66” should read “client applications 22, 24, 26, 28, 68”
Para. 0033, line 3, “speech-to-text (SST)” should read “speech-to-text (STT)”
Appropriate correction is required.
35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

The examiner notes that independent claims 1, 8, and 15 recite a computer-implemented method, computer program product, and computing system, respectively, where each claim recites generating, via a machine learning model, one or more augmentations of the feature-based voice data.  The examiner notes that these claims cannot practically be performed in the human mind, e.g., the claims are not directed towards a mental process, and in particular, the “generating, via a machine learning model, one or more augmentations of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics” cannot practically be performed in the human mind, because the human mind cannot practically generate such augmentations.  The examiner further notes that after a reasonable search, the examiner was unable to find sufficient evidence that use of a machine learning model for voice data augmentation was well-known, conventional, or routine as of the effective filing date of the present application.  Therefore, the examiner finds that claims 1, 8, and 15 are subject-matter eligible under 35 U.S.C. section 101, and because claims 2-7, 9-14, and 16-20 depend from claims 1, 8, and 15, respectively, such claims are similarly subject-matter eligible under 35 U.S.C. section 101.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 3, 10, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 3 recites the limitation “the one or more feature-based augmentations” in lines 3-4.  There is insufficient antecedent basis for this limitation in the claim.  This limitation will be interpreted to refer to “augmentations of the feature-based voice data” generated in claim 1.
Claim 10 recites the limitation “the one or more feature-based augmentations” in line 3.  There is insufficient antecedent basis for this limitation in the claim.  This limitation will be interpreted to refer to “augmentations of the feature-based voice data” generated in claim 8.
Claim 17 recites the limitation “the one or more feature-based augmentations” in line 3.  There is insufficient antecedent basis for this limitation in the claim.  This limitation will be interpreted to refer to “augmentations of the feature-based voice data” generated in claim 15.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 4, 6, 8, 11, 13, 15, 18, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hsu, Wei-Ning, et al. "Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation." 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017, pp. 16-23, hereinafter referenced as HSU.

Regarding claim 1, HSU discloses:
A computer-implemented method, executed on a computing device, comprising: (Data augmentation performed using a variational auto-encoder (VAE), utilizing a “computational network toolkit” for neural network-based acoustic model training; p. 16, section 1, p. 19, sections 5.1-5.3)
receiving feature-based voice data; (ASR system receives input speech, which is represented by frames, where each frame is represented using filter-bank features with delta and delta-delta coefficients, and which is then input into the VAE, e.g., received by the VAE; p. 16, section 1 and p. 19, section 5.3)
receiving one or more data augmentation characteristics; and (input speech signal is augmented with nuisance attributes, e.g., data augmentation characteristics, by replacing latent nuisance representation from the VAE, e.g., VAE receives nuisance attributes for generating new augmentations (e.g., using perturbation), where nuisance attributes are factors that affect the surface form of a speech utterance, including speaker identity, channel, and background noise; p. 16, section 1, pp. 17-19 sections 3.1 and 3.3, and p. 19, section 5.2)
generating, via a machine learning model, one or more augmentations of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics. (seq2seq LSTM VAE, e.g., a machine learning model, performs data augmentation on speech and generates a diverse dataset for training a neural network-based acoustic model, by taking the input speech represented as frames, e.g., feature-based voice data and the nuisance attributes, e.g., data augmentation characteristics; pp. 16-17, sections 1, 2 and p. 19, section 5.2)
Regarding claim 4, HSU discloses:
performing, via the machine learning model, one or more gain-based augmentations on at least a portion of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics. (seq2seq LSTM VAE, e.g., a machine learning model, performs gain-based data augmentation on speech and generates a diverse dataset for training a neural network-based acoustic model, by taking the input speech represented as frames, e.g., feature-based voice data, and the nuisance attributes, e.g., data augmentation characteristics, where the nuisance attributes in the latent nuisance representation are perturbed using a perturbation vector, where the perturbation is performed using perturbation ratio γ, where γ = 0.5 to decrease the latent nuisance representation space and make the signal closer to the clean original data signal, and γ = 1.5 and 2.0 to increase the latent nuisance representation space and amplify the nuisance space; pp. 16-17, sections 1, 2, p. 19, section 5.2, pp. 20-21, sections 6.2-6.4 and Table 1)
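For illustration only, the scaling role of the perturbation ratio γ discussed above can be sketched as follows. The sketch assumes the perturbation is applied as an additive offset on a latent nuisance vector, scaled by γ. That additive form, the function name perturb_latent, and the example vectors are assumptions made for this illustration and are not taken from HSU; only the ratio values 0.5, 1.5, and 2.0 come from the passage above.

```python
from typing import Dict, List


def perturb_latent(z: List[float], p: List[float], gamma: float) -> List[float]:
    """Shift latent vector z along perturbation vector p, scaled by gamma.

    Assumption (hypothetical form): z' = z + gamma * p.
    """
    if len(z) != len(p):
        raise ValueError("latent and perturbation vectors must have equal length")
    return [zi + gamma * pi for zi, pi in zip(z, p)]


# Hypothetical latent nuisance vector and perturbation vector.
z = [0.25, -1.0, 0.5]
p = [1.0, 0.0, -2.0]

# Ratios cited above: 0.5 applies a smaller offset, 1.5 and 2.0 a larger one.
augmented: Dict[float, List[float]] = {
    g: perturb_latent(z, p, g) for g in (0.5, 1.5, 2.0)
}
```

Under this assumed form, a ratio below 1 keeps the perturbed vector nearer the original latent vector, and a ratio above 1 moves it farther away, which is the gain-like behavior the mapping relies on.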

Regarding claim 6, HSU discloses:
performing, via the machine learning model, one or more audio feature-based augmentations on at least a portion of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics. (seq2seq LSTM VAE, e.g., a machine learning model, performs feature-based data augmentation on speech and generates a diverse dataset for training a neural network-based acoustic model, by taking the input speech represented as frames, e.g., feature-based voice data, and the nuisance attributes, e.g., data augmentation characteristics, where the nuisance attributes in the latent nuisance representation are perturbed using a perturbation vector, background noise in four recording locations – bus, café, pedestrian area, and street junction, are represented in the latent nuisance representation space; pp. 16-17, sections 1, 2, p. 19, sections 5.2 and 6, pp. 20-21, sections 6.2-6.4 and Table 1; the examiner notes that the broadest reasonable interpretation of “audio feature-based augmentation” includes augmentations pertaining to noise components, such as road noise or noise from an open window; see para. 0124 in instant specification)

Regarding claim 8, HSU discloses:
A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon which, when executed by a processor, cause the processor to perform operations comprising: (Data augmentation performed using a variational auto-encoder (VAE), utilizing a “computational network toolkit” (CNTK) for neural network-based acoustic model training, where the CNTK open-source toolkit is provided by Microsoft Research; p. 16, section 1, p. 19, sections 5.1-5.3, p. 23 ref [29]; pursuant to MPEP 2131.01 II., extrinsic evidence may be used to explain the meaning of a term used in the primary reference; the examiner cites to the GitHub repository for the Microsoft Cognitive Toolkit (CNTK), available at archive.org as of Jan. 12, 2019, https://web.archive.org/web/20190112114726/https://github.com/microsoft/CNTK, which defines the CNTK as a unified deep learning toolkit that describes neural networks and can be implemented across multiple GPUs and servers, where reference source code is provided that includes instructions that can be executed by a GPU, e.g., a processor)
The remaining claimed limitations in claim 8 pertaining to instructions causing the claimed processor to perform operations correspond to the computer-implemented method of claim 1 and therefore claim 8 is rejected under the same grounds, i.e., under 35 U.S.C. 102 in view of HSU, as set forth above with respect to claim 1.

Claim 11 depends from claim 8 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 4 and therefore claim 11 is rejected under the same grounds as claims 4 and 8 above.
Claim 13 depends from claim 8 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 6 and therefore claim 13 is rejected under the same grounds as claims 6 and 8 above.

Regarding claim 15, HSU discloses:
A computing system comprising: a memory; and a processor configured to (Data augmentation performed using a variational auto-encoder (VAE), utilizing a “computational network toolkit” (CNTK) for neural network-based acoustic model training, where the CNTK open-source toolkit is provided by Microsoft Research; p. 16, section 1, p. 19, sections 5.1-5.3, p. 23 ref [29]; pursuant to MPEP 2131.01 II., extrinsic evidence may be used to explain the meaning of a term used in the primary reference; the examiner cites to the GitHub repository for the Microsoft Cognitive Toolkit (CNTK), available at archive.org as of Jan. 12, 2019, https://web.archive.org/web/20190112114726/https://github.com/microsoft/CNTK, which defines the CNTK as a unified deep learning toolkit that describes neural networks and can be implemented across multiple GPUs and servers, where reference source code is provided, e.g., on GitHub servers having memory, that includes instructions that can be executed by a GPU, e.g., a processor, and together the GPUs and servers with memory comprise a computing system)
The remaining claimed limitations in claim 15 pertaining to configuration of the claimed processor correspond to the computer-implemented method of claim 1 and therefore claim 15 is rejected under the same grounds, i.e., under 35 U.S.C. 102 in view of HSU, as set forth above with respect to claim 1.

Claim 18 depends from claim 15 and claims a computing system having a processor configured to perform a process that corresponds to the computer-implemented method of claim 4 and therefore claim 18 is rejected under the same grounds as claims 4 and 15 above.
Claim 20 depends from claim 15 and claims a computing system having a processor configured to perform a process that corresponds to the computer-implemented method of claim 6 and therefore claim 20 is rejected under the same grounds as claims 6 and 15 above.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2, 3, 5, 9, 10, 12, 16, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over HSU in view of Salamon, Justin, et al. "Deep convolutional neural networks and data augmentation for environmental sound classification." IEEE Signal processing letters (2017), pp. 279-283, hereinafter referenced as SALAMON.

Regarding claim 2, HSU discloses the computer-implemented method of claim 1, including the step of “wherein receiving the feature-based voice data” (see claim 1).  HSU further discloses:
converting the signal from the time domain to the feature domain, thus defining the feature-based voice data. (HSU discloses performing data augmentation on speech data, where speech audio signals are represented by frames and each frame is represented using filter-bank features with delta and delta-delta coefficients; HSU, p. 16, section 1 and p. 19, section 5.3)
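For illustration only, the frame representation described above (static filter-bank features with delta and delta-delta coefficients appended) can be sketched as follows. The sketch uses the widely known regression formula for delta features with edge frames repeated as padding. The window size of 2, the function names, and the example values are assumptions made here; the passage above does not state which delta formula or window HSU uses.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-based delta features; edge frames are repeated as padding."""
    if window < 1:
        raise ValueError("window must be >= 1")
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_deltas(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Concatenate static, delta, and delta-delta features for each frame."""
    d1 = delta(frames, window)
    d2 = delta(d1, window)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


# Hypothetical 2-band filter-bank energies for 5 frames.
fbank = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
features = add_deltas(fbank)
```

Each output frame has three times the static dimension: the original bands, their first-order slope over time, and the slope of that slope.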

However, HSU fails to explicitly teach:
extracting acoustic metadata from a signal

	However, in a related field of endeavor, SALAMON pertains to using a CNN architecture to classify sound clips into environmental sound classes, including air conditioning, car horn, children playing, dog barking, drilling, engine idling, gunshots, jackhammer, siren, and street music.  (p. 281, section II.C).  SALAMON further discloses utilizing data augmentation to increase a particular environmental sound dataset.  (pp. 279-280, section 1).

The HSU-SALAMON combination makes obvious:
extracting acoustic metadata from a signal (SALAMON discloses that a log-scaled mel-spectrogram representation of an audio signal is input into a CNN used to classify the audio into environmental sound classes, including air conditioning, car horn, children playing, dog barking, drilling, engine idling, gunshots, jackhammer, siren, and street music; e.g., the classification is the acoustic metadata; SALAMON, p. 280, section II.A and p. 281, section II.C; the HSU-SALAMON combination now utilizes the SALAMON classifier to classify environmental sounds in the input speech signal of HSU, HSU, p. 16, section 1 and p. 19, section 5.2 and SALAMON, p. 280, section II.A and p. 281, section II.C; the examiner notes that the broadest reasonable interpretation of “extracting acoustic metadata” includes determining properties of the signal without exposing or describing any speech content, such as determining if the signal has speech content, noise content, etc.; para. 0044 in instant specification)

	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the environmental sound classification teachings of SALAMON and HSU.  As disclosed in SALAMON, one of ordinary skill would be motivated to detect and classify environmental sounds for applications such as context aware computing, surveillance, and noise mitigation, e.g., understand the type of environmental noise so it can be filtered out/removed.  (p. 279, section 1). As disclosed in SALAMON, one of ordinary skill would further be motivated to apply the teachings of SALAMON to HSU because SALAMON also pertains to data augmentation to create a larger dataset of environmental sounds to create a training dataset. (p. 279, section 1).

Regarding claim 3, the HSU-SALAMON combination discloses the computer-implemented method of claim 2, including the step “wherein generating, via the machine learning model, the one or more augmentations of the feature-based voice data includes” (see claim 1).  The HSU-SALAMON combination makes obvious:
generating, via the machine learning model, the one or more feature-based augmentations of the feature-based voice data based upon, at least in part, the feature-based voice data, the one or more data augmentation characteristics, and the acoustic metadata. (SALAMON discloses classifying an audio signal into environmental sound classes, e.g., the classification is the acoustic metadata; SALAMON, p. 280, section II.A and p. 281, section II.C; SALAMON further discloses mixing samples with background noise sounds; SALAMON, p. 281, section II.B; HSU discloses a seq2seq LSTM VAE for performing feature-based data augmentation, e.g., augmentation utilizing latent nuisance representations from noisy backgrounds like on a bus, in a café, street junction or pedestrian area, on speech input data converted to frames represented by filter bank coefficients; HSU, p. 16, section 1 and p. 19, sections 5.3 and 6; the HSU-SALAMON combination now performs feature-based data augmentation, e.g., augmentation utilizing latent nuisance representations from noisy backgrounds like on a bus, in a café, street junction or pedestrian area, on speech input data, and mixes in background noises identified by the SALAMON classifier, e.g., acoustic metadata, to generate additional augmentations; HSU, pp. 16-17, sections 1, 2, p. 19, sections 5.3 and 6, pp. 20-21, sections 6.2-6.4 and Table 1 with SALAMON, pp. 280-281, sections II.A-C)

Regarding claim 5, HSU discloses the computer-implemented method of claim 1, including the step “wherein generating, via the machine learning model, the one or more augmentations of the feature-based voice data includes” (see claim 1).  However, HSU fails to explicitly teach:
rate-based augmentations

However, in a related field of endeavor, SALAMON discloses utilizing data augmentation to increase a particular environmental sound dataset, including utilizing time stretching, e.g., slowing down or speeding up an audio sample while keeping pitch unchanged.  (pp. 279-280, section 1, p. 281, section II.B).

The HSU-SALAMON combination makes obvious:
performing, via the machine learning model, one or more rate-based augmentations on at least a portion of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics. (SALAMON discloses data augmentation using time stretching, e.g., slowing down or speeding up an audio sample while keeping pitch unchanged; SALAMON, p. 281, section II.B; HSU discloses a seq2seq LSTM VAE for performing data augmentation, e.g., augmentation utilizing latent nuisance representations on speech input data converted to frames represented by filter bank coefficients; HSU, p. 16, section 1 and p. 19, sections 5.3 and 6; the HSU-SALAMON combination now performs data augmentation as in HSU, and additionally time-stretches the data augmentation output of HSU as disclosed in SALAMON to create additional data augmentations; HSU, pp. 16-17, sections 1, 2, p. 19, sections 5.3 and 6, pp. 20-21, sections 6.2-6.4 and Table 1 with SALAMON, pp. 280-281, sections II.A-C)
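For illustration only, a rate change applied to frame-level features can be sketched as a resampling of the frame sequence along the time axis. This is a simplified stand-in chosen for this illustration and is not the time-stretching implementation of SALAMON or any processing disclosed in HSU; the function name stretch_frames, the linear interpolation, and the example values are assumptions.

```python
from typing import List

Frame = List[float]


def stretch_frames(frames: List[Frame], rate: float) -> List[Frame]:
    """Resample a frame sequence along time by linear interpolation.

    rate > 1 shortens the sequence (faster); rate < 1 lengthens it (slower).
    Feature dimensions within each frame are left as they are, so only the
    time axis is rescaled.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not frames:
        return []
    n_out = max(1, round(len(frames) / rate))
    last = len(frames) - 1
    out: List[Frame] = []
    for i in range(n_out):
        pos = min(i * rate, float(last))  # fractional source frame index
        lo = int(pos)
        hi = min(lo + 1, last)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Hypothetical single-band feature track over 4 frames.
track = [[0.0], [1.0], [2.0], [3.0]]
faster = stretch_frames(track, 2.0)  # half as many frames
slower = stretch_frames(track, 0.5)  # twice as many frames
```

Applying several rates to the same input yields several variants that differ in duration but not in the per-frame feature dimensions.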

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the environmental sound classification teachings of SALAMON with HSU.  As disclosed in SALAMON, one of ordinary skill would be motivated to detect and classify environmental sounds for applications such as context aware computing, surveillance, and noise mitigation, e.g., understand the type of environmental noise so it can be filtered out/removed.  (p. 279, section 1). As disclosed in SALAMON, one of ordinary skill would further be motivated to apply the teachings of SALAMON to HSU because SALAMON also pertains to data augmentation to create a larger dataset of environmental sounds to create a training dataset. (p. 279, section 1).

Claim 9 depends from claim 8 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 2 and therefore claim 9 is rejected under the same grounds as claims 2 and 8 above.
Claim 10 depends from claim 9 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 3 and therefore claim 10 is rejected under the same grounds as claims 3 and 9 above.
Claim 12 depends from claim 8 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 5 and therefore claim 12 is rejected under the same grounds as claims 5 and 8 above.
Claim 16 depends from claim 15 and claims a computing system having a processor configured to perform a process that corresponds to the computer-implemented method of claim 2 and therefore claim 16 is rejected under the same grounds as claims 2 and 15 above.
Claim 17 depends from claim 16 and claims a computing system having a processor configured to perform a process that corresponds to the computer-implemented method of claim 3 and therefore claim 17 is rejected under the same grounds as claims 3 and 16 above.
Claim 19 depends from claim 15 and claims a computing system having a processor configured to perform a process that corresponds to the computer-implemented method of claim 5 and therefore claim 19 is rejected under the same grounds as claims 5 and 15 above.

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over HSU in view of Laput, et al., US 20200051544 A1, hereinafter referenced as LAPUT.

Regarding claim 7, HSU discloses the computer-implemented method of claim 1, including the step “wherein generating, via the machine learning model, the one or more augmentations of the feature-based voice data includes” (see claim 1).  However, HSU fails to explicitly teach:
performing, via the machine learning model, one or more reverberation-based augmentations on at least a portion of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics.

However, in a related field of endeavor, LAPUT discloses various techniques for performing data augmentation to generate an improved training data set assembled from augmented sound effects, including reverberation effects.  (paras. 0006, 0034).  Reverberation sound effects can include sounds reverberated across a variety of physical spaces (e.g., bathroom, atrium, kitchen, office, exterior).  (para. 0034).

The HSU-LAPUT combination makes obvious:
performing, via the machine learning model, one or more reverberation-based augmentations on at least a portion of the feature-based voice data based upon, at least in part, the feature-based voice data and the one or more data augmentation characteristics. (LAPUT discloses capturing reverberation sound effects from particular rooms, such as bathroom, atrium, kitchen, workshop, small office; LAPUT, para. 0034; HSU discloses a seq2seq LSTM VAE for performing data augmentation, e.g., augmentation utilizing latent nuisance representations from noisy backgrounds recorded in specific locations including on a bus, in a café, street junction or pedestrian area, on speech input data converted to frames represented by filter bank coefficients; HSU, p. 16, section 1 and p. 19, sections 5.3 and 6; pp. 20-21, sections 6.2-6.4 and Table 1; the HSU-LAPUT combination now records additional background noise in spaces where reverberation sound effects can be captured, such as in a bathroom, atrium, kitchen, or small office as disclosed in LAPUT, and now adds those scenes to the bus, café, street junction, and pedestrian area scenes as in HSU, and performs data augmentation on a dataset that now includes reverberation sound effects; HSU, p. 16, section 1 and p. 19, sections 5.3 and 6; pp. 20-21, sections 6.2-6.4 and Table 1 with LAPUT, para. 0034).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to apply the teachings of LAPUT to HSU.  As disclosed in LAPUT, one of ordinary skill would be motivated to utilize the teachings of LAPUT to generate an augmented set of sound effects to generate an improved training data set for a machine learning model. (paras. 0003, 0006).  One of ordinary skill would further be motivated to apply the teachings of LAPUT to HSU because LAPUT explains that a class of persistence augmentation includes reverberations and non-linear dampening and further explains types of rooms, e.g., bathroom, atrium, kitchen, where reverberation sound effect samples can be captured.  (paras. 0034, 0047-0050).

Claim 14 depends from claim 8 and claims a computer program product having instructions that when executed correspond to the computer-implemented method of claim 7 and therefore claim 14 is rejected under the same grounds as claims 7 and 8 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Nishizaki, Hiromitsu. "Data augmentation and feature extraction using variational autoencoder for acoustic modeling." 2017 (APSIPA ASC). IEEE, 2017, pp. 1-6.  Discloses a data augmentation and feature extraction method using a variational autoencoder for acoustic modeling.
US 20180342258 A1 (Huffman et al.) discloses a system and method for creating timbres, including by using a generative adversarial neural network.  (paras. 0091-0092).
US 20200335086 A1 (Paraskevopoulos et al.) discloses speech data augmentation using Generative Adversarial Networks (GANs).  (see para. 0029).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL C LEE whose telephone number is (571)272-4933. The examiner can normally be reached M-F 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MICHAEL C. LEE/Examiner, Art Unit 2655          

/ANDREW C FLANDERS/Supervisory Patent Examiner, Art Unit 2655