DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 7, 13 and 17 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant has amended the claims to include “training a neural multi-speaker generative model.” A new search was made, and art was found in Fan, which teaches modeling multiple speakers in DNN-based TTS synthesis with a general deep neural network (DNN), where the same hidden layers are shared among different speakers while the output layers are composed of speaker-dependent nodes explaining the target of each speaker. The experimental results show that a significant improvement in the quality of synthesized speech is achieved, see abstract. In DNN-based TTS synthesis, the DNN is used as a regression model for linguistic and acoustic feature mapping. The DNN can be viewed as a layer-structured model that jointly learns a complicated linguistic feature transformation in the hidden layers and a speaker-specific acoustic space in the regression layer. With this structural understanding, the DNN can be decomposed into two parts (linguistic transformation and acoustic regression) to benefit DNN-based TTS synthesis from multi-speaker data and to solve the adaptation problem through a shared hidden representation. The shared linguistic feature transformation can even be transferred to a new speaker, which is a derivative of transfer learning, see section I introduction.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Ohtani U.S. PAP 2017/0076715 A1 in view of Fan “Multi-Speaker Modeling and Speaker Adaptation for DNN-based TTS Synthesis”.

Regarding claim 1, Ohtani teaches a computer-implemented method for synthesizing audio from an input text (training apparatus for speech synthesis, see abstract), comprising:
and using a neural multi-speaker generative model comprising a second set of trained model parameters, the input text, and the speaker embedding for the new speaker generated by the speaker encoder model comprising the first set of trained model parameters to generate a synthesized audio representation for the input text in which the synthesized audio includes speech characteristics of the new speaker (the editing part edits the target speaker acoustic model by adding speaker characteristics represented by the perception representation score information, the synthesizing part receives the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]),
wherein the neural multi-speaker generative model comprising the second set of trained parameters was trained using as inputs, for a speaker, (1) a training set of text-audio pairs, in which a text-audio pair comprises a text and a corresponding audio of that text by the speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]).

In the same field of endeavor, Fan teaches modeling multiple speakers in DNN-based TTS synthesis with a general deep neural network (DNN), where the same hidden layers are shared among different speakers while the output layers are composed of speaker-dependent nodes explaining the target of each speaker. The experimental results show that a significant improvement in the quality of synthesized speech is achieved, see abstract. In DNN-based TTS synthesis, the DNN is used as a regression model for linguistic and acoustic feature mapping. The DNN can be viewed as a layer-structured model that jointly learns a complicated linguistic feature transformation in the hidden layers and a speaker-specific acoustic space in the regression layer. With this structural understanding, the DNN can be decomposed into two parts (linguistic transformation and acoustic regression) to benefit DNN-based TTS synthesis from multi-speaker data and to solve the adaptation problem through a shared hidden representation. The shared linguistic feature transformation can even be transferred to a new speaker, which is a derivative of transfer learning, see section I introduction. Speaker adaptation is built on top of a well-trained multi-speaker system. The training procedure for adaptation is quite straightforward. Because training data for adaptation is very limited, the hidden layers transferred from multiple speakers’ data should be fixed and only the regression layer is updated. Considering there is only a linear regression between the shared hidden layers’ output and target, parameter estimation is much 
It would have been obvious to one of ordinary skill in the art to combine the Ohtani invention with the teachings of Fan for the benefit of improving the quality of synthesized speech output, see abstract. 
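For illustration only, the arrangement attributed to Fan above (hidden layers shared among speakers, a speaker-dependent linear regression layer per speaker, and adaptation to a new speaker by updating only that regression layer while the shared layers stay fixed) may be sketched as follows. This is a minimal toy sketch and not Fan's implementation: the class name `SharedHiddenTTS`, the function `mse`, the layer sizes, the single tanh hidden layer, the learning rate, and the sample data are all assumptions chosen for readability.

```python
import copy
import math
import random


class SharedHiddenTTS:
    """Toy multi-speaker regressor: one shared hidden layer, one linear head per speaker."""

    def __init__(self, n_in, n_hidden, n_out, speakers, seed=0):
        rng = random.Random(seed)
        self.n_hidden, self.n_out = n_hidden, n_out
        # Shared "linguistic transformation" used by every speaker (hypothetical sizes).
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        # Speaker-dependent "acoustic regression" layers; last column is the bias.
        self.heads = {s: [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                          for _ in range(n_out)] for s in speakers}

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.w_hidden]

    def predict(self, speaker, x):
        h = self.hidden(x) + [1.0]  # constant 1.0 feeds the bias weight
        return [sum(w * hi for w, hi in zip(row, h)) for row in self.heads[speaker]]

    def adapt(self, speaker, data, lr=0.05, epochs=500):
        """Add a head for a new speaker; the shared hidden layer is never modified."""
        self.heads[speaker] = [[0.0] * (self.n_hidden + 1) for _ in range(self.n_out)]
        for _ in range(epochs):
            for x, y in data:
                h = self.hidden(x) + [1.0]
                pred = self.predict(speaker, x)
                for k, row in enumerate(self.heads[speaker]):
                    err = pred[k] - y[k]
                    for j in range(len(row)):
                        row[j] -= lr * err * h[j]  # gradient step on squared error


def mse(model, speaker, data):
    """Mean squared error of one speaker's head over (input, target) pairs."""
    total = sum((p - t) ** 2 for x, y in data
                for p, t in zip(model.predict(speaker, x), y))
    return total / (len(data) * model.n_out)


# Usage: two training speakers, then a new speaker with only two samples.
model = SharedHiddenTTS(n_in=3, n_hidden=4, n_out=2, speakers=["spk_a", "spk_b"])
new_data = [([1.0, 0.0, 0.5], [0.2, -0.1]), ([0.0, 1.0, -0.5], [-0.3, 0.4])]
snapshot = copy.deepcopy(model.w_hidden)
model.adapt("spk_new", new_data)
print(model.w_hidden == snapshot)  # shared layer is untouched by adaptation
```

Because only a linear layer is estimated for the new speaker, a small amount of adaptation data suffices in this sketch; the shared hidden weights are identical before and after `adapt`.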
Regarding claim 2, Ohtani teaches the computer-implemented method of claim 1 wherein the first set of trained model parameters for the neural speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model were obtained by performing the steps comprising: 
training the neural multi-speaker generative model, using as inputs, for a speaker, the training set of text-audio pairs and a speaker embedding corresponding to the speaker identifier for that speaker, to obtain the second set of trained model parameters for the neural multi-speaker generative model and to obtain a set of speaker embeddings corresponding to the speaker identifiers (figure 1 illustrates a training apparatus whose acquisition part obtains an acoustic model, training speaker information, and the perception representation score information, see par. [0016]); 

Regarding claim 7 Ohtani teaches a generative text-to-speech system (training apparatus for speech synthesis, see abstract) comprising: 
one or more processors (processor, see par. [0113]); 
and a non-transitory computer-readable medium or media comprising one or more sequences of instructions which (storage medium, see par. [0108]), when executed by at least one of the one or more processors, causes steps to be performed comprising: 
and using a neural multi-speaker generative model comprising a second set of trained model parameters, the input text, and the speaker embedding for the new speaker generated by the speaker encoder model comprising the first set of trained model parameters to generate a synthesized audio representation for the input text in which the synthesized audio includes speech characteristics of the new speaker (the editing part edits the target speaker acoustic model by adding speaker characteristics represented by the perception representation score information, the synthesizing part receives the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]),
wherein the neural multi-speaker generative model comprising the second set of trained parameters was trained using as inputs, for a speaker, (1) a training set of text-audio pairs, in which a text-audio pair comprises a text and a corresponding audio of that text by the speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]).
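However, Ohtani does not teach training a neural multi-speaker generative model or using the neural multi-speaker generative model; nor does it teach, given a limited set of one or more audios of a new speaker that was not part of training data used to train a neural multi-speaker generative model, using a speaker encoder model comprising a first set of trained model parameters to obtain a speaker embedding for the new speaker given the limited set of one or more audios as an input to the speaker encoder model. 
In the same field of endeavor, Fan teaches modeling multiple speakers in DNN-based TTS synthesis with a general deep neural network (DNN), where the same hidden layers are shared among different speakers while the output layers are composed of speaker-dependent nodes explaining the target of each speaker. The experimental results show that a significant improvement in the quality of synthesized speech is achieved, see abstract. In DNN-based TTS synthesis, the DNN is used as a regression model for linguistic and acoustic feature mapping. The DNN can be viewed as a layer-structured model that jointly learns a complicated linguistic feature transformation in the hidden layers and a speaker-specific acoustic space in the regression layer. With this structural understanding, the DNN can be decomposed into two parts (linguistic transformation and acoustic regression) to benefit DNN-based TTS synthesis from multi-speaker data and to solve the adaptation problem through a shared hidden representation. The shared linguistic feature transformation can even be transferred to a new speaker, which is a derivative of transfer learning, see section I introduction.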
It would have been obvious to one of ordinary skill in the art to combine the Ohtani invention with the teachings of Fan for the benefit of improving the quality of synthesized speech output, see abstract. 

Regarding claim 8, Ohtani teaches the generative text-to-speech system of claim 7 wherein the first set of trained model parameters for the speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model were obtained by performing the steps comprising: 
training the neural multi-speaker generative model, using as inputs, for a speaker, the training set of text-audio pairs and a speaker embedding corresponding to the speaker identifier for that speaker, to obtain the second set of trained model parameters for the neural multi-speaker generative model and to obtain a set of speaker embeddings corresponding to the speaker identifiers (figure 1 illustrates a training apparatus whose acquisition part obtains an acoustic model, training speaker information, and the perception representation score information, see par. [0016]); 
and training the speaker encoder model, using a set of audios selected from the training set of text-audio pairs and corresponding speaker embeddings for the speakers of the set of audios from the set of speaker embeddings, to obtain the first set of trained model parameters for the speaker encoder model (the perception representation model is trained by the training part for each perception representation of each training speaker, and trains the perception representation model from the standard acoustic model and voice features of the training speaker represented by the training speaker information, see par. [0044]). 


Claims 5, 6 and 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ohtani U.S. PAP 2017/0076715 A1, in view of Fan “Multi-Speaker Modeling and Speaker Adaptation for DNN-based TTS Synthesis”, further in view of Chun U.S. PAP 2018/0268806 A1.

Regarding claim 5, Ohtani does not teach the computer-implemented method of claim 1 wherein the first set of trained model parameters for the neural speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model were obtained by performing the steps comprising: 
performing joint training of the neural multi-speaker generative model and the neural speaker encoder model to obtain the first set of trained model parameters for the speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model by comparing synthesized audios generated by the neural multi-speaker generative model using speaker embeddings from the speaker encoder model to ground truth audios corresponding to the synthesized audios.
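In the same field of endeavor, Chun teaches text-to-speech synthesis using an encoder, see abstract. Chun further teaches performing joint training of the neural multi-speaker generative model and the speaker encoder model to obtain the first set of trained model parameters for the speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model by comparing synthesized audios generated by the neural multi-speaker generative model using speaker embeddings from the speaker encoder model to ground truth audios corresponding to the synthesized audios (The linguistic encoder and the acoustic encoder are both trained to generate speech unit representations for a speech unit based on different types of input. The linguistic encoder is trained to generate speech unit representations based on linguistic information. The acoustic encoder is trained to generate speech unit representations based on acoustic information, such as feature vectors that describe audio characteristics of the speech unit. The autoencoder network is trained to minimize a distance between the speech unit representations generated by the linguistic encoder and the acoustic encoder, see par. [0005]).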
It would have been obvious to one of ordinary skill in the art to combine the Ohtani invention with the teachings of Chun for the benefit of minimizing the differences between speech unit representations of the two encoders, see par. [0013].
Regarding claim 6 Chun teaches the computer-implemented method of claim 1 wherein the neural speaker encoder model comprises a neural network architecture comprising: 
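a spectral processing network component that computes a spectral audio representation for input audio and passes the spectral audio representation to a prenet component comprising one or more fully-connected layers with one or more non-linearity units for feature transformation (TTS system seeks to determine a spectral match between consecutive candidate diphone embeddings corresponding to consecutive layers in the lattice. The TTS system seeks to match energy and loudness between consecutive candidate diphone embeddings corresponding to consecutive layers, see par. [0073]); 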
a temporal processing network component in which temporal contexts are incorporated using a plurality of convolutional layers with gated linear unit and residual connections (a network with a temporal bottleneck layer can represent each unit of the database with a single embedding. An embedding may be generated so that the embedding satisfies some basic conditions for it to be useful for unit-selection, see par. [0084]); 
and a cloning sample attention network component comprising a multi-head self-attention mechanism that determines weights for different audios and obtains aggregated speaker embeddings (The network can be trained using back-propagation through time with a stochastic gradient descent. Additionally, the network can use a squared error cost at the output of the decoder. Since the output of the encoder is only taken at the end of a unit, error back-propagation is truncated at unit boundaries. Specifically, the error back-propagation truncates on a fixed number of frames, which may result in weight updates that do not account for the start of a unit, see par. [0091]).
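Regarding claim 11, Ohtani does not teach the generative text-to-speech system of claim 7 wherein the first set of trained model parameters for the neural speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model were obtained by performing the steps comprising: performing joint training of the neural multi-speaker generative model and the neural speaker encoder model to obtain the first set of trained model parameters for the speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model by comparing synthesized audios generated by the neural multi-speaker generative model using speaker embeddings from the speaker encoder model to ground truth audios corresponding to the synthesized audios.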
In the same field of endeavor, Chun teaches text-to-speech synthesis using an encoder, see abstract. Chun further teaches performing joint training of the neural multi-speaker generative model and the speaker encoder model to obtain the first set of trained model parameters for the speaker encoder model and the second set of trained model parameters for the neural multi-speaker generative model by comparing synthesized audios generated by the neural multi-speaker generative model using speaker embeddings from the speaker encoder model to ground truth audios corresponding to the synthesized audios (The linguistic encoder and the acoustic encoder are both trained to generate speech unit representations for a speech unit based on different types of input. The linguistic encoder is trained to generate speech unit representations based on linguistic information. The acoustic encoder is trained to generate speech unit representations based on acoustic information, such as feature vectors that describe audio characteristics of the speech unit. The autoencoder network is trained to minimize a distance between the speech unit representations generated by the linguistic encoder and the acoustic encoder, see par. [0005]. The encoder includes a neural network that was trained as part of an autoencoder network that includes the encoder, a second encoder, and a decoder. The encoder is arranged to produce speech unit representations in response to receiving data indicating linguistic units. The second encoder is arranged to produce speech unit representations in response to receiving data indicating acoustic features of speech units; the encoder, the second encoder, and the decoder are trained jointly using a cost function configured to minimize (i) differences between acoustic 
It would have been obvious to one of ordinary skill in the art to combine the Ohtani invention with the teachings of Chun for the benefit of minimizing the differences between speech unit representations of the two encoders, see par. [0013].
Regarding claim 12, Chun teaches the generative text-to-speech system of claim 7 wherein the neural speaker encoder model comprises a neural network architecture comprising: 
a spectral processing network component that computes a spectral audio representation for input audio and passes the spectral audio representation to a prenet component comprising one or more fully-connected layers with one or more non-linearity units for feature transformation (TTS system seeks to determine a spectral match between consecutive candidate diphone embeddings corresponding to consecutive layers in the lattice. The TTS system seeks to match energy and loudness between consecutive candidate diphone embeddings corresponding to consecutive layers, see par. [0073]); 
a temporal processing network component in which temporal contexts are incorporated using a plurality of convolutional layers with gated linear unit and residual connections (a network with a temporal bottleneck layer can represent each unit of the database with a single embedding. An embedding may be generated so that the embedding satisfies some basic conditions for it to be useful for unit-selection, see par. [0084]); 

Regarding claim 13 Ohtani teaches a computer-implemented method for synthesizing audio from an input text, comprising: 
receiving a limited set of one or more texts and corresponding ground truth audios of a new speaker that was not part of training data used to train a neural multi-speaker generative model, which training results in speaker embedding parameters for a set of speaker embeddings, in which a speaker embedding is a low-dimension representation of speaker characteristics of a speaker (the perception representation model was trained using a training apparatus which includes training speaker information 102, see par. [0094]; the acoustic model 105 may be an acoustic model of a training speaker that is utilized for training of the perception representation model, an acoustic model of a speaker that is not utilized for training, or the average voice model M0, see par. [0096]);
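inputting the limited set of one or more texts and corresponding ground truth audios for the new speaker and at least one or more of the speaker embeddings comprising speaker embedding parameters into the neural multi-speaker generative model comprising pre-trained model parameters or trained model parameters (the editing part edits the target speaker acoustic model by adding speaker characteristics represented by the perception representation score information, the synthesizing part receives the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]); 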
and using the neural multi-speaker generative model comprising trained model parameters, the input text, and the speaker embedding for the new speaker to generate a synthesized audio representation for the input text in which the synthesized audio includes speaker characteristics of the new speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]). 
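However, Ohtani does not teach training a neural multi-speaker generative model or using the neural multi-speaker generative model; nor does it teach, given a limited set of one or more audios of a new speaker that was not part of training data used to train a neural multi-speaker generative model, using a speaker encoder model comprising a first set of trained model parameters to obtain a speaker embedding for the new speaker given the limited set of one or more audios as an input to the speaker encoder model. 
In the same field of endeavor, Fan teaches modeling multiple speakers in DNN-based TTS synthesis with a general deep neural network (DNN), where the same hidden layers are shared among different speakers while the output layers are composed of speaker-dependent nodes explaining the target of each speaker. The experimental results show that a significant improvement in the quality of synthesized speech is achieved, see abstract. In DNN-based TTS synthesis, the DNN is used as a regression model for linguistic and acoustic feature mapping. The DNN can be viewed as a layer-structured model that jointly learns a complicated linguistic feature transformation in the hidden layers and a speaker-specific acoustic space in the regression layer. With this structural understanding, the DNN can be decomposed into two parts (linguistic transformation and acoustic regression) to benefit DNN-based TTS synthesis from multi-speaker data and to solve the adaptation problem through a shared hidden representation. The shared linguistic feature transformation can even be transferred to a new speaker, which is a derivative of transfer learning, see section I introduction.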

However, Ohtani in view of Fan does not teach using a comparison of a synthesized audio generated by the neural multi-speaker generative model to its corresponding ground truth audio to adjust at least some of the speaker embedding parameters to obtain a speaker embedding that represents speaker characteristics of the new speaker.
In the same field of endeavor, Chun teaches a text-to-speech system that includes an encoder trained as part of an autoencoder network. The encoder is configured to receive linguistic information for a speech unit, such as an identifier for a phone or diphone, and generate an output indicative of acoustic characteristics of the speech unit in response (embeddings). To select a speech unit to use in unit-selection speech synthesis, an identifier of a linguistic unit can be provided as input to the encoder. The resulting output of the encoder can be used to retrieve candidate speech units from a corpus of speech units (ground truths). For example, a vector that includes at least the output of the encoder can be compared with vectors comprising the encoder outputs for speech units in the corpus, see par. [0004]. Through training, the linguistic encoder 114 learns to produce a speech unit representation or "embedding" for a linguistic unit. The linguistic encoder 114 receives data indicating a linguistic unit, such as a phoneme, and provides an embedding representing acoustic characteristics that express the linguistic unit. The embeddings provided by the linguistic encoder 114 each have the same fixed size, even though they may represent speech units of different sizes. After training, the linguistic encoder 114 is able to produce embeddings that encode acoustic information from linguistic information alone. This allows the linguistic encoder 114 to receive data specifying a linguistic unit and produce an embedding that represents the audio characteristics for a speech unit that would be appropriate to express the linguistic unit, see par. [0027].
It would have been obvious to one of ordinary skill in the art to combine the invention of Ohtani in view of Fan with the teachings of Chun for the benefit of minimizing the differences between speech unit representations of the two encoders, see par. [0013].

Regarding claim 14, Ohtani teaches the computer-implemented method of claim 13 wherein: the neural multi-speaker generative model was trained using as inputs, for a speaker: (1) a training set of text-audio pairs, in which a text-audio pair comprises a text and a corresponding audio of that text spoken by the speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]). 
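Regarding claim 15, Chun teaches the computer-implemented method of claim 13 wherein the steps of using a comparison of a synthesized audio generated by the neural multi-speaker generative model to its corresponding ground truth audio to adjust: at least some of the speaker embedding parameters to obtain a speaker embedding that represents speaker characteristics of the new speaker (After training, the linguistic encoder 114 is able to produce embeddings that encode acoustic information from linguistic information alone. This allows the linguistic encoder 114 to receive data specifying a linguistic unit and produce an embedding that represents the audio characteristics for a speech unit that would be appropriate to express the linguistic unit, see par. [0027]); 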
and at least some of the pre-trained model parameters of the neural multi-speaker generative model to obtain the trained model parameters (the TTS system 102 obtains training data from the data storage 104. The training data can include many different speech units representing many different linguistic units. The training data can also include speech from multiple speakers, see par. [0030]). 
Regarding claim 16 Chun teaches the computer-implemented method of claim 13 wherein a speaker embedding is correlated to a speaker identity via a look-up table (the TTS system 102 may use a lookup table or other data structure to determine the linguistic unit identifier for a linguistic unit, see par. [0067]). 
Regarding claim 17 Ohtani teaches a generative text-to-speech system comprising 
one or more processors (processor, see par. [0113]); 
and a non-transitory computer-readable medium or media comprising one or more sequences of instructions which (storage medium, see par. [0108]), when executed by at least one of the one or more processors, causes steps to be performed comprising: 

inputting the limited set of one or more texts and corresponding ground truth audios for the new speaker and at least one or more of the speaker embeddings comprising speaker embedding parameters into the neural multi-speaker generative model comprising pre-trained model parameters or trained model parameters (the editing part edits the target speaker acoustic model by adding speaker characteristics represented by the perception representation score information, the synthesizing part receives the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]); 
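and using the neural multi-speaker generative model comprising trained model parameters, the input text, and the speaker embedding for the new speaker to generate a synthesized audio representation for the input text in which the synthesized audio includes speaker characteristics of the new speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]). 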
However, Ohtani does not teach training a neural multi-speaker generative model or using the neural multi-speaker generative model; nor does it teach, given a limited set of one or more audios of a new speaker that was not part of training data used to train a neural multi-speaker generative model, using a speaker encoder model comprising a first set of trained model parameters to obtain a speaker embedding for the new speaker given the limited set of one or more audios as an input to the speaker encoder model. 
In the same field of endeavor, Fan teaches modeling multiple speakers in DNN-based TTS synthesis with a general deep neural network (DNN), where the same hidden layers are shared among different speakers while the output layers are composed of speaker-dependent nodes explaining the target of each speaker. The experimental results show that a significant improvement in the quality of synthesized speech is achieved, see abstract. In DNN-based TTS synthesis, the DNN is used as a regression model for linguistic and acoustic feature mapping. The DNN can be viewed as a layer-structured model that jointly learns a complicated linguistic feature transformation in the hidden layers and a speaker-specific acoustic space in the regression layer. With this structural understanding, the DNN can be decomposed into two parts (linguistic transformation and acoustic regression) to benefit DNN-based TTS synthesis from multi-speaker data and to solve the adaptation problem through a shared hidden representation. The shared linguistic feature transformation can even be transferred to a new speaker, which is a derivative of transfer learning, see section I introduction. Speaker adaptation is built on top of a well-trained multi-speaker system.
It would have been obvious to one of ordinary skill in the art to combine the Ohtani invention with the teachings of Fan for the benefit of improving the quality of synthesized speech output, see abstract. 
However, Ohtani in view of Fan does not teach using a comparison of a synthesized audio generated by the neural multi-speaker generative model to its corresponding ground truth audio to adjust at least some of the speaker embedding parameters to obtain a speaker embedding that represents speaker characteristics of the new speaker.
In the same field of endeavor, Chun teaches a text-to-speech system that includes an encoder trained as part of an autoencoder network. The encoder is configured to receive linguistic information for a speech unit, such as an identifier for a phone or diphone, and generate an output indicative of acoustic characteristics of the speech unit in response (embeddings). To select a speech unit to use in unit-selection speech synthesis, an identifier of a linguistic unit can be provided as input to the encoder. The resulting output of the encoder can be used to retrieve candidate speech units from a corpus of speech units (ground truths). For example, a vector that includes at least the output of the encoder can be compared with vectors comprising the encoder outputs for speech units in the corpus, see par. [0004]. Through training, the linguistic encoder 114 learns to produce a speech unit representation or "embedding" for a linguistic unit. The linguistic encoder 114 receives data indicating a linguistic unit, such as a phoneme, and provides an embedding representing acoustic characteristics that express the linguistic unit. The embeddings provided by the linguistic encoder 114 each have the same fixed size, even though they may represent speech units of different sizes. After training, the linguistic encoder 114 is able to produce embeddings that encode acoustic information from linguistic information alone. This allows the linguistic encoder 114 to receive data specifying a linguistic unit and produce an embedding that represents the audio characteristics for a speech unit that would be appropriate to express the linguistic unit, see par. [0027].
It would have been obvious to one of ordinary skill in the art to combine the invention of Ohtani in view of Fan with the teachings of Chun for the benefit of minimizing the differences between speech unit representations of the two encoders, see par. [0013].
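Regarding claim 18, Ohtani teaches the generative text-to-speech system of claim 17 wherein: the neural multi-speaker generative model was trained using as inputs, for a speaker: (1) a training set of text-audio pairs, in which a text-audio pair comprises a text and a corresponding audio of that text spoken by the speaker (training speaker information is stored with association of acoustic data, language data for each training speaker, see par. [0033]), and (2) a speaker embedding corresponding to a speaker identifier for that speaker (the target speaker acoustic model with the speaker characteristics from the editing part and the text from the input part and performs speech synthesis, see par. [0097-0099]). 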
Regarding claim 19 Chun teaches the generative text-to-speech system of claim 17 wherein the steps of using a comparison of a synthesized audio generated by the neural multi-speaker generative model to its corresponding ground truth audio to adjust at least some of the speaker embedding parameters to obtain a speaker embedding that represents speaker characteristics of the new speaker further comprises: 
using a comparison of a synthesized audio generated by the neural multi-speaker generative model to its corresponding ground truth audio to adjust: at least some of the speaker embedding parameters to obtain a speaker embedding that represents speaker characteristics of the new speaker (After training, the linguistic encoder 114 is able to produce embeddings that encode acoustic information from linguistic information alone. This allows the linguistic encoder 114 to receive data specifying a linguistic unit and produce an embedding that represents the audio characteristics for a speech unit that would be appropriate to express the linguistic unit, see par. [0027]); 
and at least some of the pre-trained model parameters of the neural multi-speaker generative model to obtain the trained model parameters (the TTS system 102 obtains training data from the data storage 104. The training data can include many different speech units representing many different linguistic units. The training data can also include speech from multiple speakers, see par. [0030]). 
Regarding claim 20 Chun teaches the generative text-to-speech system of claim 17 wherein the neural multi-speaker generative model comprises: 
an encoder, which converts textual features of an input text into learned representations (The encoder is configured to receive linguistic information for a speech unit, see par. [0004]); 
and a decoder, which decodes the learned representations with a multi-hop convolutional attention mechanism into low-dimensional audio representation (The decoder is arranged to generate output indicating acoustic features of speech units in response to receiving speech unit representations for the speech units from the encoder or the second encoder; the decoder includes one or more long short-term memory layers, see par. [0011-0012]). 
Allowable Subject Matter
Claims 3, 4, 9 and 10 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Pertinent prior art is listed on form 892.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez whose telephone number is (571)270-3711. The examiner can normally be reached Monday-Friday, 9AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.







/MICHAEL ORTIZ-SANCHEZ/Primary Examiner, Art Unit 2656