Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are pending. Claims 1, 9 and 17 are independent.
This Application was published as U.S. 2021/0280202.
            Apparent priority: 25 September 2020.


	The instant Application appears to be directed to a speech synthesizer with speech-to-speech conversion capabilities: receiving the speech of a first speaker and applying the timbre of the speech of a second speaker to the voice of the first speaker, in order to arrive at speech that has the pitch of the first speaker's voice but the timbre of the second speaker's voice.  Labeling the acoustic features in the Figures with their actual names, rather than as first, second … sixth, would be clearer.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 9, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kim (U.S. 20200082806).
Regarding Claim 1, Kim teaches:
1. A voice conversion method, 
comprising: 
acquiring a source speech of a first user and a reference speech of a second user; [Kim, Figure 8, “Input Speech of First Language” / “source speech of a first user” is input to the “speech translation system 800.”  From this speech, several types of speech/sound-related parameters are extracted.  However, the written description of this drawing indicates that such sound/speech-related parameters may be extracted from the input speech of another speaker, which teaches the input of “a reference speech of a second user” of the Claim. “[0075] In FIG. 8, it is shown that all of the articulatory feature, the emotion feature, and the prosody feature are extracted from the input speech of the first language to synthesize a speech. However, the present invention is not limited thereto. In another embodiment, at least one of the articulatory feature, the emotion feature, and the prosody feature may be extracted from an input speech of another speaker. For example, the emotion feature and the prosody feature may be extracted from the input speech of the first language, and the articulatory feature may be extracted from another input speech (e.g., a celebrity's speech) to synthesize a speech. In this case, the emotion and prosody of the speaker who utters the input speech of the first language are reflected in the synthesized speech, but the voice of a speaker (e.g., a celebrity) who utters another input speech may be reflected.”]
extracting first speech content information and a first acoustic feature from the source speech; [Kim, Figure 8, “speech recognizer 810” generating “input text of first language” as “first speech content information” of the Claim.  The various “… extractors 840, 850, 860” teach “extracting … a first acoustic feature from the source speech.”   “[0067] The received input speech of the first language may be delivered to the speech recognizer 810, the articulatory feature extractor 840, the emotion feature extractor 850, and the prosody feature extractor 860.….”  “[0068] The articulatory feature extractor 840 may extract a feature vector from the input speech of the first language and generate the articulatory feature of a speaker who utters the input speech of the first language….”  “[0070] The prosody feature extractor 860 may extract a prosody feature from the input speech of the first language….”  Emotion can be extracted from sound or content, and in this case it is extracted from sound.  So any of the 3 extractors shown can generate the “first acoustic feature” of the Claim. “[0075] In FIG. 8, it is shown that all of the articulatory feature, the emotion feature, and the prosody feature are extracted from the input speech of the first language to synthesize a speech. However, the present invention is not limited thereto….”]
extracting a second acoustic feature from the reference speech; [Kim, Figure 8, teaches that at least one of the extracted features may be extracted from the speech of a second speaker / “reference speech.”  “[0075] … In another embodiment, at least one of the articulatory feature, the emotion feature, and the prosody feature may be extracted from an input speech of another speaker. For example, the emotion feature and the prosody feature may be extracted from the input speech of the first language, and the articulatory feature may be extracted from another input speech (e.g., a celebrity's speech) to synthesize a speech. In this case, the emotion and prosody of the speaker who utters the input speech of the first language are reflected in the synthesized speech, but the voice of a speaker (e.g., a celebrity) who utters another input speech may be reflected.”]
acquiring a reconstructed third acoustic feature by inputting the first speech content information, the first acoustic feature, and the second acoustic feature into a pre-trained voice conversion model, in which the pre-trained voice conversion model is acquired by training based on speeches of a third user; and [Kim, Figure 8, uses a “speech synthesizer 830” which includes a “pre-trained voice conversion model” that is trained on speech of others than the first or second speaker/user.  The “speech synthesizer 830” receives the “first speech content information” (which has been translated to a second language) as well as the output of the “… extractors 840, 850, 860” and in the example of [0075] two of these “acoustic” features are from the first speaker/user and one (the “articulatory feature,” e.g.) is from a particular second user/speaker such as a celebrity voice.  Not trained on the voice of the first (or second) speaker:  “[0074] When the speaker's feature is extracted from the input speech of the first language and used to synthesize a translated speech, the voice of the corresponding speaker may be simulated to generate the output speech of the second language with a similar voice even when the voice of the corresponding speaker is not pre-learned….”  See also Figure 10, which show a speech synthesizer and includes “encoder 1010” which is stated to be “pre-trained.”  “[0093] The multilingual text-to-speech synthesizer 1000 based on an artificial neural network is trained using a large database existing as a pair of a multilingual learning text and a corresponding learning speech signal….”  “[0079] … The encoder 1010 may use a pre-trained machine learning model in order to convert the divided input text into the text embedding vector….”  See also “[0082] … The decoder 1020 may use a pre-trained machine learning model in order to convert the speaker ID into the speaker embedding vector s….”]
synthesizing a target speech based on the third acoustic feature. [Kim, Figure 8, “speech synthesizer 830” generating “output speech …”]

Claim 9 is a device claim with limitations corresponding to the limitations of Claim 1 and is rejected under similar rationale.  The hardware components are shown in Kim, Figure 15.
Claim 17 is a computer program product system claim with limitations corresponding to the limitations of method Claim 1 and is rejected under similar rationale. The hardware components are shown in Kim, Figure 15.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2, 10, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Sun (U.S. 20180012613).
Regarding Claim 2, Kim teaches that its “speech synthesizer 830” is multi-lingual: “[0078] FIG. 10 is a diagram showing a configuration of a multilingual text-to-speech synthesizer 1000 according to an embodiment of the present disclosure….”  Kim does not expressly teach that the “speech recognizer 810” is also multi-lingual.  However, the translation context suggests that the inputs can be in different languages and thus the ASR is also multi-lingual.  Kim does not mention posterior grams.
Sun teaches:
2. The method as claimed in claim 1, wherein extracting the first speech content information from the source speech comprises: 
acquiring a phonetic posterior gram by inputting the source speech into a pre-trained multilingual automatic speech recognition model; and [Sun is directed to a phonetic posteriorgram (PPG) based method that converts a source speech to a target speech and, in the process, uses a speaker-independent automatic speech recognition (SI-ASR) system, where the source and target languages could be different, thus requiring ASR in different languages. See [0004].  Sun teaches that converting the source speech into the converted speech and generating the mapping between the PPG and the acoustic features of the target speech may include a trained DBLSTM.  See [0005].  See Figure 3 for training of the SI-ASR in training stages 1 and 2 (302, 304) and then using the “Trained SI-ASR model” in the “conversion stage 306.”]
using the phonetic posterior gram as the first speech content information. [Sun teaches applying the SI-ASR to the source speech as a method of equalizing different speakers.  [0004].  Figure 3 shows the use of the “Trained SI-ASR model” to generate the PPGs as its output.  ASR yields content.  The content output of this ASR is the PPG.]
Kim and Sun pertain to speech to speech transformation and it would have been obvious to combine the speech recognizer of Sun, which is described in more detail and can be applied to any language, with the system of Kim, which does not discuss its speech recognizer in detail. This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
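For illustration only, and not as a characterization of Sun's implementation, a phonetic posteriorgram can be understood as a per-frame probability distribution over phonetic classes. The sketch below assumes a hypothetical ASR acoustic model that emits raw class scores for each frame; the function name `posteriorgram` and the example scores are assumptions and do not appear in any cited reference.

```python
import math

def posteriorgram(frame_logits):
    """Convert per-frame scores over phonetic classes into posteriors.

    frame_logits: list of frames, each a list of raw class scores
    (hypothetical output of an ASR acoustic model).
    Returns a list of frames, each a probability distribution.
    """
    ppg = []
    for logits in frame_logits:
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        ppg.append([e / total for e in exps])
    return ppg

# Two frames scored over three hypothetical phonetic classes.
ppg = posteriorgram([[2.0, 0.5, 0.1], [0.0, 0.0, 3.0]])
```

Because each row depends only on which phonetic class is likely, and not on who is speaking, such a representation can serve as speaker-independent content information.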

Claim 10 is a device claim with limitations corresponding to the limitations of Claim 2 and is rejected under similar rationale.
Claim 18 is a computer program product system claim with limitations corresponding to the limitations of Claim 2 and is rejected under similar rationale.

Claims 3, 8, 11, 16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Rafii (U.S. 10997970).
Regarding Claim 3, Kim teaches:
3. The method as claimed in claim 1, wherein the first acoustic feature, the second acoustic feature, and the third acoustic feature are Mel features. [Kim teaches that the “speech synthesizer 830” generates the output speech from mel-spectrograms and that the output speech “may have a mel-spectrogram form.”  See [0088]-[0090].  Thus, the “third acoustic features” are expressly shown as Mel features.  Further, Figure 10 shows receiving the “speaker ID” and the “speaker embedding vector S” and generating the Mel-Spectrogram in the r frames, which are themselves inputs as shown by the arrows going back in at 1024 and 1026. Thus, the “first and second acoustic features” are impliedly shown as Mel features.   “[0085] … For example, r frames acquired at the initial time-step 1022 may be inputs for a subsequent time-step 1024. Also, the r frames output at the time-step 1024 may be inputs for a subsequent time-step 1026.”  “[0088] … The output speech may have a mel-spectrogram form and may include r frames.”]
Kim is not express regarding the extracted acoustic features (first and second) being Mel features.
Rafii expressly teaches:
wherein the first acoustic feature, the second acoustic feature, and the third acoustic feature are Mel features. [Rafii, Figure 4C, showing at 430 that mel features of the audio input are used.  Figures 3A, 3B, 3C, and 3D show the extracted and generated Mel-spectrograms.  “A preferred representation that better captures the human auditory system is the Mel-spectrogram ….”  Col. 13, line 56 to Col. 14, line 40 describes the Mel scale and why it is used.]
Kim and Rafii pertain to speech to speech transformation and making an input speech sound differently at output and it would have been obvious to combine the features of the two and in particular the use of Mel features as extracted acoustic features as explained by Rafii in columns 13 and 14 of this reference. This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
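As background on what a Mel feature is, the sketch below shows one commonly used conversion between frequency in Hz and the Mel scale. This particular formula is an assumption chosen for illustration; neither Kim nor Rafii is cited here as using this exact form, and the function names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Map a frequency in Hz to mels using the common 2595*log10(1 + f/700) form."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

Under this convention 1000 Hz maps to approximately 1000 mels, and equal steps in mels correspond to progressively wider steps in Hz at higher frequencies, which is why the scale is said to better reflect human hearing.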

Regarding Claim 8, this Claim is directed to steps of training using voices of new speakers/users.
Kim teaches training its TTS machine learning model over time with additional sets of data.  See Figure 14 for the change in error rate as retraining is conducted.  See [0079] for retraining of a pre-trained encoder 1010 with new input text as well as [0082] for retraining the pre-trained decoder 1020 which takes in the speaker embedding vector s that has the acoustic features.
Rafii teaches:
8. The method as claimed in claim 1, further comprising: [Rafii, Figure 4B shows a flowchart of a training process for the speech synthesizer.]
acquiring a first speech and a second speech of the third user; [Rafii, Figure 4B, “select next audio segment 310.”]
extracting second speech content information and a fourth acoustic feature from the first speech; [Rafii, Figure 4B, 330: “Transcription_feature = encode2 (input_mel)” and 335: “Acoustic_feature= encode1(input_mel).”  Figure 4C 450 and 440.]
extracting a fifth acoustic feature from the second speech; [Rafii, Figure 4A, 230: “Annotate Phrase for Different Enunciation.” Rafii does not change the audio according to voice of another user and instead of the voice of another user, uses the annotation process of Figure 4A as a guide for modifying the input audio.  This step is taught by Kim, Figure 8, as applied to Claim 1.]
acquiring a reconstructed sixth acoustic feature by inputting the second speech content information, the fourth acoustic feature, and the fifth acoustic feature into a voice conversion model to be trained; [This step is also taught by Kim, Figure 8, as applied to Claim 1.]
adjusting model parameters in the voice conversion model to be trained based on a difference between the sixth acoustic feature and the fourth acoustic feature, and returning to the acquiring the first speech and the second speech of the third user until the difference between the sixth acoustic feature and the fourth acoustic feature satisfies a preset training end condition; and [This is the essence and definition of training: there is an input to the system and a particular output is desired.  When the actual output and the desired output become the same (or very close), training is complete.  Before that, in Rafii, Figure 4B, 345: “Error= distance (Output_mel, Label_mel)” is still large and the answer at “stopping criteria met? 355” is No, so the system keeps iterating.]
determining the voice conversion model to be trained after a last adjusting of model parameters as the pre-trained voice conversion model. [Rafii, Figure 4B, 355: “Learn parameters by running optimizer” after 350: “Back-propagate the derivative of error with respect to model parameters 350” which conveys the change in the error.]
Kim and Rafii pertain to speech to speech transformation and both begin with a trained model and keep improving it by retraining, and it would have been obvious to combine the more expressly described training steps of Rafii with the method of Kim, which teaches re-training but does not discuss the details. This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
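The iterate-until-converged structure discussed for Claim 8 can be sketched with a deliberately simplified model. The example below is a toy with a single scalar parameter and is not the training procedure of Kim, Rafii, or the Application; the target value stands in for the “fourth acoustic feature,” the model output for the reconstructed “sixth acoustic feature,” and the names `train`, `lr`, `tol`, and `max_steps` are assumptions.

```python
def train(pairs, lr=0.05, tol=1e-6, max_steps=10_000):
    """Fit a one-parameter toy model y = w * x by gradient descent.

    pairs: list of (input_value, target_value).
    Stops when the mean squared difference between output and target
    falls below tol (the preset end condition) or max_steps is reached.
    """
    w = 0.0
    for step in range(max_steps):
        # Difference between the reconstructed output and the target.
        error = sum((w * x - y) ** 2 for x, y in pairs) / len(pairs)
        if error < tol:
            break  # end condition met; keep the last adjusted parameter
        # Derivative of the error with respect to the model parameter.
        grad = sum(2 * (w * x - y) * x for x, y in pairs) / len(pairs)
        w -= lr * grad  # adjust the parameter and iterate again
    return w, error, step

w, error, steps = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

The parameter returned after the last adjustment plays the role of the “pre-trained” model in the final step of the Claim.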

Claim 11 is a device claim with limitations corresponding to the limitations of Claim 3 and is rejected under similar rationale.
Claim 16 is a device claim with limitations corresponding to the limitations of Claim 8 and is rejected under similar rationale.
Claim 20 is a computer program product system claim with limitations corresponding to the limitations of Claim 8 and is rejected under similar rationale.

Claims 4-5, 7, 12-13, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Chen (U.S. 20140088968).
Regarding Claim 4, Kim teaches:
4. The method as claimed in claim 1, wherein the voice conversion model comprises a hidden-variable network, a timbre network, and a reconstruction network, and [Kim Figures 8 and 10 show the “speech synthesizer 830” which is shown in Figure 10 as a Deep Neural Network with layers of hidden states and thus teach the “hidden-variable network” of the Claim and a “decoder 1020” and “first and second vocoder 1030, 1040” teach the “reconstruction network” of the Claim because they reconstruct the “frames” and generate the “speech output.”  See Figure 10, “h: Encoder Hidden States” and “[0080] The encoder 1010 may input the text embedding vector to a deep neural network (DNN) module including a fully-connected layer. The DNN may be a general feedforward layer or linear layer.” “[0081] The encoder 1010 may input an output of the DNN to a module including at least one of a convolutional neural network (CNN) and a recurrent neural network (RNN)….. The module including at least one of the CNN and the RNN may output hidden states h of the encoder 1010.” See [0080]-[0083].]
acquiring the reconstructed third acoustic feature by inputting the first speech content information, the first acoustic feature, and the second acoustic feature into the pre-trained voice conversion model comprises: 
acquiring a fundamental frequency and an energy parameter by inputting the first acoustic feature into the hidden-variable network; [Kim teaches extracting prosody information at 860 in Figure 8 and prosody extraction includes pitch/ “fundamental frequency” and is input to the “speech synthesizer 830” which includes the “hidden variable network” shown in Figure 10.  “[0011] …, the prosody feature includes at least one of information on utterance speed, information on accentuation, information on voice pitch, and information on pause duration.”  The articulatory feature (840) also may include: “[0057] … For example, the articulatory feature of the speaker may include the speaker's voice tone or voice pitch….”]
acquiring a timbre parameter by inputting the second acoustic feature into the timbre network; and 
acquiring the third acoustic feature by inputting the first speech content information, the fundamental frequency and energy parameter, and the timbre parameter into the reconstruction network. [Kim teaches in Figure 8 that the text/content and the extracted features (840, 850, 860) are input into the “speech synthesizer 830” for “reconstruction.”]

Kim does not mention acquiring an “energy parameter” while energy is generally implicit.
Kim does not teach “acquiring a timbre parameter by inputting the second acoustic feature into the timbre network.”  
Chen is directed to “… speech recognition using timbre vectors” and teaches:
…  wherein the voice conversion model comprises a hidden-variable network, a timbre network, and a reconstruction network, and [Chen in Figures 1 and 7 shows the output as speech and teaches the speech synthesis/voice conversion model of the Claim.  Chen teaches that acoustic processing of an input speech typically includes the use of Hidden Markov Models, which teach the “hidden-variable network” of the Claim, and a series of steps that constitute the “synthesis 721” in Figure 7 or “voice manipulation 116-127” in Figure 1 and teach the “reconstruction network” of the Claim.]
acquiring the reconstructed third acoustic feature by inputting the first speech content information, the first acoustic feature, and the second acoustic feature into the pre-trained voice conversion model comprises: [Chen, Figure 7, shows speech synthesis from an “input text 722” / “speech content information” and acoustic features of the “source speaker 702” / “second acoustic feature” as well as “phoneme sequence and prosody information 725” / “first acoustic feature” in the steps of synthesis starting from 726 to the output at 736.]
acquiring a fundamental frequency and an energy parameter by inputting the first acoustic feature into the hidden-variable network; [Chen, Figure 7, right hand side shows the input of the text/content and the acoustic features that are supposed to go with that content.  “Base pitch” / “fundamental frequency” is expressly shown as part of “speaker identity 723.”  The “various prosodic variations (calm, emotional, up to shouting)” in [0009]  indicate “energy” and “prosody feature” is another input on the right hand side of the drawing at 723.  “[0046] In the synthesis unit 721, the input text 722 together with synthesis parameters 723, are fed into the frontend 724. Detailed instructions about the phonemes, intensity and pitch values 725, for generating the desired speech are generated, then input to a processing unit 726….”  See also “[0042] Intensity profiling. Because the intensity parameter I is a property of a frame, it can be changed to produce any stress pattern required by prosody input.”]
acquiring a timbre parameter by inputting the second acoustic feature into the timbre network; and [Chen, the “timbre vectors 712” are obtained from the voice of the “source speaker” and thus the acoustic features of the “source speaker 702” teach the “second acoustic feature” of the Claim.]
acquiring the third acoustic feature by inputting the first speech content information, the fundamental frequency and energy parameter, and the timbre parameter into the reconstruction network. [Chen, Figure 7, the content/input text, fundamental frequency/base pitch on the right hand side and timbre vectors 712 from the left hand side are both input to the “processing unit 726” which generates the “output speech 736” and along the way obtains the “third acoustic feature” which could be the “amplitude spectrum 729” which leads to the “acoustic waves 733.”]
Kim and Chen pertain to speech synthesis and speech transformation and both apply the characteristics of one voice to change the sound of another voice and both use acoustic features of input sounds and it would have been obvious to bring in timbre (and energy) which are expressly stated in Chen as acoustic features that are probably used by Kim as well or may be used just the same. This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
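The data flow recited in Claim 4 can be summarized in a short wiring sketch. The code below shows only which input goes to which network; the three networks are passed in as callables, and the stand-in functions merely record what they received. All names are hypothetical and nothing here reflects the internal operation of Kim, Chen, or the Application.

```python
def convert(content, first_feature, second_feature,
            hidden_variable_net, timbre_net, reconstruction_net):
    """Route the three inputs through the three networks of the Claim."""
    # Source-speaker feature -> fundamental frequency and energy.
    f0, energy = hidden_variable_net(first_feature)
    # Reference-speaker feature -> timbre.
    timbre = timbre_net(second_feature)
    # Content + f0/energy + timbre -> reconstructed third feature.
    return reconstruction_net(content, f0, energy, timbre)

# Stand-in callables that only label their inputs (not real networks).
def fake_hidden(feature):
    return ("f0 of " + feature, "energy of " + feature)

def fake_timbre(feature):
    return "timbre of " + feature

def fake_reconstruct(content, f0, energy, timbre):
    return {"content": content, "f0": f0, "energy": energy, "timbre": timbre}

third = convert("PPG", "source mel", "reference mel",
                fake_hidden, fake_timbre, fake_reconstruct)
```

The sketch makes the division explicit: pitch and energy come from the source speech, timbre comes from the reference speech, and only the reconstruction step sees all of them together.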

Regarding Claim 5, Kim in Figure 8 teaches the extraction of prosody and articulatory feature and emotion which include both fundamental frequency/pitch and amplitude (energy level).
Additionally, Chen teaches and therefore suggests:
5. The method as claimed in claim 4, wherein acquiring the fundamental frequency and energy parameter by inputting the first acoustic feature into the hidden-variable network comprises: [Chen teaches the steps of this Claim as applied to the “second acoustic feature” whereas the pitch and energy of the “first acoustic feature” are input directly (not extracted) in Chen.  Chen teaches that these parameters can be extracted from an input voice and these teachings suggest that the input parameters on the right-hand side (first acoustic feature) can also be extracted from the voice of the speaker with “speaker identity 723.”]
inputting the first acoustic feature into the hidden-variable network, such that the hidden-variable network compresses the first acoustic feature on a frame scale, and extracts the fundamental frequency and energy parameter from the compressed first acoustic feature. [Chen, Figure 7, “processing unit 705” generating the “frames 708.”  And application of “Fourier Analysis 709” to generate the “Amplitude Spectrum” which includes both the pitch (spectral content) and energy (amplitude).]
Rationale for combination as provided for Claim 4.
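As a simplified illustration of frame-scale processing and of energy as a per-frame quantity, the sketch below splits a sampled signal into frames and computes the mean squared amplitude of each frame. This is a generic signal-processing example under assumed parameters (`frame_len`, `hop`), and is neither Chen's Fourier analysis nor the claimed hidden-variable network.

```python
def frame_energies(samples, frame_len, hop):
    """Return the mean squared amplitude of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies

# A silent frame followed by a full-amplitude frame.
energies = frame_energies([0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0],
                          frame_len=4, hop=4)
```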

Regarding Claim 7, Kim teaches:
7. The method as claimed in claim 4, wherein acquiring the third acoustic feature by inputting the first speech content information, the fundamental frequency and energy parameter, and the timbre parameter into the reconstruction network comprises: 
inputting the first speech content information, the fundamental frequency and energy parameter, and the timbre parameter into the reconstruction network, such that the reconstruction network performs an acoustic feature reconstruction on the first speech content information, the fundamental frequency and energy parameter, and the timbre parameter by a deep recurrent neural network to acquire the third acoustic feature. [Kim Figure 10 teaches the use of an RNN encoder 1010 and RNN decoder 1020 for speech synthesis.  This Claim refers to the last phase of speech synthesis (“acquiring the third acoustic feature”).  Figure 8 shows the input of the content and the acoustic features of the voice of the speaker and another speaker (celebrity) to the “speech synthesizer 830,” and Figure 10 shows that the “speech synthesizer 830” is implemented as an RNN.]
The input of timbre as one of the types of acoustic feature used for modifying the output sound is taught by Chen as applied to Claim 4.
Rationale for combination as provided for Claim 4.

Claim 12 is a device claim with limitations corresponding to the limitations of Claim 4 and is rejected under similar rationale.
Claim 13 is a device claim with limitations corresponding to the limitations of Claim 5 and is rejected under similar rationale.
Claim 15 is a device claim with limitations corresponding to the limitations of Claim 7 and is rejected under similar rationale.
Claim 19 is a computer program product system claim with limitations corresponding to the limitations of Claim 4 and is rejected under similar rationale.

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Chen and further in view of Rafii.
Regarding Claim 6, Kim Figure 10 teaches the use of an RNN encoder 1010 and RNN decoder 1020 for speech synthesis.  Chen teaches extracting the timbre acoustic feature but does not teach the use of neural networks.
Rafii teaches:
6. The method as claimed in claim 4, wherein acquiring the timbre parameter by inputting the second acoustic feature into the timbre network comprises: 
inputting the second acoustic feature into the timbre network, such that the timbre network abstracts the second acoustic feature by a deep recurrent neural network and a variational auto encoder to acquire the timbre parameter. [Rafii is directed to modifying an input speech using a neural network such as an RNN and extracts MFCCs of the input sound and one of the types of acoustic features that it extracts is timbre.  See Figure 4C and col. 18, lines 15-45.  Rafii uses a GAN which is similar to a variational auto-encoder. “…  In another embodiment of the present invention, the transformation model inputs the entire vocal qualities of speech (namely, pitch, rhythm and timbre) in the form of a time series. Next the speech is encoded to a set of latent states, and then decoded from these latent states directly to the target output speech. Such use of time domain speech transformation are further depicted and described with reference to FIG. 1A, FIG. 1B, and FIG. 1C.” Col. 6, lines 31-51.  “The architecture of a model composed of many serial layers is called a deep model. As contrasted to shallow models that typically have a single inner layer, deep models have more learning capacity for complex tasks such as the design goals provided by embodiments of the present invention. Accordingly, FIG. 1B preferably is implemented as a deep model.”  Col. 9, lines 34-40.  “An important class of hierarchical CNN is called autoencoders; an autoencoder essentially copies its input to its output….”   Col. 9, lines 55-60.   “…. The role of encoder 70-1 is to primarily capture the essence of acoustic features (namely, pitch, rhythm and timbre) of audio 20 (e.g., essentially the voice properties of the human speaker-source of audio in module block 20)….”  Col. 10, lines 16-21.  “While CNN models behave like a directed graph, and function as feedforward machines, another important category of neural network models called Recurrent Neural Networks (RNN) have feedback nodes. RNN are useful models for processing text and speech input because they maintain a short-lived temporal context of their input. ….”  Col. 12, lines 19-41.]
Kim/Chen and Rafii pertain to speech transformation and apply the characteristics of one voice to change the sound of another voice, and both use extracted acoustic features of input sounds. It would have been obvious to use the RNN of Rafii, which is used for feature extraction, to extract the timbre feature of the audio input in place of the method of Chen, which uses some earlier means. This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Claim 14 is a device claim with limitations corresponding to the limitations of Claim 6 and is rejected under similar rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Hwang (U.S. 11410679) (U.S. 20200176017)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARIBA SIRJANI whose telephone number is (571)270-1499. The examiner can normally be reached on 9 to 5, M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Desir can be reached on 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Fariba Sirjani/
Primary Examiner, Art Unit 2659