DETAILED ACTION
Introduction
1.	This office action is in response to Applicant’s submission filed on 10/31/2022.   Claims 1-12 and 14-23 are pending in the application and have been examined.  Claim 13 is canceled and independent Claims 1, 7, 14, and 20 are amended.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
3.	Applicant’s arguments with respect to the rejections of Claims 1-12 and 14-23 under 35 USC 103 have been considered but are moot in view of the new ground of rejection based on US Pat. App. Pub. No. 20190156194 (Burr).

Claim Objections
4.	Claims 1, 7, 14, and 20 are objected to because of the following informalities: the term “the first domain” does not have antecedent basis.  For the purposes of examination, it will be assumed this is meant to be “the first domain audio data.”  Appropriate correction is required.
	Claim 7 is objected to as reciting “toa” instead of “to a.”  Appropriate correction is required.

Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


6.	Claims 1-12 and 14-19 are rejected under 35 U.S.C. 103 as being unpatentable over US Pat. App. Pub. No. 20220108689 (Tripathi et al., hereinafter “Tri”) in view of US Pat. App. Pub. No. 20210280170 (Chen et al., hereinafter “Chen”) and US Pat. App. Pub. No. 20190156194 (Burr).
With regard to Claim 1, Tri describes:
“A computer-implemented method for customizing a recurrent neural network transducer (RNN-T), comprising:
feeding the synthesized first domain audio data into a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, wherein the encoder is updated using the synthesized first domain audio data [[and the first domain text data]]; (paragraph 39 describes that audio encoder 300 receives and is updated by acoustic frames 110; paragraph 25 describes that encoder 300 can be an RNN-T)
feeding the synthesized second domain audio data into the updated encoder of the recurrent neural network transducer (RNN-T),” (paragraph 39 describes that audio encoder 300 receives and is updated by acoustic frames 110. There are multiple acoustic frames 110 that are fed into the audio encoder 300)
Tri does not explicitly describe:
“synthesizing first domain audio data from first domain text data;
synthesizing second domain audio data from second domain text data;
wherein the prediction network is updated using the synthesized second domain audio data and the second domain text data; and
resetting the updated encoder to a pre-customized state by restoring weights on the updated encoder to the initial condition including a state trained on the first domain.”
However, Chen describes:
“synthesizing first domain audio data from first domain text data; (paragraph 29 describes converting first text data to first audio data (acoustic frames 110))
synthesizing second domain audio data from second domain text data; and (paragraph 29 describes converting second text data to second audio data, as any text data input can be converted to audio data)
wherein the prediction network is updated using the synthesized second domain audio data and the second domain text data.” (paragraph 29 describes updating a predictive network based on the input acoustic frames 110)
As the audio data in Chen is based on the text data, Tri in view of Chen also describes updating an encoder and predictor using audio and text data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the text to audio data inputs as described by Chen into the system of Tri to perform natural language understanding processing on the text, as described at paragraph 29 of Chen.
Tri in view of Chen does not explicitly describe “resetting the updated encoder to a pre-customized state by restoring weights on the updated encoder to the initial condition including a state trained on the first domain.”
However, paragraph 28 of Burr describes an encoder that is restored to the initial weights.  Paragraph 34 describes that the initial condition corresponds to a point when it receives its first shared column (cited as “the initial condition including a state trained on the first domain.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the encoder reset as described by Burr into the system of Tri in view of Chen to suppress conductance-update asymmetries, as described at paragraph 27 of Burr.
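For illustration only, the sequence recited in the final limitation of Claim 1 (an encoder having an initial condition is updated and is then reset by restoring its weights) can be sketched as follows. This is a minimal sketch under stated assumptions: the class ToyEncoder, its update rule, and the numeric values are hypothetical and are not drawn from Tri, Chen, Burr, or the application.

```python
import copy


class ToyEncoder:
    """Hypothetical stand-in for a trained encoder; not from any cited reference."""

    def __init__(self, weights):
        self.weights = list(weights)

    def update(self, grads, lr=0.1):
        # Plain gradient step, used here only to make the weights change.
        self.weights = [w - lr * g for w, g in zip(self.weights, grads)]


encoder = ToyEncoder([0.5, -1.0, 2.0])

# Snapshot the initial condition before any customization.
initial_condition = copy.deepcopy(encoder.weights)

# Customization step: the weights move away from the initial condition.
encoder.update([1.0, 1.0, 1.0])
customized = list(encoder.weights)

# Reset: restore the weights to the stored initial condition.
encoder.weights = copy.deepcopy(initial_condition)
```

The snapshot is taken with a deep copy so that later updates to the encoder cannot alter the stored initial condition.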
With regard to Claim 2, Tri describes “the recurrent neural network transducer (RNN-T) includes a joiner that combines an output of the encoder with an output of the predictor.”  Figure 2 of Tri shows joiner 230, which combines the output of encoder 300 with the output of predictor 220.
With regard to Claim 3, Tri describes “the joining produces an output as an induced local field, zt,u, that is fed into a softmax function.”  Figure 2 of Tri shows joiner 230 which outputs field Ju,t which is fed into softmax 240.
With regard to Claim 4, Tri describes “the softmax function generates a posterior probability, P(y|t,u).”  Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y).
With regard to Claim 5, Tri describes “the posterior probability generator P(y|t,u) generates an output that is an output sequence y = (y1, y2, ... yu-1, yu) that is a length U output sequence based on an input feature sequence, x, that is a time-ordered sequence of acoustic features represented as vectors.” Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y), which includes sequence y based on input feature sequence x (described at paragraph 39).
With regard to Claim 6, Tri describes “the input feature sequence, x, is derived from the synthesized first domain audio data.” Paragraph 39 describes that input data x is based on audio frames 110.
With regard to Claim 7, Tri describes:
“A system for customizing a recurrent neural network transducer (RNN-T), comprising:
one or more processor devices; (paragraph 60, processor 610)
a memory in communication with at least one of the one or more processor devices; (paragraph 60, memory 620) and
a display screen; (paragraph 60, display 680)
wherein the memory includes:
an encoder configured to receive synthesized first domain audio data [[generated from first domain text data]], (paragraph 39 describes that audio encoder 300 receives acoustic frames 110.)
wherein the encoder is a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, (paragraph 25 describes that encoder 300 can be an RNN-T)
wherein the encoder is configured to be updated from the initial condition using the synthesized first domain audio data and the first domain text data, (paragraph 39 describes that audio encoder 300 is updated by acoustic frames 110.)
wherein the encoder is further configured to receive synthesized second domain audio data [[generated from second domain text data]]; (paragraph 39 describes that audio encoder 300 receives and is updated by acoustic frames 110. There are multiple acoustic frames 110 that are fed into the audio encoder 300) and
an output sequence generator that produces output symbol sequence, y, based on an input feature sequence, x, that is a time-ordered sequence of acoustic features represented as vectors.” (Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y), which includes sequence y based on input feature sequence x (described at paragraph 39).)
Tri does not explicitly describe “the encoder being configured to be reset to a pre-customized state by restoring weights on the encoder to the initial condition including a state trained on the first domain,” “synthesized first domain audio data generated from first domain text data,” and “synthesized second domain audio data generated from second domain text data.”
However, Chen describes “synthesized first domain audio data generated from first domain text data” and “synthesized second domain audio data generated from second domain text data.” (paragraph 29 describes converting text data to audio data, and any number of text data inputs can be converted to audio data)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the text to audio data inputs as described by Chen into the system of Tri to perform natural language understanding processing on the text, as described at paragraph 29 of Chen.
Tri in view of Chen does not explicitly describe “the encoder being configured to be reset to a pre-customized state by restoring weights on the encoder to the initial condition including a state trained on the first domain.”
However, paragraph 28 of Burr describes an encoder that is restored to the initial weights.  Paragraph 34 describes that the initial condition corresponds to a point when it receives its first shared column (cited as “the initial condition including a state trained on the first domain.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the encoder reset as described by Burr into the system of Tri in view of Chen to suppress conductance-update asymmetries, as described at paragraph 27 of Burr.
With regard to Claim 8, Tri describes “the memory further includes a joiner that is configured to combine an output of the trained encoder with an output of the predictor.”  Figure 2 of Tri shows joiner 230, which combines the output of encoder 300 with the output of predictor 220.
With regard to Claim 9, Tri describes “the joiner produces an induced local field, zt,u, as the output.” Figure 2 of Tri shows joiner 230 which outputs field Ju,t which is fed into softmax 240.
With regard to Claim 10, Tri describes “the memory further includes a softmax function that is configured to receive induced local field, zt,u, and generate an output.” Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y).
With regard to Claim 11, Tri describes “the output sequence y = (y1, y2,... yu-1, yu) is a length U output sequence based on an input feature sequence, x, that is a time-ordered sequence of acoustic features represented as vectors.” Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y), which includes sequence y based on input feature sequence x (described at paragraph 39).
With regard to Claim 12, Tri does not explicitly describe “the memory further includes a synthesizer that is configured to synthesize first domain audio data from first domain text data, and to synthesize second domain audio data from second domain text data.”
However, Chen describes:
“the memory further includes a synthesizer that is configured to synthesize first domain audio data from first domain text data (paragraph 29 describes converting first text data to first audio data (acoustic frames 110)), and to synthesize second domain audio data from second domain text data.” (paragraph 29 describes converting second text data to second audio data, as any text data input can be converted to audio data)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the text to audio data inputs as described by Chen into the system of Tri to perform natural language understanding processing on the text, as described at paragraph 29 of Chen.
With respect to Claims 14-19, computer program product Claim 14 and method Claim 1 are related as a computer program product programmed to perform the same method, with each claimed product function corresponding to a claimed method step. Further, Tri describes a computer program product (paragraph 60, memory 620) and a computer (paragraph 60, processor 610).  Accordingly, Claims 14-19 are rejected under the same rationale as applied above with respect to Claims 1-6.

7.	Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Tri in view of Chen and Burr and further in view of US Pat. App. Pub. No. 20210225369 (Hu et al., hereinafter “Hu”). 
With regard to Claim 20, Tri describes:
“A computer-implemented method for customizing a recurrent neural network transducer (RNN-T), comprising:
feeding the synthesized first domain audio data into a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, wherein the encoder is updated using the synthesized first domain audio data [[and the first domain text data]]; (paragraph 39 describes that audio encoder 300 receives and is updated by acoustic frames 110; paragraph 25 describes that encoder 300 can be an RNN-T)
feeding the [[acoustic embedding,]] at, to a joiner; (Figure 2 of Tri shows joiner 230, which combines the output of encoder 300 with the output of predictor 220.)
feeding the synthesized second domain audio data into the updated encoder; (paragraph 39 describes that audio encoder 300 receives and is updated by acoustic frames 110. There are multiple acoustic frames 110 that are fed into the audio encoder 300)
feeding the output sequence from the joiner into a predictor of the recurrent neural network transducer (RNN-T).” (paragraph 40 describes that output sequence y of softmax 240 (based on the output of joiner 230) is an input into predictor 220.)
Tri does not explicitly describe:
“synthesizing first domain audio data from first domain text data;
encodes the synthesized first domain audio data into acoustic embedding, at, wherein the acoustic embedding, at, compresses the synthesized first domain audio data into a smaller feature space;
wherein the updated encoder encodes the synthesized second domain audio data into acoustic embedding, bt, wherein the acoustic embedding, bt, compresses the synthesized second domain audio data into a smaller feature space;
synthesizing second domain audio data from second domain text data;
wherein the predictor is updated using the output sequence from the synthesized second domain audio data and the second domain text data;
resetting the updated encoder to a pre-customized state by restoring weights on the updated encoder to the initial condition including a state trained on the first domain.”
However, Chen describes:
“synthesizing first domain audio data from first domain text data; (paragraph 29 describes converting first text data to first audio data (acoustic frames 110))
synthesizing second domain audio data from second domain text data; and (paragraph 29 describes converting second text data to second audio data, as any text data input can be converted to audio data)
wherein the predictor is updated using the output sequence from the synthesized second domain audio data and the second domain text data.” (paragraph 29 describes updating a predictive network based on the input acoustic frames 110)
As the audio data in Chen is based on the text data, Tri in view of Chen also describes updating an encoder and predictor using audio and text data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the text to audio data inputs as described by Chen into the system of Tri to perform natural language understanding processing on the text, as described at paragraph 29 of Chen.
Tri in view of Chen does not explicitly describe:
“encodes the synthesized first domain audio data into acoustic embedding, at, wherein the acoustic embedding, at, compresses the synthesized first domain audio data into a smaller feature space;
wherein the updated encoder encodes the synthesized second domain audio data into acoustic embedding, bt, wherein the acoustic embedding, bt, compresses the synthesized second domain audio data into a smaller feature space; and
resetting the updated encoder to a pre-customized state by restoring weights on the updated encoder to the initial condition including a state trained on the first domain.”
However, paragraph 28 of Burr describes an encoder that is restored to the initial weights.  Paragraph 34 describes that the initial condition corresponds to a point when it receives its first shared column (cited as “the initial condition including a state trained on the first domain.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the encoder reset as described by Burr into the system of Tri in view of Chen to suppress conductance-update asymmetries, as described at paragraph 27 of Burr.
Tri in view of Chen and Burr does not explicitly describe:
“encodes the synthesized first domain audio data into acoustic embedding, at, wherein the acoustic embedding, at, compresses the synthesized first domain audio data into a smaller feature space;
wherein the updated encoder encodes the synthesized second domain audio data into acoustic embedding, bt, wherein the acoustic embedding, bt, compresses the synthesized second domain audio data into a smaller feature space.”
However, paragraph 34 of Hu describes using acoustic embeddings to compress audio data into a smaller space.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the acoustic embeddings as described by Hu into the system of Tri in view of Chen and Burr to reduce the size of the audio data, as described at paragraph 34 of Hu.
8.	Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Tri in view of Chen, Burr, and Hu and further in view of US Pat. App. Pub. No. 20200402501 (Prabhavalkar et al., hereinafter “Pra”). 
With regard to Claim 21, Tri in view of Chen, Burr, and Hu does not explicitly describe “the joiner combines the acoustic embedding, at, with an embedding from the predictor through a weighted summation.”  However, paragraph 63 of Pra describes the output of a joiner is the weighted summation of embeddings htg.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the weighted summation of embeddings as described by Pra into the system of Tri in view of Chen, Burr, and Hu to enable the injection of words for which the pronunciation is difficult to predict from the spelling, as described at paragraph 64 of Pra.
With regard to Claim 22, Tri describes “the joiner produces an output as an induced local field, zt,u, that is fed into a softmax function.” Figure 2 of Tri shows joiner 230 which outputs field Ju,t which is fed into softmax 240.
With regard to Claim 23, Tri describes “the softmax function generates a posterior probability, P(y|t,u).”  Paragraph 41 describes that softmax 240 generates probability P(z|x, t, y).
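For illustration only, the limitations addressed for Claims 21-23 (a joiner that combines two embeddings through a weighted summation, with the joiner output fed into a softmax function that yields a posterior probability) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the weight matrices w_a and w_p, and all numeric values are hypothetical and are not drawn from Tri, Pra, or the application.

```python
import math


def joiner(a_t, p_u, w_a, w_p):
    """Weighted summation of an acoustic embedding a_t and a predictor embedding p_u.

    Each output element is a weighted sum over both embeddings; w_a and w_p
    are hypothetical weight matrices with one row per output element.
    """
    return [
        sum(w * a for w, a in zip(row_a, a_t))
        + sum(w * p for w, p in zip(row_p, p_u))
        for row_a, row_p in zip(w_a, w_p)
    ]


def softmax(z):
    """Normalize the joiner output into a probability distribution."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


a_t = [1.0, 0.0]  # hypothetical acoustic embedding
p_u = [0.0, 1.0]  # hypothetical predictor embedding
w_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_p = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

z_tu = joiner(a_t, p_u, w_a, w_p)  # joiner output fed into the softmax
posterior = softmax(z_tu)          # one probability per output symbol
```

The three-element output stands in for a three-symbol vocabulary; the entries of posterior are non-negative and sum to one.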

Conclusion
9.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US Pat. App. Pub. No. 20220028444 (PAPAGEORGIOU et al.) also describes resetting weights in a neural network to an initial condition.
10.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD TRACY, whose telephone number is (571)272-8332. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EDWARD TRACY JR./Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656