Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Should applicant desire to obtain the benefit of foreign priority under 35 U.S.C. 119(a)-(d) prior to declaration of an interference, a certified English translation of the foreign application must be submitted in reply to this action. 37 CFR 41.154(b) and 41.202(e).
Failure to provide a certified translation may result in no benefit being accorded for the non-English application.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 07/14/2020 and 11/19/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Kang (Document ID: US 20220044463 A1) in view of Zhou (Document ID: "Visemenet: Audio-driven animator-centric speech animation").
Regarding claims 1, 7, and 13, Kang teaches a method for predicting a mouth-shape feature, comprising:
recognizing a phonetic posterior gram (PPG) of a phonetic feature (Fig 4 and Paragraph 0052-0053); and 
performing a prediction on the PPG by using a neural network model, to predict a mouth-shape feature of the phonetic feature (Paragraphs 0062-0063, which mention a pre-trained neural network being used to obtain an expression parameter for mouth shape).
Even though Kang mentions a pre-trained neural network being used to obtain a mouth shape parameter (Fig 4), it does not specifically describe the training details of that pre-trained neural network. Therefore, it fails to teach the claimed limitation of: “the neural network model being obtained by training with training samples and an input thereof including a PPG and an output thereof including a mouth-shape feature, and the training samples including a PPG training sample and a mouth-shape feature training sample”.
Zhou teaches the claimed limitation of the neural network model being obtained by training with training samples and an input thereof including a PPG and an output thereof including a mouth-shape feature, and the training samples including a PPG training sample and a mouth-shape feature training sample (Fig 3 presents an overview of the training model for the LSTM neural network, which includes a phoneme probability group vector/graph and associated facial landmarks to obtain the facial result shown in Fig 5; see Page 5, Col 2, Paragraphs 2-3; also see Section 5: “Training” for details on the training model). Here, the phoneme probability group vector/graph (shown in Fig 3) can be equated to the PPG recited in the claim, as both are aimed at presenting the probability of the phoneme being used. Furthermore, the facial landmarks used in the training stage can also be equated to the mouth-shape feature training sample recited in the claim. Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate the neural network training model as taught by Zhou to improve performance of the system (Zhou, Page 10, col 1, lines 5-7).
As seen in the claim set, claims 1, 7, and 13 cover a similar scope of invention. However, claim 1 is a method claim, while claims 7 and 13 are device and computer readable medium claims, respectively. Each step of the method of claim 1 corresponds to a claimed element in claims 7 and 13. Furthermore, Kang also mentions a processor and memory (Fig 12 and Paragraph 0010, which includes storing of the program code) and a computer readable medium (Paragraphs 0011 and 0132), as recited in claims 7 and 13. Therefore, claims 7 and 13 are rejected under the same rationale as applied to claim 1.
Regarding claims 2, 8, and 14, Kang in view of Zhou teaches the method according to claim 1, the electronic device according to claim 7, and the medium according to claim 13; wherein the PPG training sample comprises: PPGs of target phonetic features, the target phonetic features being obtained based on dynamic slicing and having complete semantics (Zhou, Fig 2 shows the divided/sliced phoneme groups used to obtain the probability distribution that is used in Fig 3 and Section 5: “Training”); and the mouth-shape feature training sample comprises: mouth-shape features corresponding to the PPGs of the target phonetic features (Zhou, Fig 2 shows the mouth shapes corresponding to the phoneme groups being used; also see Fig 3 and Section 5: “Training”). Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate phonemes and corresponding mouth shapes for neural network training as taught by Zhou to improve performance of the system (Zhou, Page 10, col 1, lines 5-7).
Regarding claims 3, 9, and 15, Kang in view of Zhou teaches the method according to claim 2, the electronic device according to claim 8, and the medium according to claim 14; wherein a frequency of a target phonetic feature matches a frequency of a mouth-shape feature corresponding to a PPG of the target phonetic feature (Zhou, Fig 2 shows the phoneme group list being synchronously matched with a specific mouth shape (landmark); also see Page 7, col 1, paragraph 4, lines 1-3 and Fig 3). Here, frequency matching of the two elements is inherent, as each phoneme group has a specific mouth shape (landmark) assigned to it. Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate phonemes and corresponding mouth shapes for neural network training as taught by Zhou to improve performance of the system (Zhou, Page 10, col 1, lines 5-7).
Regarding claims 4, 10, and 16, Kang in view of Zhou teaches the method according to claim 1, the electronic device according to claim 7, and the medium according to claim 13; wherein the neural network model is a recurrent neural network (RNN) model having an autoregressive mechanism, and a process of training the RNN model includes (Zhou shows an LSTM model being trained, which is a type of RNN model; see Fig 3 and Page 7, col 1, paragraph 4):
performing the training by using a mouth-shape feature training sample of a frame preceding a current frame as an input, by using a PPG training sample of the current frame as a condition constraint, and a mouth-shape feature training sample of the current frame as a target (Zhou, see Fig 3 and Page 7, col 1, paragraph 4 through col 2, paragraph 7). Here, it can be seen that loss functions are incorporated for training, including a classification loss, a regression loss, a smoothness loss, and a joint loss. The regression and smoothness losses are applied to landmark displacement, which can be equated to the mouth-shape feature training recited in the claim.
Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate phonemes and corresponding mouth shapes for neural network training as taught by Zhou to improve performance of the system (Zhou, Page 10, col 1, lines 5-7).
Regarding claims 5, 11, and 17, Kang in view of Zhou teaches the method according to claim 1, the electronic device according to claim 7, and the medium according to claim 13; wherein the neural network model is a multi-branch neural network model (Zhou, Fig 3), and the mouth-shape feature of the phonetic feature includes at least two of: a regression mouth-shape point (Zhou, Fig 4 shows landmark points associated with mouth shape, which can be equated to the mouth-shape points recited in the claim; furthermore, a regression loss is computed using the landmarks, see equation 3), a mouth-shape thumbnail (Zhou, Fig 2 shows the mouth shape thumbnail for each phoneme group); also see Zhou, Page 3, col 1, paragraph 3, lines 10-15 and Page 2, col 1, lines 29-34. Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate phonemes and corresponding mouth shape features for the neural network as taught by Zhou to improve performance of the system (Zhou, Page 10, col 1, lines 5-7).
Regarding claims 6, 12, and 18, Kang in view of Zhou teaches the method according to claim 1, the electronic device according to claim 7, and the medium according to claim 13; further comprising: performing predictions on PPGs of pieces of real speech data using the neural network model, to obtain mouth-shape features of the pieces of real speech data (Kang, Fig 4 and Paragraphs 0054-0055 show the PPG being obtained; Paragraphs 0062-0063 mention a pre-trained neural network being used to obtain an expression parameter for mouth shape). Kang, however, fails to specifically mention a mouth shape library being formed for use in synthesizing the mouth shape of a virtual image.
Zhou teaches the claimed limitation of constructing a mouth-shape feature index library based on the mouth-shape features of the pieces of real speech data, the mouth-shape feature index library being used for synthesizing a mouth shape of a virtual image (Fig 2 shows the 20 identified visual groups along with the relevant mouth shape output and viseme; also see Page 3, Section 3: “Algorithm Design”, Paragraphs 1-5, for the viseme and phoneme prediction model that outputs the virtual mouth shape shown in Fig 2). Zhou is considered analogous to the claimed invention because it is also aimed at animated face modeling using phoneme features. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to have modified Kang to incorporate the mouth shape output and feature library as taught by Zhou. Furthermore, one of ordinary skill in the art would have recognized that the result of the combination was predictable, since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
Conclusion
The analogous prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Sagar (Document ID: US 20220108510 A1) teaches phoneme features being used to obtain speech animation of the mouth.
Liu (Document ID: "Video-audio driven real-time facial animation.") teaches phoneme probability and audio-visual data being incorporated to obtain mouth animation.
Edwards (Document ID: US 20180253881 A1) teaches phonemes being used to configure lip movement for speech animation.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEEL P. KARELIA, whose telephone number is (571)272-4377. The examiner can normally be reached Monday-Friday 6:30 am - 4:00 pm (every other Friday off).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571)272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/NEEL PIYUSHKUMAR KARELIA/Examiner, Art Unit 2659
/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659