DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 05/20/2022 has been entered.
This communication is in response to the Amendments and Arguments filed on 05/20/2022.
Claims 1, 3, 4, 6, 7, 9-11, 13, 14, 16, 17, and 19-26 are pending and have been examined.
All previous objections/rejections not mentioned in this Office Action have been withdrawn by the examiner. 
Notice of Pre-AIA  or AIA  Status
The present application is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 05/20/2022 have been fully considered but they are not persuasive. 
Applicant asserts on pages 14-16 that neither Silverstein nor Foster, in its entirety, teaches or suggests updating a table in the data store to record lyrics in association with a user and an estimated emotion of the user. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Silverstein teaches updating a user-specified parameter table, where the parameters include descriptors associated with emotions, and where the emotions are user-selectable (8:19-23),(45:46-57),(143:14-34). Silverstein’s table parameters associated with a user-selectable emotion are combined with Luan’s teaching that the emotion is determined based on analysis of the user input speech (5:46-59). Foster teaches storing data in a table that includes text transcribed by a speech-to-text module, where the text is stored in association with a sender (4:60-67),(7:53-8:3). Finally, Luan teaches that the text processed by the system is specifically lyrics (6:27-29,47-49). Thus, it is the combination of the three references that teaches the limitation in question, rather than one reference teaching the entirety of the limitation.
Hence, Applicant’s arguments are not persuasive.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 24-26 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Luan et al. (U.S. Patent No. 10891928), hereinafter Luan.

Regarding claims 24 and 26, Luan teaches 
(claim 24) A singing voice synthesis method comprising (a solution of supporting a machine to automatically generate a song (2:26-28)):
(claim 26) A singing voice synthesis system comprising (a system for automatic song generation (4:14-16)):
(claim 26) a Central Processing Unit (CPU) configured to (a processing unit, such as a central processing unit (3:15-22)):

detecting a trigger for singing voice synthesis, wherein the trigger is input by a user (the module acts in response, i.e. detecting a trigger, to a user input, i.e. trigger is input by a user, such as an instruction to synthesize a song, that includes generating a singing waveform, i.e. singing voice synthesis (11:16-30,39-52));
determining specific content based on the detected trigger (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and is used by the lyrics generating module to select or generate lyrics conforming to the creation intention, i.e. determining specific content, and is also used by the template generating module to generate the melody (6:27-29,47-49),(8:4-7,18-22),(10:34-38));
decomposing the specific content into a plurality of pieces of partial content (the template generating module may divide the lyrics, i.e. decomposing the specific content, into a plurality of lyrics segments, i.e. into a plurality of pieces of partial content (8:18-22));
determining a piece of partial content from the plurality of pieces of partial content as a target content based on a position of the piece of partial content in the specific content (generating a template includes dividing the lyrics, i.e. specific content, into a plurality of lyrics segments, i.e. plurality of pieces of partial content, where a melody segment matching with each lyrics segment, i.e. determining a piece of partial content...as a target content, is chosen based on smoothness among the melody segments chosen for adjacent lyrics segments, i.e. based on a position of the piece of partial content in the specific content (8:47-9:2));
synthesizing, by a processor, a singing voice based on the determination of the piece of partial content as the target content (the song synthesizing module generates a song singing waveform for the song performance, i.e. synthesizing a singing voice (11:39-52), using the generated lyrics segments and associated melody segments, i.e. based on the determination of the piece of partial content as the target content, as well as the voice characteristics of a singer for the song, i.e. based on acquired plurality of parameters and the acquired lyrics (6:27-29),(8:4-7,18-22),(8:47-9:2),(11:39-52),(12:13-14), where the processing unit performs the various processes, i.e. by a processor (3:15-22)); and
controlling a sound output device to output the synthesized singing voice (the module provides the song formed by the lyrics and melody, such as a song singing waveform, i.e. synthesized singing voice, to the output device for output, i.e. controlling a sound output device to output (3:61-4:9),(11:39-52)).

Regarding claim 25, Luan teaches claim 24, and further teaches
replacing a part of the target content with a sound (a melody segment is chosen to match each lyrics segment, i.e. a part of the target content (8:47-9:2), where the lyrics are sung according to the melody, i.e. replacing...with a sound (11:39-52)).  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3, 4, 6, 7, 9, 11, 13, 14, 16, 17, 19, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luan, in view of Silverstein (U.S. Patent No. 9721551), hereinafter Silverstein, and further in view of Foster et al. (U.S. Patent No. 8423366), hereinafter Foster.

Regarding claims 1, 11, and 21, Luan teaches
(claim 1) A singing voice synthesis method comprising (a solution of supporting a machine to automatically generate a song (2:26-28)):
(claim 11) A singing voice synthesis system comprising (a system for automatic song generation (4:14-16)):
(claim 11) a Central Processing Unit (CPU) configured to (a processing unit, such as a central processing unit (3:15-22)):
(claim 21) A singing voice synthesis method, comprising (a solution of supporting a machine to automatically generate a song (2:26-28)):

detecting a trigger for singing voice synthesis, wherein the trigger is input by a user (the module acts in response, i.e. detecting a trigger, to a user input, i.e. trigger is input by a user, such as an instruction to synthesize a song, that includes generating a singing waveform, i.e. singing voice synthesis (11:16-30,39-52));
estimating an emotion of the user (if the input of the user, i.e. user who has input the trigger, includes audio, such as through a voice input (3:59-61),(4:28-30), the spectrum properties of the speech may be analyzed to determine the emotions expressed by the input audio, i.e. estimating an emotion (5:46-59));
acquiring a plurality of parameters … based on the trigger and the estimated emotion of the user, wherein … the plurality of parameters for the singing voice synthesis in association with the user... (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song, including an emotion, which can be determined by analyzing the input audio, i.e. estimated emotion of the user (4:48-65),(5:46-59), and is used by the lyrics generating module to select existing lyrics, i.e. acquiring a plurality of parameters, conforming to the creation intention (6:27-29), the template generating module may use candidate segments that are melody segments of one or more existing songs, i.e. acquiring a plurality of parameters, (8:4-7,18-22), and the song synthesizing module uses voice characteristics of a singer for the song that may be a voice personalized for the user, i.e. acquiring a plurality of parameters, to generate a song singing waveform for the song performance, i.e. plurality of parameters for the singing voice synthesis in association with the user (11:39-52),(12:13-14));
(claims 1 and 11) acquiring lyrics for the singing voice synthesis based on the trigger and the estimated emotion of the user (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song, including an emotion, which can be determined by analyzing the input audio, i.e. estimated emotion of the user (4:48-65),(5:46-59), and is used by the lyrics generating module to select or generate lyrics conforming to the creation intention, i.e. acquiring lyrics (6:27-29,47-49), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
(claim 21) acquiring a melody for the singing voice synthesis based on the trigger (the creation intention of the user is input, i.e. trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and the template generating module may use candidate segments of one or more existing songs or generate the melody, i.e. acquiring a melody (8:4-7,18-22),(10:34-38), and candidate segments may be selected based on the creation intention, i.e. based on the trigger (10:4-10), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
(claim 21) correcting the melody based on lyrics of the singing voice synthesis (the pre-divided candidate melody segments may be classified according to respective creation intentions, such as emotion, and may be further selected to match each lyrics segment (10:4-16), where a lyrics segment may have a predefined length or may be divided by a structure of words, i.e. based on the lyrics, and the candidate melody segment is selected as matching with the distribution of words, i.e. correcting the melody (8:18-29), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
updating...to record the ((claims 1 and 11) acquired) lyrics in association with the ...emotion ... (the storage device can be used for storing information or data to be accessed by the computing device, i.e. record (3:37-41), where lyrics can be stored in association with tag information including emotion, i.e. in association with...the...emotion (6:27-46), and where the generated lyrics can be output in the form of a text, i.e. updating...to record (7:27-47));
(claims 1 and 11) synthesizing, by a processor, a singing voice based on...the acquired plurality of parameters, and the acquired lyrics (the song synthesizing module generates a song singing waveform for the song performance, i.e. synthesizing a singing voice (11:39-52), using the generated lyrics, the melody indicated by the template, and the voice characteristics of a singer for the song, i.e. based on acquired plurality of parameters and the acquired lyrics (6:27-29),(8:4-7,18-22),(11:39-52),(12:13-14), where the processing unit performs the various processes, i.e. by a processor (3:15-22)); and
(claim 21) synthesizing, by a processor, a singing voice based on...the acquired plurality of parameters, and the corrected melody (the song synthesizing module generates a song singing waveform for the song performance, i.e. synthesizing a singing voice (11:39-52), using the generated lyrics, the melody indicated by the template, and the voice characteristics of a singer for the song, i.e. based on acquired plurality of parameters and the corrected melody (6:27-29),(8:4-7,18-22),(11:39-52),(12:13-14), where the processing unit performs the various processes, i.e. by a processor (3:15-22)); and
controlling a sound output device to output the synthesized singing voice (the module provides the song formed by the lyrics and melody, such as a song singing waveform, i.e. synthesized singing voice, to the output device for output, i.e. controlling a sound output device to output (3:61-4:9),(11:39-52)).  
While Luan provides the use of stored information and trained models to generate lyrics, a melody, and a singing waveform, Luan does not specifically teach that the information is stored in a table associated with the user, and thus does not teach
acquiring a plurality of parameters from a table based on the trigger and the estimated emotion of the user, wherein the table includes the plurality of parameters for the singing voice synthesis in association with the user and a plurality of emotions of the user, and the plurality of emotions includes the emotion;
updating the table to record the ((claims 1 and 11) acquired) lyrics in association with the user who has input the trigger and the estimated emotion of the user;
synthesizing, by a processor, a singing voice based on the updated table, the acquired plurality of parameters, and the ((claims 1 and 11) acquired lyrics/(claim 21) corrected melody).
Silverstein, however, teaches acquiring a plurality of parameters from a table based on the trigger and the estimated emotion of the user (user preferences include table parameters, i.e. acquiring a plurality of parameters from a table (41:18-24),(143:14-34), and include descriptors associated with different emotions, such as a pop song in 4/4 meter being Happy, and a rock song in 3/4 meter being Sad, i.e. based on…the estimated emotion of the user Fig. 36B, (45:46-57), and the user accesses the system to initiate the system to compose and generate music, i.e. based on the trigger (8:19-23)), wherein the table includes the plurality of parameters for the singing voice synthesis in association with the user and a plurality of emotions of the user, and the plurality of emotions includes the emotion (user preferences such as musical experience descriptors and table parameters, i.e. in association with the user, are used during automated music composition Fig. 27B6, (41:18-24),(143:14-34), where parameters loaded within the subsystems, i.e. table further includes the plurality of parameters, include descriptors associated with emotions, i.e. a plurality of emotions of the user, where emotions can be a user-selectable parameter, i.e. a plurality of emotions of the user, and the plurality of emotions includes the emotion Fig. 36B, (45:46-57));
updating the table... in association with the user who has input the trigger and the estimated emotion of the user (the user-specified parameter tables can be updated and saved, i.e. updating the table...in association with the user (143:14-34), where the user initiates the system to compose and generate music, i.e. who has input the trigger (8:19-23), and the parameters include descriptors associated with emotions, i.e. estimated emotion of the user Fig. 36B, (45:46-57));
synthesizing, ... based on the updated table, the acquired plurality of parameters, and the ((claims 1 and 11) acquired lyrics/(claim 21) corrected melody) (the system generates a digital audio sample of the composed music, i.e. synthesizing (135:18-34), where the music is composed using user preferences such as musical experience descriptors and table parameters, i.e. based on the updated table, the acquired plurality of parameters, where user-supplied lyrics, i.e. acquired lyrics, are used during automated music composition, and where composition includes generation of a melody rhythm and pitch, i.e. melody Fig. 27B6, (10:49-58),(41:18-24),(54:23-46),(143:14-34),(163:48-53)).
Luan, in turn, teaches that the parameters, lyrics, and corrected melody are used in the generation of a singing waveform using specific voice characteristics (11:39-52).
Luan and Silverstein are analogous art because they are from a similar field of endeavor in automatic generation of music based on user input. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Luan of using stored information and trained models to generate lyrics, a melody, and a singing waveform with the selection of specific parameters from parameter tables saved in user preferences as taught by Silverstein. The motivation to do so would have been to achieve a predictable result of enabling the system to use saved user preferences to accurately and quickly satisfy the user’s requests (Silverstein (143:14-23)).
While Luan in view of Silverstein provides generating and storing lyrics in classifications that are used for the synthesis of a singing voice, Luan in view of Silverstein does not specifically teach that the lyrics are stored in a table associated with a user, and thus does not teach 
updating the table to record the ((claims 1 and 11) acquired) lyrics in association with the user....
Foster, however, teaches updating the table to record the ((claims 1 and 11) acquired) lyrics in association with the user... (the server device includes a data store in the form of arrays, lists, or tables, i.e. the table, where a speech-to-text module transcribes a voicemail message into text, i.e. acquired lyrics, and stores the text-based messages in association with a message sender, i.e. updating...to record the acquired lyrics…in association with the user (4:60-67),(7:53-8:3)).
Luan, in turn, teaches that the text is lyrics (6:27-29,47-49).
Luan, Silverstein, and Foster are analogous art because they are from a similar field of endeavor in audible synthesis of information for a user to hear. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Luan, as modified by Silverstein, of storing lyrics in classifications that are used for the synthesis of a singing voice with the storage of text messages in association with a particular individual as taught by Foster. The motivation to do so would have been to achieve a predictable result of enabling messages to be read aloud using the voice characteristics associated with the user who sent the message (Foster (5:16-25)).

Regarding claims 3 and 13, Luan in view of Silverstein and Foster teaches claims 1 and 11, and Luan further teaches 
analyzing a voice of the user (if the input of the user includes audio, such as through a voice input, i.e. a voice of the user (3:59-61),(4:28-30), the spectrum properties of the speech may be analyzed, i.e. analyzing (5:46-59)); and 
estimating the emotion of the user based on a result of the analysis (the spectrum properties of the speech may be analyzed, i.e. based on a result of the analysis, to determine the emotions expressed by the input audio, i.e. estimating the emotion of the user (5:46-59)).  

Regarding claims 4 and 14, Luan in view of Silverstein and Foster teaches claims 3 and 13, and Luan further teaches
estimating the emotion based on:
contents of the voice of the user (the spectrum properties of the speech, i.e. contents of the voice of the user, may be analyzed to determine the emotions expressed by the input audio, i.e. estimating the emotion (5:46-59)), or  
at least one of a pitch, a volume, or a change in speed regarding the voice of the user.  

Regarding claims 6 and 16, Luan in view of Silverstein and Foster teaches claims 1 and 11, and Luan further teaches
selecting a database from a plurality of databases based on the trigger, wherein the selected database includes voice fragments of a plurality of singers (the voice model may be trained using a plurality of voice segments of a certain singer and saved in a database as a predefined voice model, i.e. selected database includes voice fragments (3:37-41),(11:61-12:12), where the system may determine a corresponding voice of a singer for the song, such as the user expecting a personalized voice, i.e. based on the trigger (11:39-46),(12:13-20), by obtaining a voice model saved in a particular location, i.e. selecting a database, where voice models can be trained for one or more specific speakers based on inputted voice segments from those singers, i.e. voice fragments of a plurality of singers, and saved in multiple locations, i.e. plurality of databases (11:61-12:12)); and
synthesizing the singing voice based on the voice fragments in the selected database (the voice model used to synthesize the song singing waveform, i.e. synthesizing the singing voice (11:39-46), may be trained using a plurality of voice segments of a certain singer, i.e. based on the voice fragments, and saved in a database as a predefined voice model, i.e. in the selected database (3:37-41),(11:61-12:12)).  

Regarding claims 7 and 17, Luan in view of Silverstein and Foster teaches claims 1 and 11, and Luan further teaches 
selecting a plurality of databases based on the trigger, wherein the plurality of databases includes first voice fragments of a plurality of singers (the voice model may be trained using a plurality of voice segments of a certain singer or singers and saved in a database as a predefined voice model, i.e. databases includes first voice fragments of a plurality of singers (3:37-41),(11:61-12:12), where the system may determine a corresponding voice of a singer for the song, i.e. based on the trigger (11:39-46),(12:13-20), by obtaining the average voice model, which is a voice model that has been trained using a plurality of voice segments of different singers, i.e. voice fragments acquired from a plurality of singers, and using additional voice segments, i.e. first voice fragments, input by the user to adjust the predefined average voice model, i.e. plurality of databases (3:37-41),(11:61-12:12));
obtaining second voice fragments by combining the first voice fragments in the selected plurality of databases (the average voice model is obtained, which is a voice model that has been trained using a plurality of voice segments of different singers, and using additional voice segments, i.e. first voice fragments, input by the user, i.e. plurality of databases, to adjust the predefined average voice model, i.e. obtaining second voice fragments by combining (3:37-41),(11:61-12:12)); and
synthesizing the singing voice based on the obtained second voice fragments (the voice model used to synthesize the song singing waveform, i.e. synthesizing the singing voice (11:39-46), may be trained using a plurality of voice segments of multiple singers, including voice segments input from the user, i.e. based on the obtained second voice fragments (3:37-41),(11:61-12:20)).

Regarding claims 9 and 19, Luan in view of Silverstein and Foster teaches claims 1 and 11, and Luan further teaches
acquiring the lyrics from a source selected from a plurality of sources (the lyrics generating module selects lyrics, i.e. acquiring the lyrics (6:27-29,47-49), where the lyrics are from one or more pieces of existing lyrics classified according to theme, content, or emotions, i.e. from a plurality of sources, where the lyrics are selected based on their conforming to the creation intention, i.e. from one source (6:27-46)).  

Claim(s) 10 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luan, in view of Silverstein, in view of Foster, and further in view of Kobayashi (U.S. Patent No. 7241947), hereinafter Kobayashi.

Regarding claims 10 and 20, Luan in view of Silverstein and Foster teaches claims 1 and 11.
While Luan in view of Silverstein and Foster provides the generation of a song singing waveform, the generation of a musical score that matches input user lyrics, and the synthesis of instruments according to the musical score, Luan in view of Silverstein and Foster does not specifically teach the generation of an accompaniment for a synthesized voice, and thus does not teach
generating an accompaniment corresponding to the synthesized singing voice;
synchronizing the synthesized singing voice and the generated accompaniment; and
controlling, based on the synchronization, the sound output device to output the synthesized singing voice and the accompaniment.  
Kobayashi, however, teaches generating an accompaniment corresponding to the synthesized singing voice (performance data is used to generate both a singing voice waveform, i.e. synthesized singing voice, and an accompaniment waveform, i.e. generating an accompaniment corresponding Fig. 1, (5:30-35),(6:47-55),(8:30-33));
synchronizing the synthesized singing voice and the generated accompaniment (the mixer synchronizes and overlays the singing voice waveform and accompaniment waveform, i.e. synchronizing the synthesized singing voice and the generated accompaniment Fig. 1, (8:34-42)); and
controlling, based on the synchronization, the sound output device to output the synthesized singing voice and the accompaniment (the mixer reproduces the synchronized waveforms as an output waveform, i.e. based on the synchronization Fig. 1, (8:34-42), where the output waveform is output via a sound system, i.e. controlling…the sound output device to output the synthesized singing voice and the accompaniment (11:22-23)).
Luan, Silverstein, Foster, and Kobayashi are analogous art because they are from a similar field of endeavor in generation of music to be output to a user. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Luan, as modified by Silverstein and Foster, of generating a song singing waveform and generating a musical score that matches input user lyrics with the additional generation of an accompaniment waveform synchronized to the singing voice waveform as taught by Kobayashi. The motivation to do so would have been to achieve a predictable result of enabling a robot apparatus to support human activity through entertainment (Kobayashi (11:26-33)).

Claim(s) 22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luan, in view of Silverstein, and further in view of Hoshikawa et al. (JP4298612B2), as presented in the IDS, and utilizing the translated description from Espacenet/Global Dossier enclosed in a previous Office Action, hereinafter Hoshikawa.

Regarding claim 22, Luan teaches
A singing voice synthesis method comprising (a solution of supporting a machine to automatically generate a song (2:26-28)):
detecting a trigger for singing voice synthesis, wherein the trigger is input by a user (the module acts in response, i.e. detecting a trigger, to a user input, i.e. trigger is input by a user, such as an instruction to synthesize a song, that includes generating a singing waveform, i.e. singing voice synthesis (11:16-30,39-52));
acquiring a plurality of parameters … based on the trigger, wherein … the plurality of parameters for the singing voice synthesis in association with the user (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and is used by the lyrics generating module to select existing lyrics, i.e. acquiring a plurality of parameters, conforming to the creation intention (6:27-29), the template generating module may use candidate segments that are melody segments of one or more existing songs, i.e. acquiring a plurality of parameters, (8:4-7,18-22), and the song synthesizing module uses voice characteristics of a singer for the song that may be a voice personalized for the user, i.e. acquiring a plurality of parameters, to generate a song singing waveform for the song performance, i.e. plurality of parameters for the singing voice synthesis in association with the user (11:39-52),(12:13-14));
acquiring lyrics for the singing voice synthesis based on the trigger (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and is used by the lyrics generating module to select or generate lyrics conforming to the creation intention, i.e. acquiring lyrics (6:27-29,47-49), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
synthesizing, by a processor, a singing voice based on acquired plurality of parameters and the … lyrics (the song synthesizing module generates a song singing waveform for the song performance, i.e. synthesizing a singing voice (11:39-52), using the generated lyrics, the melody indicated by the template, and the voice characteristics of a singer for the song, i.e. based on acquired plurality of parameters and the … lyrics (6:27-29),(8:4-7,18-22),(11:39-52),(12:13-14), where the processing unit performs the various processes, i.e. by a processor (3:15-22)); and
controlling a sound output device to output the synthesized singing voice (the module provides the song formed by the lyrics and melody, such as a song singing waveform, i.e. synthesized singing voice, to the output device for output, i.e. controlling a sound output device to output (3:61-4:9),(11:39-52)).  
While Luan provides the use of stored information and trained models to generate lyrics, a melody, and a singing waveform, Luan does not specifically teach that the information is stored in a table associated with the user, and thus does not teach
acquiring a plurality of parameters from a table based on the trigger, wherein the table includes the plurality of parameters...in association with the user.
Silverstein, however, teaches acquiring a plurality of parameters from a table based on the trigger, wherein the table includes the plurality of parameters … in association with the user (user preferences, i.e. in association with the user, such as musical experience descriptors and table parameters, i.e. table includes the plurality of parameters, are used during automated music composition Fig. 27B6, 36B, (41:18-24),(143:14-34), where an example use is the selection of the song form based on the song form sub-phrase parameter table, i.e. acquiring a plurality of parameters (14:1-10), and the user accesses the system to initiate the system to compose and generate music, i.e. based on the trigger (8:19-23)).
Luan, in turn, teaches that the parameters are used in the generation of a singing waveform using specific voice characteristics (11:39-52).
Luan and Silverstein are analogous art because they are from a similar field of endeavor in automatic generation of music based on user input. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of stored information and trained models to generate lyrics, a melody, and a singing waveform teachings of Luan with the selection of specific parameters from parameter tables saved in user preferences as taught by Silverstein. The motivation to do so would have been to achieve a predictable result of enabling the system to use saved user preferences to accurately and quickly satisfy the user’s requests (Silverstein (143:14-23)).
While Luan in view of Silverstein provides the generation of lyrics and a melody, Luan in view of Silverstein does not specifically teach the correction of the lyrics based on the melody, and thus does not teach
causing a delimiter part of the lyrics to correspond with a delimiter part of a melody of the singing voice synthesis by adjustment of the lyrics;
synthesizing...a singing voice based on the acquired plurality of parameters and the adjustment of the lyrics.
Hoshikawa, however, teaches causing a delimiter part of the lyrics to correspond with a delimiter part of a melody of the singing voice synthesis by adjustment of the lyrics (the number of syllables can be adjusted based on an excess or deficiency with respect to notes for song generation, i.e. melody of the singing voice synthesis [0024], such as when syllables are added to repeat one or more word sets to eliminate a shortage of syllables, i.e. adjustment of the lyrics, when the music note set of the music data is longer than the character data of the word set, i.e. based on a melody [0044], where a syllable constituting a word set, i.e. delimiter part of the lyrics, is assigned to each note in a note set, i.e. delimiter part of a melody [0037]);
synthesizing...a singing voice based on the acquired plurality of parameters and the adjustment of the lyrics (a sound synthesis waveform is processed, i.e. synthesizing a singing voice, for each syllable, i.e. based on…the adjustment of the lyrics, and generated on the basis of the note length and frequency indicated by the corresponding note, i.e. based on acquired plurality of parameters [0058]).
Luan, Silverstein, and Hoshikawa are analogous art because they are from a similar field of endeavor in automatic generation of music based on user input. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the generation of lyrics and a melody teachings of Luan, as modified by Silverstein, with the adjustment of the number of syllables to match the music note set as taught by Hoshikawa. The motivation to do so would have been to achieve a predictable result of generating a song that does not have unnatural qualities (Hoshikawa [0008]).

Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Luan in view of Silverstein.

Regarding claim 23, Luan teaches
A singing voice synthesis method comprising (a solution of supporting a machine to automatically generate a song (2:26-28)):
detecting a trigger for singing voice synthesis, wherein the trigger is input by a user (the module acts in response, i.e. detecting a trigger, to a user input, i.e. trigger is input by a user, such as an instruction to synthesize a song, that includes generating a singing waveform, i.e. singing voice synthesis (11:16-30,39-52));
acquiring a plurality of parameters … based on the trigger, wherein … the plurality of parameters for the singing voice synthesis in association with the user (the creation intention of the user is input, i.e. based on the trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and is used by the lyrics generating module to select existing lyrics, i.e. acquiring a plurality of parameters, conforming to the creation intention (6:27-29), the template generating module may use candidate segments that are melody segments of one or more existing songs, i.e. acquiring a plurality of parameters, (8:4-7,18-22), and the song synthesizing module uses voice characteristics of a singer for the song that may be a voice personalized for the user, i.e. acquiring a plurality of parameters, to generate a song singing waveform for the song performance, i.e. plurality of parameters for the singing voice synthesis in association with the user (11:39-52),(12:13-14));
acquiring a melody for the singing voice synthesis based on the trigger (the creation intention of the user is input, i.e. trigger, and refers to one or more features expected to be expressed by the song (4:48-52), and the template generating module may use candidate segments of one or more existing songs or generate the melody, i.e. acquiring a melody (8:4-7,18-22),(10:34-38), and candidate segments may be selected based on the creation intention, i.e. based on the trigger (10:4-10), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
causing a delimiter part of the melody to correspond with a delimiter part of lyrics of the singing voice synthesis by adjustment of the melody (the pre-divided candidate melody segments, i.e. delimiter part of the melody, may be classified according to respective creation intentions, such as emotion, and may be further selected to match each lyrics segment (10:4-16), where a lyrics segment may have a predefined length or may be divided by a structure of words, i.e. delimiter part of the lyrics, and the candidate melody segment is selected as matching with the distribution of words, i.e. causing...to correspond...by adjustment of the melody (8:18-29), where the song synthesizing module generates a song singing waveform for the song performance using the lyrics, i.e. singing voice synthesis (11:39-52));
synthesizing, by a processor, a singing voice based on the acquired plurality of parameters and the adjustment of the melody (the song synthesizing module generates a song singing waveform for the song performance, i.e. synthesizing a singing voice (11:39-52), using the generated lyrics, the melody indicated by the template, and the voice characteristics of a singer for the song, i.e. based on acquired plurality of parameters and the adjustment of the melody (6:27-29),(8:4-7,18-22),(11:39-52),(12:13-14), where the processing unit performs the various processes, i.e. by a processor (3:15-22)); and
controlling a sound output device to output the synthesized singing voice (the module provides the song formed by the lyrics and melody, such as a song singing waveform, i.e. synthesized singing voice, to the output device for output, i.e. controlling a sound output device to output (3:61-4:9),(11:39-52)).
While Luan provides the use of stored information and trained models to generate lyrics, a melody, and a singing waveform, Luan does not specifically teach that the information is stored in a table associated with the user, and thus does not teach
acquiring a plurality of parameters from a table based on the trigger, wherein the table includes the plurality of parameters for the singing voice synthesis in association with the user.
Silverstein, however, teaches acquiring a plurality of parameters from a table based on the trigger, wherein the table includes the plurality of parameters … in association with the user (user preferences, i.e. in association with the user, such as musical experience descriptors and table parameters, i.e. table includes the plurality of parameters, are used during automated music composition Fig. 27B6, 36B, (41:18-24),(143:14-34), where an example use is the selection of the song form based on the song form sub-phrase parameter table, i.e. acquiring a plurality of parameters (14:1-10), and the user accesses the system to initiate the system to compose and generate music, i.e. based on the trigger (8:19-23)).
Luan, in turn, teaches that the parameters are used in the generation of a singing waveform using specific voice characteristics (11:39-52).
Luan and Silverstein are analogous art because they are from a similar field of endeavor in automatic generation of music based on user input. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of stored information and trained models to generate lyrics, a melody, and a singing waveform teachings of Luan with the selection of specific parameters from parameter tables saved in user preferences as taught by Silverstein. The motivation to do so would have been to achieve a predictable result of enabling the system to use saved user preferences to accurately and quickly satisfy the user’s requests (Silverstein (143:14-23)).
Conclusion
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NICOLE A K SCHMIEDER/Examiner, Art Unit 2659
/Paras D Shah/Primary Examiner, Art Unit 2659
08/02/2022