DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Response to Arguments
Applicant's arguments filed 07/05/2022 have been fully considered but they are not persuasive. Regarding arguments on page 11 of the Remarks, Examiner notes that Applicant has not shown how the claim is directed to an improvement in computer-related technology. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims would overcome the rejection under 35 U.S.C. 101.
Regarding arguments on pages 14-15 of the Remarks, Examiner notes that Gorodetski’s teachings of the dynamic energy range teach the amended limitations. Specifically, the dynamic energy range corresponds to the unevenness of the volume, where an even volume would have a smaller dynamic energy range.
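For illustration only, the correspondence between dynamic energy range and unevenness of volume can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed in Gorodetski or in the application: the function names, the frame length of 160 samples, and the 10 dB threshold are all hypothetical.

```python
import math

def frame_energies_db(samples, frame_len=160):
    """Energy in dB of each consecutive, non-overlapping frame (frame_len is assumed)."""
    if len(samples) < frame_len:
        raise ValueError("need at least one full frame")
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # Small offset avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(mean_square + 1e-12))
    return energies

def dynamic_energy_range_db(samples, frame_len=160):
    """Difference between the loudest and quietest frame energies, in dB."""
    energies = frame_energies_db(samples, frame_len)
    return max(energies) - min(energies)

def has_uneven_volume(samples, threshold_db=10.0, frame_len=160):
    """Hypothetical condition: a range at or above the threshold counts as uneven."""
    return dynamic_energy_range_db(samples, frame_len) >= threshold_db

# Synthetic signals: a constant-amplitude tone and a tone with alternating loud and quiet frames.
steady = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(1600)]
bursty = [(1.0 if (n // 160) % 2 == 0 else 0.01) * math.sin(2 * math.pi * 100 * n / 8000)
          for n in range(1600)]
```

In this sketch the steady tone has a dynamic energy range near zero, while the signal with alternating loud and quiet frames has a large range, which mirrors the statement that an even volume would have a smaller dynamic energy range.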

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 8, and 10-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Using the subject matter eligibility test from page 74621 of the Federal Register Notice titled “2014 Interim Guidance on Patent Subject Matter Eligibility,” a two-step process is performed.
Under Step 1, the claims are analyzed to determine whether each claim is directed to a process, machine, article of manufacture, or composition of matter. In this case, claims 1-6, 8, and 10-18 are directed to an electronic device, which is a machine or an article of manufacture; claim 19 is directed to a method, which is a process; and claim 20 is directed to a machine readable medium, which is a machine or an article of manufacture.
Step 2A (part 1 of the Mayo test), using the guidance from pages 50-57 of the Federal Register Vol. 84 No. 4 from Monday, January 7, 2019, requires applying a two-prong inquiry. In Prong One, examiners evaluate whether the claim recites a judicial exception, determining if the claim is directed to a law of nature, a natural phenomenon, or an abstract idea. In this case, claim 1 recites signal separation and identification of voice signals, which are mental processes. In Prong Two, examiners evaluate whether the judicial exception is integrated into a practical application that imposes a meaningful limit on the judicial exception. In this case, the additional structural elements are generic computer components used to apply the abstract idea.
Step 2B (part 2 of the Mayo test) requires analyzing the claims to determine if they recite additional elements that amount to significantly more than the judicial exception. In this case, the claims do not include additional elements that are sufficient to amount to significantly more than the abstract idea itself.  

Regarding claims 1, 19, and 20, signal separation and identification of voice signals are mental processes, which are abstract ideas. Additional structural elements such as a processor, a memory, and a computer readable medium are generic computer components. The sound receiver is also generic and is used for the extra-solution activity of receiving sound. None of these elements constitute integration into a practical application, or significantly more.

Regarding claims 2-3, 5-6, 8, and 12-13, the limitations are further clarifications of the above abstract ideas.

Regarding claim 4, identifying a signal as voice is a mental process, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claim 10, removing echo or noise is a mathematical calculation, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claim 11, detecting a user and identifying signals corresponding to users are mental processes, which are abstract ideas without integration into a practical application and without significantly more.

Regarding claims 14-15 and 17, storing data is insignificant extra-solution activity, and identifying signals corresponding to users is a mental process, which is an abstract idea without integration into a practical application and without significantly more.

Regarding claim 16, storing data is insignificant extra-solution activity, which does not constitute integration into a practical application or significantly more.

Regarding claim 18, transmitting data is insignificant extra-solution activity, which does not constitute integration into a practical application or significantly more.

The limitations of the claims, taken alone, do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements individually. Applicable case law cited in the Federal Register includes, but is not limited to: Alice Corp., 134 S. Ct. at 2355-56, Digitech Image Tech., LLC v. Electronics for Imaging, Inc., 758 F.3d 1344 (Fed. Cir. 2014), Benson, 409 U.S. at 63.

See "Preliminary Examination Instructions in view of the Supreme Court Decision in Alice Corporation Pty. Ltd. v. CLS Bank International, et al.," dated June 25, 2014, and the Federal Register notice titled "2014 Interim Guidance on Patent Subject Matter Eligibility" (79 FR 74618).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3 and 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shin et al. (KR 20180041355 A), hereinafter referred to as Shin, in view of Krishnaswamy et al. (US 2018/0146370 A1), hereinafter referred to as Krishnaswamy, and further in view of Gorodetski et al. (US 9,875,742 B2), hereinafter referred to as Gorodetski.

Examiner notes that for translation purposes, the US filing of the Shin reference (US 2019/0214011 A1) from the same patent family will be relied upon for citations in the Office Action.

Regarding claim 1, Shin teaches:
An electronic device, comprising: 
a sound receiver (para [0041-42], where microphones are used); and 
a processor (Fig. 2 element 120, para [0052], where a processor is used) configured to: 
separate a sound signal obtained through the sound receiver into a plurality of sound source signals (para [0069], where an audio signal is separated into multiple audio signals); and 
Shin does not teach:
identify a sound source signal to have characteristics satisfying at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals,
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section and
the processor is configured to identify the sound signal which has the uneven volume in the certain section as corresponding to user voice.
Krishnaswamy teaches:
identify a sound source signal to have characteristics satisfying at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals (para [0064], where speaker recognition and diarisation are performed to find the true caller).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin by using the speaker recognition of Krishnaswamy (Krishnaswamy para [0064]) to differentiate the speakers of Shin (Shin para [0075]) in order to find the true caller from multiple speakers (Krishnaswamy para [0064]).
Gorodetski teaches:
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section (col. 6 line 65 - col. 7 line 13, where the dynamic energy range is used to determine speech and non-speech frames) and
the processor is configured to identify the sound signal which has the uneven volume in the certain section as corresponding to user voice (col. 6 line 65 - col. 7 line 13, where the dynamic energy range being below a predetermined threshold is used to indicate a non-speech frame).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin in view of Krishnaswamy by using the energy range of Gorodetski (Gorodetski col. 6 line 65 - col. 7 line 13) in the determination of Shin in view of Krishnaswamy (Shin para [0075]), in order to filter out non-speech frames in a diarization process (Gorodetski col. 6 line 65 - col. 7 line 13).

Regarding claim 2, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, wherein the characteristics include volume (Shin para [0104-105], where energy of the voice is used).  

Regarding claim 3, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 2, wherein the at least one predefined condition includes a predefined condition indicating a change in the volume for identifying the sound source signal as corresponding to a user voice (Shin para [0104-105], where increase in energy of the voice above a predetermined threshold is used).  

Regarding claim 11, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, further comprising: 
a detector configured to detect a specific user (Krishnaswamy para [0063], where a user is authenticated using biometric information), 
wherein the processor is configured to: 
identify two or more sound source signals of the plurality of sound source signals as corresponding to two or more user voices, respectively (Krishnaswamy para [0064], where multiple speakers are separated); and 
based on the specific user being detected by the detector, identify a user voice of the two or more user voices corresponding the specific user (Krishnaswamy para [0064-65], [0067], where a speaker is authenticated to find the true speaker by template matching). 

Regarding claim 12, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 11, wherein the detector detects the specific user by at least one of a login of a user account, speaker recognition using voice characteristics, camera face recognition, and user detection through a sensor (Shin para [0107], Krishnaswamy para [0064], where features and characteristics of a speaker are used for identification).  

Regarding claim 13, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, wherein the processor is configured to identify two or more sound source signals of the plurality of sound source signals as corresponding to two or more user voices, respectively (Shin para [0095], [0106-107], [0139], Krishnaswamy para [0064], where multiple user voices are separated).  

Regarding claim 14, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, further comprising: 
a memory configured to store a characteristic pattern of a user voice of a specific user (Krishnaswamy para [0066], where a voice print template database stores templates), and 
the processor is configured to 
identify two or more sound source signals of the plurality of sound source signals as corresponding to two or more user voices, respectively (Krishnaswamy para [0064], where multiple speakers are separated), and 
identify a sound source signal of the two or more sound source signals corresponding to the user voice of the specific user based on the stored characteristic pattern (Krishnaswamy para [0064-65], [0067], where a speaker is authenticated to find the true speaker by template matching).  

Regarding claim 15, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, further comprising: 
a memory configured to store a voice recognition model (Krishnaswamy para [0064-65], where a speaker recognition model is stored), 
wherein the processor is configured to: 
identify two or more sound source signals of the plurality of sound source signals as corresponding to two or more user voices, respectively (Krishnaswamy para [0064], where multiple speakers are separated); and 
recognize a sound source signal of the two or more sound source signals corresponding to a user voice of a specific user based on the voice recognition model (Krishnaswamy para [0064-65], [0067], where a speaker is authenticated to find the true speaker by template matching).  

Regarding claim 16, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 15, wherein the processor is configured to store a plurality of characteristics of a plurality of user voices of a plurality of users, respectively, for use by the voice recognition model (Krishnaswamy para [0027], [0066], where the memory includes a database of a plurality of voice prints for a plurality of persons).  

Regarding claim 17, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 15, wherein
the memory is configured to store texts of user utterances of a plurality of users for use by the voice recognition model (Gorodetski col. 8 lines 8-22, col. 10 lines 11-18, where transcripts of the user's speech are used for diarization and stored), and 
the processor is configured to recognize the sound source signal corresponding to the user voice of the specific user based on the stored texts (Gorodetski col. 7 line 60 - col. 8 line 7, where text analysis techniques are used to identify the agent).

Regarding claim 18, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, wherein the processor is configured to transmit the identified sound source signal to a voice recognition server (Shin para [0039], where the audio signal is transmitted to a voice recognition server).  

Regarding claim 19, Shin teaches:
A method for controlling an electronic device, comprising: 
separating a sound signal obtained through a sound receiver into a plurality of sound source signals (para [0069], where an audio signal is separated into multiple audio signals); and
Shin does not teach:
identifying a sound source signal to have characteristics satisfying the at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals,
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section, and
the identifying the sound signal comprises identifying the sound source signal which has the uneven volume in the certain section as corresponding to user voice.
Krishnaswamy teaches:
identifying a sound source signal to have characteristics satisfying the at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals (para [0064], where speaker recognition and diarisation are performed to find the true caller).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin by using the speaker recognition of Krishnaswamy (Krishnaswamy para [0064]) to differentiate the speakers of Shin (Shin para [0075]) in order to find the true caller from multiple speakers (Krishnaswamy para [0064]).
Gorodetski teaches:
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section (col. 6 line 65 - col. 7 line 13, where the dynamic energy range is used to determine speech and non-speech frames) and
the identifying the sound signal comprises identifying the sound source signal which has the uneven volume in the certain section as corresponding to user voice (col. 6 line 65 - col. 7 line 13, where the dynamic energy range being below a predetermined threshold is used to indicate a non-speech frame).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin in view of Krishnaswamy by using the energy range of Gorodetski (Gorodetski col. 6 line 65 - col. 7 line 13) in the determination of Shin in view of Krishnaswamy (Shin para [0075]), in order to filter out non-speech frames in a diarization process (Gorodetski col. 6 line 65 - col. 7 line 13).

Regarding claim 20, Shin teaches:
A non-transitory computer-readable storage medium in which a computer program executed by a computer is stored (para [0149], where a computer readable medium is used), wherein the computer program is configured to: 
separate a sound signal into a plurality of sound source signals (para [0069], where an audio signal is separated into multiple audio signals); and 
Shin does not teach:
identify a sound source signal to have characteristics satisfying the at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals,
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section and
the processor is configured to identify the sound signal which has the uneven volume in the certain section as corresponding to user voice.
Krishnaswamy teaches:
identify a sound source signal to have characteristics satisfying the at least one predefined condition as corresponding to a user voice from among the plurality of sound source signals (para [0064], where speaker recognition and diarisation are performed to find the true caller).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin by using the speaker recognition of Krishnaswamy (Krishnaswamy para [0064]) to differentiate the speakers of Shin (Shin para [0075]) in order to find the true caller from multiple speakers (Krishnaswamy para [0064]).
Gorodetski teaches:
wherein the at least one predefined condition includes a predefined condition to identify whether each of the plurality of sound source signals has an uneven volume in a certain section (col. 6 line 65 - col. 7 line 13, where the dynamic energy range is used to determine speech and non-speech frames) and
the processor is configured to identify the sound signal which has the uneven volume in the certain section as corresponding to user voice (col. 6 line 65 - col. 7 line 13, where the dynamic energy range being below a predetermined threshold is used to indicate a non-speech frame).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin in view of Krishnaswamy by using the energy range of Gorodetski (Gorodetski col. 6 line 65 - col. 7 line 13) in the determination of Shin in view of Krishnaswamy (Shin para [0075]), in order to filter out non-speech frames in a diarization process (Gorodetski col. 6 line 65 - col. 7 line 13).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Shin in view of Krishnaswamy and Gorodetski, and further in view of Kawamura et al. (US 2018/0124255 A1), hereinafter referred to as Kawamura.

Regarding claim 4, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 2, wherein
Shin in view of Krishnaswamy and Gorodetski does not teach:
the processor is configured to identify a sound source signal which has not the uneven volume in the certain section as a speaker output voice that is output from a speaker.
Kawamura teaches:
the processor is configured to identify a sound source signal which has not the uneven volume in the certain section as a speaker output voice that is output from a speaker (para [0116], where a level of volume remaining above a threshold corresponds to the audio being output from a speaker).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shin in view of Krishnaswamy and Gorodetski by using the terminal of Kawamura (Kawamura para [0116]) to determine the sound source signal of Shin in view of Krishnaswamy and Gorodetski (Shin para [0069]), in order to prevent an outgoing call in a period where audio is being outputted (Kawamura para [0114]).

Claims 5-6 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Shin in view of Krishnaswamy and Gorodetski, and further in view of Lund et al. (Lund, T., & Skovenborg, E. (2014). Loudness vs. Speech Normalization in Broadcast. SMPTE Motion Imaging Journal, 123(5), 44-51.), hereinafter referred to as Lund, and further in view of Block et al. (US 10,004,110 B2), hereinafter referred to as Block.

Regarding claim 5, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, wherein
Shin in view of Krishnaswamy and Gorodetski does not teach:
the characteristics include lufs or lkfs values, and 
the at least one predefined condition includes a predefined condition indicating a rate of change of the lufs or lkfs values for identifying the sound source signal as corresponding to a user voice.  
Lund teaches:
the characteristics include lufs or lkfs values (Fig. 1, where LUFS or LKFS are used as the measurement unit), and
The prior art of Shin in view of Krishnaswamy and Gorodetski contained a device which differed from the claimed device by the substitution of some elements (LUFS, LKFS) with other elements (Volume). The substituted elements and their functions were known in the art, as taught by Lund (Lund Fig. 1). One of ordinary skill in the art could have substituted one known element for another, and the results of the substitution would have been predictable.
Block teaches:
the at least one predefined condition includes a predefined condition indicating a rate of change of the lufs or lkfs values for identifying the sound source signal as corresponding to a user voice (col. 3 line 59 - col. 4 line 30, where the rate of change of the signal energy level is used to distinguish speech from steady-state background noise).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin in view of Krishnaswamy, Gorodetski, and Lund by using the measurements of Block (Block col. 3 line 59 - col. 4 line 30) in the determination of Shin in view of Krishnaswamy, Gorodetski, and Lund (Shin para [0075]), in order to distinguish noise or silence from speech (Block col. 3 line 59 - col. 4 line 30).

Regarding claim 6, Shin in view of Krishnaswamy, Gorodetski, Lund, and Block teaches:
The electronic device of claim 5, wherein 
the characteristics include zero crossing rate (ZCR) and volume (Shin para [0075], [0104-105], where zero crossing rate and energy of the voice are used), and
the at least one predefined condition includes a predefined condition indicating the ZCR is lower than a first threshold and an average volume is greater than a second threshold for identifying the sound source signal as corresponding to a user voice (Krishnaswamy para [0069], where energy higher than a threshold and lower ZCR indicates speech).  
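For illustration only, the two-threshold condition of claim 6 (ZCR below a first threshold, average volume above a second threshold) can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the method of Shin, Krishnaswamy, or the application: the function names and both threshold values are hypothetical, and average absolute amplitude is used as a stand-in for "average volume".

```python
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(samples) - 1)

def average_volume(samples):
    """Mean absolute amplitude, used here as an assumed proxy for average volume."""
    return sum(abs(x) for x in samples) / len(samples)

def looks_like_voice(samples, zcr_max=0.1, volume_min=0.05):
    """Hypothetical condition: ZCR below a first threshold AND volume above a second."""
    return zero_crossing_rate(samples) < zcr_max and average_volume(samples) > volume_min

# Synthetic signals for demonstration (sample rate of 8000 Hz is assumed).
loud_low_freq = [0.5 * math.sin(2 * math.pi * 200 * n / 8000 + 0.1) for n in range(800)]
loud_high_zcr = [0.5 if n % 2 == 0 else -0.5 for n in range(800)]
quiet_low_freq = [0.001 * x for x in loud_low_freq]
```

Only the first signal satisfies both thresholds in this sketch: the second fails the ZCR condition and the third fails the volume condition.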

Regarding claim 8, Shin in view of Krishnaswamy, Gorodetski, Lund, and Block teaches:
The electronic device of claim 6, wherein 
the characteristics include information on a begin of speech (BOS) and an end of speech (EOS) by voice activity detection (VAD) (Shin para [0104-105], where a starting and end point of the voice signal are determined), and
the at least one predefined condition includes a predefined condition indicating that the begin of speech and the end of speech are detected in the sound source signal for identifying the sound source signal as corresponding to a user voice (Shin para [0104-105], where a starting and end point of the voice signal are determined).  

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Shin in view of Krishnaswamy and Gorodetski, and further in view of Chan et al. (US 2010/0046770 A1), hereinafter referred to as Chan.

Regarding claim 10, Shin in view of Krishnaswamy and Gorodetski teaches:
The electronic device of claim 1, further comprising:
Shin in view of Krishnaswamy and Gorodetski does not teach:
a preprocessor configured to remove echo and noise from the sound signal prior to the sound signal being separated into the plurality of sound source signals.
Chan teaches:
a preprocessor configured to remove echo and noise from the sound signal prior to the sound signal being separated into the plurality of sound source signals (para [0041], where pre-processing includes echo cancellation and noise reduction).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Shin in view of Krishnaswamy and Gorodetski by using the pre-processing of Chan (Chan para [0041]) in the processing of Shin in view of Krishnaswamy and Gorodetski (Krishnaswamy para [0064]), in order to increase intelligibility of a received signal in a noisy environment (Chan para [0036]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2013/0083193 A1 para [0132] teaches changing a volume of voice output from speakers depending on user attributes.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
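For illustration only, the date arithmetic described in the preceding paragraph can be sketched as follows. This is a minimal sketch under stated assumptions, not an official calculation: the function names are hypothetical, the example dates are invented, month-end dates are clamped to the last day of the target month as an assumption, and extensions of time under 37 CFR 1.136(a) as well as weekend and holiday adjustments are not modeled.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Add calendar months, clamping to the last day of the target month (assumption)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def shortened_period_expiry(mailing, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period as described in the paragraph above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    expiry = three_months
    replied_within_two = first_reply is not None and first_reply <= add_months(mailing, 2)
    advisory_is_late = advisory_mailed is not None and advisory_mailed > three_months
    if replied_within_two and advisory_is_late:
        # The period runs to the advisory action's mailing date.
        expiry = advisory_mailed
    # In no event later than six months from the date of the final action.
    return min(expiry, six_months)

# Hypothetical mailing date, used only for demonstration.
mailing = date(2022, 8, 1)
```

With the hypothetical mailing date above, a first reply on 2022-09-15 followed by an advisory action mailed on 2022-11-20 would move the expiry to 2022-11-20 in this sketch, while a reply filed after the two-month mark would leave it at 2022-11-01.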
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571)270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRYAN S BLANKENAGEL/
Primary Examiner, Art Unit 2658