DETAILED ACTION

Introduction
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
2.	The information disclosure statements (IDS) submitted on 10/06/2020 and 12/09/2020 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Claim Rejections - 35 USC § 102
3.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

4.	Claims 1 and 2 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Fukuda et al. (“Detecting breathing sound in realistic Japanese telephone conversations and its application to automatic speech recognition,” SPEECH COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 98, 2 February 2018 (2018-02-02), pages 95-103).

	With respect to Claim 1, Fukuda et al. disclose 
 	A method of processing a voice audio signal, the method comprising: 
 	receiving, at an electronic device, a voice audio signal (Fukuda et al. page 96 left col. In experiments with realistic Japanese telephone conversations, we found that the GMM-based method alone had many false alarm of breath-event detection while our proposed method could significantly reduce the false alarms and achieved an 80.3% error reduction over the best conventional method); 
 	identifying spoken phrases within the voice audio signal based on the detection of voice activity or inactivity (Fukuda page 101 left col. We also show that using breath-event information as delimiters of ASR inputs in conjunction with voice activity detection (VAD) yield a 3.8% error reduction in character error rate, page 102 Perform VAD to divide input signal to multiple speech segments); 
 	dividing the voice audio signal into a plurality of segments based on the identified spoken phrases (Fukuda page 102 Perform VAD to divide input signal to multiple speech segments), and 
 	in accordance with a determination that a selected segment of the plurality of segments has a duration, Tseg, longer than a threshold duration, Tthresh (Fukuda et al. page 1 right col. The re-split process is performed only when speech segments detected by VAD are longer than N seconds because there is no need to split short speech segments in terms of ASR decoding process, page 2
	Perform VAD to divide input signal to multiple speech segments
	for all speech segments do
		pick speech segment i;
		if speech segment i is shorter than N seconds then
			go to next speech segment (no processing);
		end if
		detect breath events in speech segment i;
		if breath events exist then
			split speech segment i with detected breath events;
		end if
	end for)
 	identifying a most likely location of a breath in the audio associated with the selected segment (Fukuda page 102 detect breath events in speech segment i); and 
 	dividing the selected segment into sub-segments based on the identified most likely location of a breath (Fukuda page 102  split speech segment i with detected breath events;). 
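
For illustration only, the re-split procedure quoted above from Fukuda et al. may be sketched as follows. This sketch is not taken from the reference: the function name `resplit_segments`, the `(start, end)` tuple representation of a segment, and the caller-supplied `detect_breath_events` callback are hypothetical, and the choice to drop the breath-event interval itself from the resulting sub-segments is an assumption, since the quoted passage does not say how the split boundaries are placed.

```python
from typing import Callable, List, Tuple

# A segment is (start_seconds, end_seconds); this representation is an assumption.
Segment = Tuple[float, float]


def resplit_segments(
    segments: List[Segment],
    n_seconds: float,
    detect_breath_events: Callable[[Segment], List[Segment]],
) -> List[Segment]:
    """Re-split VAD segments of at least n_seconds at detected breath events."""
    result: List[Segment] = []
    for start, end in segments:
        # Segments shorter than N seconds are passed through (no processing).
        if end - start < n_seconds:
            result.append((start, end))
            continue
        breaths = sorted(detect_breath_events((start, end)))
        if not breaths:
            # No breath event found: keep the long segment as it is.
            result.append((start, end))
            continue
        cursor = start
        for b_start, b_end in breaths:
            if b_start > cursor:
                result.append((cursor, b_start))
            # Assumption: the breath event itself is excluded from the output.
            cursor = max(cursor, b_end)
        if cursor < end:
            result.append((cursor, end))
    return result


if __name__ == "__main__":
    # Hypothetical detector that reports one breath event in the long segment.
    def fake_detector(seg: Segment) -> List[Segment]:
        return [(7.0, 7.4)] if seg == (3.0, 13.0) else []

    print(resplit_segments([(0.0, 2.0), (3.0, 13.0)], 5.0, fake_detector))
```

In this sketch the 2-second segment is left untouched, while the 10-second segment is divided into two sub-segments on either side of the detected breath event.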

	With respect to Claim 2, Fukuda et al. disclose 


Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.
6.	Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. in view of Mostafasi et al. (US 2010/0061596 A1).

	With respect to Claim 11, Fukuda et al. disclose all the limitations of Claim 1, upon which Claim 11 depends. Fukuda et al. fail to explicitly teach 
 	wherein identifying a most likely location of a breath in the audio associated with the selected segment comprises identifying a minimum signal energy in the audio associated with the selected segment.  
	However, Mostafasi et al. teach 
 	wherein identifying a most likely location of a breath in the audio associated with the selected segment comprises identifying a minimum signal energy in the audio associated with the selected segment (Mostafasi et al. [0057] the time series of motion energies may be used to detect time points at which the patient's motion is the least, which correspond with exhale and inhale phases of breathing but without necessarily knowing whether it is inhale or exhale. The processor then determines the time spacing ΔP between the two time points. Next, an image frame is obtained at a time that motion energy reaches a minimum indicating exhale or inhale end of breathing cycle (step 1312).)
 	Fukuda et al. and Mostafasi et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et al., using the teaching of minimum energy as taught by Mostafasi et al., for the benefit of identifying the breathing cycle (Mostafasi et al. [0057] the time series of motion energies may be used to detect time points at which the patient's motion is the least, which correspond with exhale and inhale phases of breathing but without necessarily knowing whether it is inhale or exhale. The processor then determines the time spacing ΔP between the two time points. Next, an image frame is obtained at a time that motion energy reaches a minimum indicating exhale or inhale end of breathing cycle (step 1312).)
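
For illustration only, locating a minimum of signal energy within an audio segment, as recited in Claim 11, may be sketched as follows. This sketch is not drawn from either reference: the function names, the use of non-overlapping frames, and the sum-of-squares energy measure are all assumptions made for the example.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Sum of squared samples for each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def min_energy_frame(samples: Sequence[float], frame_len: int) -> int:
    """Index of the frame with the lowest energy (first one on ties)."""
    energies = frame_energies(samples, frame_len)
    if not energies:
        raise ValueError("segment is shorter than one frame")
    return min(range(len(energies)), key=energies.__getitem__)


if __name__ == "__main__":
    # Loud frame, quiet frame, moderately loud frame (synthetic values).
    segment = [1.0] * 4 + [0.1] * 4 + [0.8] * 4
    print(min_energy_frame(segment, frame_len=4))
```

The returned frame index, multiplied by the frame length and divided by the sample rate, would give a candidate time for the breath under these assumptions.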

7.	Claims 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. in view of Mostafasi et al. (US 2010/0061596 A1) and further in view of Mather et al. (US 5,143,078).

	With respect to Claim 12, Fukuda et al. in view of Mostafasi et al. teach all the limitations of Claim 11 upon which Claim 12 depends. Fukuda et al. in view of Mostafasi et al. fail to explicitly teach 
 	wherein identifying a minimum signal energy comprises: searching the audio associated with the selected segment for the minimum signal energy within a moving time window.  
	However, Mather et al. teach 
 	wherein identifying a minimum signal energy comprises: searching the audio associated with the selected segment for the minimum signal energy within a moving time window (Mather et al. col. 3 lines 48-65, disclosing a method of calculating the root-mean-square energy of the signal in a moving time window in order to identify breath sounds).
 	Fukuda et al., Mostafasi et al. and Mather et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et 

	With respect to Claim 13, Fukuda et al. in view of Mostafasi et al. and Mather et al. teach 
 	wherein the duration of the time window is in the range 250-500ms (Fukuda et al. Fig. 1 and Fig. 2, page 96 left col. In this paper, we define “breathing sound” as an interval that contains only the sound of breath (300ms long in Fig. 1) and “breath event” as a portion consisting of both the breathing sound and silences before and after the breathing sound (380ms in Fig. 1).)
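
For illustration only, a search for the minimum root-mean-square (RMS) energy within a moving time window, as recited in Claims 12 and 13, may be sketched as follows. The sketch is not taken from Mather et al. or Fukuda et al.: the function name, the 10 ms hop, and the 300 ms default window (chosen only because it falls within the claimed 250-500 ms range) are assumptions.

```python
import math
from typing import Sequence, Tuple


def min_rms_window(
    samples: Sequence[float],
    sample_rate: int,
    window_ms: float = 300.0,  # assumed default, inside the claimed 250-500 ms range
    hop_ms: float = 10.0,      # assumed step between successive window positions
) -> Tuple[float, float]:
    """Return (start_time_s, rms) of the moving window with the lowest RMS energy."""
    win = int(sample_rate * window_ms / 1000.0)
    hop = max(1, int(sample_rate * hop_ms / 1000.0))
    if win <= 0 or len(samples) < win:
        raise ValueError("segment is shorter than one window")
    best_start, best_rms = 0, math.inf
    for start in range(0, len(samples) - win + 1, hop):
        window = samples[start:start + win]
        rms = math.sqrt(sum(x * x for x in window) / win)
        if rms < best_rms:  # strict comparison keeps the earliest minimum
            best_start, best_rms = start, rms
    return best_start / sample_rate, best_rms


if __name__ == "__main__":
    # Synthetic signal at 1 kHz: 1 s of "speech", 0.3 s of silence, 1 s of "speech".
    signal = [1.0] * 1000 + [0.0] * 300 + [1.0] * 1000
    print(min_rms_window(signal, sample_rate=1000))
```

Recomputing the RMS for every window position is quadratic in the worst case; a running sum of squares would be the usual optimization, omitted here to keep the sketch short.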

8.	Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. in view of Wendling et al. (US 2020/0204895 A1).

 	With respect to Claim 14, Fukuda et al. disclose all the limitations of Claim 1, upon which Claim 14 depends. Fukuda et al. fail to explicitly teach
 	further comprising applying a low-pass filter to the audio associated with the selected segment prior to identifying a most likely location of a breath.  
	However, Wendling et al. teach
 	further comprising applying a low-pass filter to the audio associated with the selected segment prior to identifying a most likely location of a breath (Wendling et al. [0078] Breathing can be detected using low pass filtering and comparing it to a wide band signal. Since the harmonics of speech have less power, more power will be detected in the low pass filter as compared to the high pass filter for speech. Speech vs. breathing can be detected because if the signal is speech, the wide band will be not much more or similar to the low band pass. If breathing, the wide band will have substantially greater power than the low band pass.)
 	Fukuda et al. and Wendling et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et al., using the teaching of low-pass filtering as taught by Wendling et al., for the benefit of distinguishing breathing from speech (Wendling et al. [0078] Breathing can be detected using low pass filtering and comparing it to a wide band signal. Since the harmonics of speech have less power, more power will be detected in the low pass filter as compared to the high pass filter for speech. Speech vs. breathing can be detected because if the signal is speech, the wide band will be not much more or similar to the low band pass. If breathing, the wide band will have substantially greater power than the low band pass.)
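
For illustration only, the comparison described in the quoted passage of Wendling et al. (low-pass power versus wide-band power) may be sketched as follows. Nothing in this sketch is taken from the reference beyond that comparison: the first-order filter, the 1000 Hz cutoff, the ratio threshold of 4.0, and all function names are assumptions.

```python
import math
from typing import List, Sequence


def one_pole_lowpass(samples: Sequence[float], sample_rate: float, cutoff_hz: float) -> List[float]:
    """First-order IIR low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = dt / (rc + dt)
    out: List[float] = []
    y = 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


def mean_power(samples: Sequence[float]) -> float:
    """Average of the squared samples."""
    return sum(x * x for x in samples) / len(samples)


def looks_like_breath(
    samples: Sequence[float],
    sample_rate: float,
    cutoff_hz: float = 1000.0,     # assumed cutoff
    ratio_threshold: float = 4.0,  # assumed decision threshold
) -> bool:
    """True when wide-band power is substantially greater than low-pass power."""
    low = mean_power(one_pole_lowpass(samples, sample_rate, cutoff_hz))
    wide = mean_power(samples)
    if low == 0.0:
        return wide > 0.0
    return wide / low > ratio_threshold


if __name__ == "__main__":
    sr = 16000
    low_tone = [math.sin(2 * math.pi * 100 * n / sr) for n in range(1600)]
    high_tone = [math.sin(2 * math.pi * 6000 * n / sr) for n in range(1600)]
    print(looks_like_breath(low_tone, sr), looks_like_breath(high_tone, sr))
```

A signal whose power sits mostly below the cutoff yields a ratio near one and is treated as speech-like, whereas a signal dominated by higher frequencies yields a large ratio and is treated as breath-like under these assumed settings.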

9.	Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. in view of Muyal et al. (US 2018/0308524 A1).

 	With respect to Claim 15, Fukuda et al. disclose all the limitations of Claim 1, upon which Claim 15 depends. Fukuda et al. fail to explicitly teach
 	further comprising displaying, on a display of the electronic device, a visual representation of the segments and sub-segments.  
	However, Muyal et al. teach 
 	further comprising displaying, on a display of the electronic device, a visual representation of the segments and sub-segments (Muyal et al. [0068] Step 1140 discloses displaying the video file while each section of the sections of the text is displayed between the start time code and end time code associated with the identified breathing points. This way, the presenter can view the text associated with each scene during the entire scene. For example, the second scene is associated with the text of the second section, from 4.5 to 6.2 seconds from the beginning of the video file.)
 	Fukuda et al. and Muyal et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et al., using the teaching of displaying the sections as taught by Muyal et al., for the benefit of enabling the presenter to view the text associated with each scene during the entire scene (Muyal et al. [0068] Step 1140 discloses displaying the video file while each section of the sections of the text is displayed between the start time code and end time code associated with the identified breathing points. This way, the presenter can view the text associated with each scene during the entire scene. For example, the second scene is associated with the text of the second section, from 4.5 to 6.2 seconds from the beginning of the video file.)
	
10.	Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. in view of Kim (US 2019/0180738 A1).

 	With respect to Claim 16, Fukuda et al. disclose 
	 receive, at an electronic device, a voice audio signal (Fukuda et al. page 96 left col. In experiments with realistic Japanese telephone conversations, we found that the GMM-based method alone had many false alarm of breath-event detection while our proposed method could significantly reduce the false alarms and achieved an 80.3% error reduction over the best conventional method); 
 	identify spoken phrases within the voice audio signal based on the detection of voice activity or inactivity (Fukuda page 101 left col. We also show that using breath-event information as delimiters of ASR inputs in conjunction with voice activity detection (VAD) yield a 3.8% error reduction in character error rate, page 102 Perform VAD to divide input signal to multiple speech segments); 
 	divide the voice audio signal into a plurality of segments based on the identified spoken phrases (Fukuda page 102 Perform VAD to divide input signal to multiple speech segments), and
 	in accordance with a determination that a selected segment of the plurality of segments has a duration, Tseg, longer than a threshold duration, Tthresh (Fukuda et al. page 1 right col. The re-split process is performed only when speech segments detected by VAD are longer than N seconds because there is no need to split short speech segments in terms of ASR decoding process, page 2
	Perform VAD to divide input signal to multiple speech segments
	for all speech segments do
		pick speech segment i;
		if speech segment i is shorter than N seconds then
			go to next speech segment (no processing);
		end if
		detect breath events in speech segment i;
		if breath events exist then
			split speech segment i with detected breath events;
		end if
	end for)
 identify a most likely location of a breath in the audio associated with the selected segment (Fukuda page 102 detect breath events in speech segment i); and
 	divide the selected segment into sub-segments based on the identified most likely location of a breath (Fukuda page 102  split speech segment i with detected breath events;). 
	Fukuda et al. fail to explicitly teach using a computer program stored on a non-transitory, processor readable storage medium, the computer program configured to perform audio processing. 
	However, Kim teaches 
 	A computer program stored on a non-transitory, processor readable storage medium, the computer program configured to (Kim [0132] the processor 140, [0133] a non-transitory computer readable medium in which a program for performing an audio signal processing method is stored may be installed, [0117] For example, the portable device 100 or 400 may detect a breathing sound of the user from the audio signals received through the first and second microphones and determine the user utterance distance depending on whether the breathing sound is detected.)
 	Fukuda et al. and Kim are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et al., using the teaching of the non-transitory computer readable medium as taught by Kim, for the benefit of detecting the breathing sound of the user (Kim [0133] a non-transitory computer readable medium in which a program for performing an audio signal processing method is stored may be installed, [0117] For example, the portable device 100 or 400 may detect a breathing sound of the user from the audio signals received through the first and second microphones and determine the user utterance distance depending on whether the breathing sound is detected.)

 	With respect to Claim 17, Fukuda et al. disclose 
 	receive, at an electronic device, a voice audio signal (Fukuda et al. page 96 left col. In experiments with realistic Japanese telephone conversations, we found that the GMM-based method alone had many false alarm of breath-event detection while our proposed method could significantly reduce the false alarms and achieved an 80.3% error reduction over the best conventional method); 
 	identify spoken phrases within the voice audio signal based on the detection of voice activity or inactivity (Fukuda page 101 left col. We also show that using breath-event information as delimiters of ASR inputs in conjunction with voice activity detection (VAD) yield a 3.8% error reduction in character error rate, page 102 Perform VAD to divide input signal to multiple speech segments); 
 	divide the voice audio signal into a plurality of segments based on the identified spoken phrases (Fukuda page 102 Perform VAD to divide input signal to multiple speech segments), and
 	in accordance with a determination that a selected segment of the plurality of segments has a duration, Tseg, longer than a threshold duration, Tthresh (Fukuda et al. page 1 right col. The re-split process is performed only when speech segments detected by VAD are longer than N seconds because there is no need to split short speech segments in terms of ASR decoding process, page 2
	Perform VAD to divide input signal to multiple speech segments
	for all speech segments do
		pick speech segment i;
		if speech segment i is shorter than N seconds then
			go to next speech segment (no processing);
		end if
		detect breath events in speech segment i;
		if breath events exist then
			split speech segment i with detected breath events;
		end if
	end for)
 	identify a most likely location of a breath in the audio associated with the selected segment (Fukuda page 102 detect breath events in speech segment i); and
 	divide the selected segment into sub-segments based on the identified most likely location of a breath (Fukuda page 102  split speech segment i with detected breath events;). 
 	Fukuda et al. fail to explicitly teach a non-transitory, computer readable medium comprising instructions which, when executed by a computer, cause the computer to perform audio processing for detecting a breath. 
	However, Kim teaches 
 	A non-transitory, computer readable medium comprising instructions which, when executed by a computer, cause the computer to: (Kim [0132] the processor 140, [0133] a non-transitory computer readable medium in which a program for performing an audio signal processing method is stored may be installed, [0117] For example, the portable device 100 or 400 may detect a breathing sound of the user from the audio signals received through the first and second microphones and determine the user utterance distance depending on whether the breathing sound is detected.)
 	Fukuda et al. and Kim are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of re-splitting the selected segment as taught by Fukuda et al., using the teaching of the non-transitory computer readable medium as taught by Kim, for the benefit of detecting the breathing sound of the user (Kim [0133] a non-transitory computer readable medium in which a program for performing an audio signal processing method is stored may be installed, [0117] For example, the portable device 100 or 400 may detect a breathing sound of the user from the audio signals received through the first and second microphones and determine the user utterance distance depending on whether the breathing sound is detected.)

Allowable Subject Matter
11.	Claim 3 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claims 4-10 are objected to by virtue of their dependency.


Conclusion
12.	The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. See PTO-892.
a.	Nakagome et al. (US 2019/0392840 A1.) In this reference, Nakagome et al. disclose a method of determining a breathing period. 
b.	Shi et al. (US 2018/0374496 A1.) In this reference, Shi et al. disclose a method for detecting a breath sound. 
c. 	Wosk et al. (US 2017/0186446 A1.) In this reference, Wosk et al. disclose a method for detecting a breath. 

13.  	Any inquiry concerning this communication or earlier communications from the examiner should be directed to THUYKHANH LE whose telephone number is (571)272-6429. The examiner can normally be reached Mon-Fri: 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew C. Flanders, can be reached at 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.