DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Takahashi in view of Hayashi
Claims 1-7, 10-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi (USPubN 2018/0268866) in view of Hayashi (USPubN 2007/0160349).
As per claim 1, Takahashi teaches an image capture device comprising: a housing(“Cameras of such types as life-log cameras or action cameras have been widely used in the fields of sports and the like in recent years” in Para.[0002], “ a hardware configuration of an information processing apparatus according to the present embodiment …  the information processing apparatus 900 illustrated in FIG. 17 may realize the moving image generation device 1 illustrated in FIGS. 1, 11, and 16” in Para.[0120]); 

an optical element carried by the housing and configured to guide light within a field of view to the image sensor(“An imaging device 915 includes a lens system composed of an imaging lens, an iris, a zoom lens, a focus lens and the like, a driving system for causing the lens system to perform a focus operation and a zoom operation” in Para.[0131]); 
a sound sensor carried by the housing and configured to generate a sound output signal conveying audio information based on sound received by the sound sensor, the audio information defining audio content(“The sound input device 917 has a microphone, a microphone amplifier circuit that amplifies sound signals obtained from the microphone, an A/D converter, a signal processing circuit that performs processes of noise removal, sound source separation, and the like on sound data, and the like. The sound input device 917 outputs sound signals converted into digital signals” in Para.[0132]); and 
one or more physical processors carried by the housing, the one or more physical processors configured by machine-readable instructions to(“The CPU 901 functions as an arithmetic processing device and a control device and controls the overall operation in the information processing apparatus 900 according to various programs” in Para.[0122]): 
capture the visual content during a visual capture duration, the visual capture duration extending from a visual capture start point to a visual capture end point(“the video generation 200 generates a fast reproduction video” in Para.[0088]. The generated video has a start point and an end point, and its duration is the length between the two, as is well known in the art.); 
capture the audio content during a first audio capture duration based on activation of an audio content capture option of the image capture device, the first audio capture duration extending from a first audio capture start point to a first audio capture end point, the first audio capture duration being shorter than the visual capture duration(“the sound generation unit 300 generates a shortened sound from the original sound. Specifically, the division unit 310 divides the sound part of the moving image into one or more sections in Step S106. Next, the extraction unit 320 extracts one or more sound segments from the sections obtained by division of the division unit 310 in Step S108. Then, the connecting unit 330 connects the extracted sound segments in Step S110” in Para.[0089], “The division unit 310 divides the original sound and outputs division information that is information indicating division points. As illustrated in FIG. 3, the division unit 310 functions as an utterance sound section division unit 311, an environmental sound section division unit 313, and a feature amount change division unit 315” in Para.[0055], “The extraction unit 320 has a function of extracting one or more of sound segments from a part of a sound part of an input moving image. For example, the extraction unit 320 may determine a section to be thinned out among sections obtained by division of the division unit 310 and a section to be extracted as a section to be used for a shortened sound. Since the sound of some sections of the original sound is used for a shortened sound, a length of sound to be used is shortened in comparison to a case in which the whole sound is used for a shortened sound” in Para.[0065]. There are at least three audio content capture options: the utterance sound section, the environmental sound section, and the feature amount change section. Any one of these sections can be activated to capture the sound data.); and 
generate video content of a time-lapse video, the video content of the time-lapse video including the captured visual content and the captured audio content, wherein the captured visual 400 synthesizes the fast reproduction video and the shortened sound in Step S112” in Para.[0090], “the output control unit 500 controls the output unit 20 such that a fast reproduction moving image is output in Step S114” in Para.[0091], “an event sound 820 that is a driving sound of a car is assumed to be extracted by the extraction unit 320 as a sound segment corresponding to a video 810. The connecting unit 330 sets, for example, a time of a video 811 at which the car gradually appearing in the video 810 is assumed to be closest to a camera to match a time at which the volume of the event sound 820 is the maximum. At that time, the connecting unit 330 may generate and connect an event sound 830 obtained by cutting out a waveform and performing a fade process thereon so that a driving sound starts smoothly from around the time at which the object disappears from the video and fades away thereafter. Accordingly, the time at which the volume of the driving sound of the car is the maximum matches the time at which the car is at the closest position in the video” in Para.[0114]).
Takahashi is silent about the activation of the audio content capture option separately prompting the image capture device to capture the audio content from capture of the visual content.
Hayashi teaches the activation of the audio content capture option separately prompting the image capture device to capture the audio content from capture of the visual content(“the check screen on which it is checked if the correct setting is performed according to the process of FIG. 3. FIG. 7 shows the time bands with and without a necessity of recording separately for image or sound. FIG. 8 shows the time bands with and without a necessity of image recording or audio recording” in Para.[0055]).

As per claim 2, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the first audio capture start point coincides with the visual capture start point and the first audio capture end point precedes the visual capture end point(“the extraction unit 320 may first extract an event sound. An event sound refers to a sound corresponding to an event that has occurred during capture of a moving image. An event sound may be, for example, a short utterance sound such as “wow!” or “we are arriving at OO” among utterance sounds. The extraction unit 320, for example, may extract an event sound from an utterance section when the sound of a registered word has been recognized with reference to an extraction rule DB in which words to be extracted are registered. Accordingly, a viewer can hear the content of a short utterance. In addition, an event sound may be a sudden sound such as a sound of a firework, a single burst sound of a car horn, a sound of a car passing by, a single striking sound, or a bursting sound among environmental sounds. The extraction unit 320 may extract an event sound from non-utterance sections when a registered environmental sound has been recognized, for example, with reference to an environmental sound DB in which environmental sounds to be extracted are registered” in Para.[0067]. The event sound can be located at the beginning, middle, or end of the generated video, as is well known in the art.).
As per claim 3, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the first audio capture start point follows the visual capture start point and the first audio capture end point coincides with the visual capture end point(Para.[0067]. The event sound can be located at the beginning, middle, or end of the generated video, as is well known in the art.).
As per claim 4, Takahashi and Hayashi teach all of the limitations of claim 1. 

As per claim 5, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the audio content is captured further based on identification of a depiction of interest within the visual content(Para.[0067], [0068]).
As per claim 6, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the audio content is captured further based on identification of a sound of interest within the audio content(Para.[0067], [0068]).
As per claim 7, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion is mixed with the second captured audio content portion to provide the audio for playback of at least some of the video frames(“FIG. 5 is a diagram for describing an example of an extraction process performed by the extraction unit 320 according to the present embodiment. For example, an original sound 600 is assumed to be separated into the sound of two scenes that are a scene 611 and a scene 612. In addition, the original sound 600 is assumed to include utterance sections 621 and 622. Note that section lines of the original sound 600 are assumed to mean division points set by the division unit 310. First, the extraction unit 320 extracts an environmental sound 630 (including reference numerals 631 to 635) that is a non-utterance section. Then, the extraction unit 320 extracts an event 640. An event sound 641 is a sudden environmental sound of a firework or the like. An event sound 642 is a short utterance sound such as “we are arriving at OO.”” in Para.[0068]).
As per claim 10, Takahashi and Hayashi teach all of the limitations of claim 1. 
Takahashi teaches wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion provides the audio for playback of a first subset of the video frames and the second captured audio content portion provides the audio for playback of a second subset of the video frames(Para.[0068], Fig. 6, “the connecting unit 330 may continuously use the same environmental sound as long as a scene thereof is not changed. In addition, the connecting unit 330 may generate a shortened sound by dividing and synthesizing sections classified as the same scene as illustrated in FIG. 7. FIG. 7 is a diagram for describing an example of a connecting process performed by the connecting unit 330 according to the present embodiment. In the example illustrated in FIG. 7, the connecting unit 330 synthesizes sound segments extracted from a scene 611 and sound segments extracted from a scene 612 of the original sound 600 and connects the synthesized sound segments, and thereby generates the shortened sound 660” in Para.[0084]).
As per claim 11, Takahashi teaches a method for generating videos, the method performed by an image capture device including one or more processors, an image sensor, and a sound sensor(“Imaging device, sound input device, cpu” in Fig. 17). The other limitations of claim 11 have been discussed in the rejection of claim 1 and are rejected under the same rationale.
As per claim 12, the limitations of claim 12 have been discussed in the rejection of claim 2 and are rejected under the same rationale.
As per claim 13, the limitations of claim 13 have been discussed in the rejection of claim 3 and are rejected under the same rationale.
As per claim 14, the limitations of claim 14 have been discussed in the rejection of claim 4 and are rejected under the same rationale.
As per claim 15, the limitations of claim 15 have been discussed in the rejection of claim 5 and are rejected under the same rationale.
As per claim 16, the limitations of claim 16 have been discussed in the rejection of claim 6 and are rejected under the same rationale.
As per claim 17, the limitations of claim 17 have been discussed in the rejection of claim 7 and are rejected under the same rationale.
As per claim 20, the limitations of claim 20 have been discussed in the rejection of claim 10 and are rejected under the same rationale.
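For illustration only, the timing relationships recited in claims 1-4 (and mirrored in claims 11-14) can be expressed as a small check on two capture windows. This is a minimal sketch under stated assumptions: the names Interval and classify_audio_window are hypothetical, the time values are arbitrary seconds, and nothing here is taken from Takahashi, Hayashi, or the application's disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A capture window with a start point and an end point (hypothetical type)."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def classify_audio_window(visual: Interval, audio: Interval) -> str:
    """Name the dependent claim whose timing relationship the windows match.

    Assumes the audio window lies within the visual window and is shorter
    than it, as recited in claim 1; otherwise a descriptive label is returned.
    """
    if audio.start < visual.start or audio.end > visual.end:
        return "outside visual capture"
    if audio.duration >= visual.duration:
        return "not shorter than visual capture"
    if audio.start == visual.start:
        # Start points coincide; audio end precedes visual end.
        return "claim 2"
    if audio.end == visual.end:
        # Audio start follows visual start; end points coincide.
        return "claim 3"
    # Audio start follows visual start and audio end precedes visual end.
    return "claim 4"


visual = Interval(0.0, 60.0)
print(classify_audio_window(visual, Interval(0.0, 10.0)))
print(classify_audio_window(visual, Interval(50.0, 60.0)))
print(classify_audio_window(visual, Interval(20.0, 30.0)))
```

The three calls correspond, in order, to an audio window at the beginning, at the end, and in the middle of the visual capture duration.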
	
Takahashi in view of Hayashi and Niemasik
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi (USPubN 2018/0268866) in view of Hayashi (USPubN 2007/0160349), and further in view of Niemasik et al. (USPN 10,372,991; hereinafter Niemasik).
As per claim 8, Takahashi and Hayashi teach all of the limitations of claim 7. 
Takahashi teaches wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first captured audio content portion and the second captured audio content portion(Para.[0068]).
Takahashi and Hayashi are silent about both the first captured audio content portion and the second captured audio content portion not including speech.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Takahashi and Hayashi with the above teachings of Niemasik in order to intelligently select candidate audio and generate edited video files for less resource-intensive transmission in an efficient manner.
As per claim 18, the limitations of claim 18 have been discussed in the rejection of claim 8 and are rejected under the same rationale.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  
Claims 1-20 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 10,827,157.  Although the conflicting claims at issue are not identical, they are not patentably distinct from each other. See the reasons set forth below:
Instant Application No. 17/083,705
U.S. Patent No. 10,827,157
1. An image capture device comprising: a housing; an image sensor carried by the housing and configured to generate a visual output signal conveying visual information based on light that becomes incident thereon, the visual information defining visual content; an optical element carried by the housing and configured to guide light within a field of view 

2. The image capture device of claim 1, wherein the first audio capture start point coincides with the visual capture start point and the first audio capture end point precedes the visual capture end point.

3. The image capture device of claim 1, wherein the first audio capture start point follows the visual capture start point and the first audio capture end point coincides with the visual capture end point.



5. The image capture device of claim 1, wherein the audio content is captured further based on identification of a depiction of interest within the visual content.

6. The image capture device of claim 1, wherein the audio content is captured further based on identification of a sound of interest within the audio content.

7. The image capture device of claim 1, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture 

8. The image capture device of claim 7, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first captured audio content portion and the second captured audio content portion not including speech.

9. The image capture device of claim 8, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first capture audio content portion and the second captured audio content portion having been captured from a same type of location.

10. The image capture device of claim 1, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion provides the audio for playback of a first subset of the video frames and the second captured audio content portion provides the audio for playback of a second subset of the video frames.

11. A method for generating videos, the method performed by an image capture device including one or more processors, an image sensor, and a sound sensor, the image sensor configured to generate a visual output signal conveying visual information based on light 

12. The method of claim 11, wherein the first audio capture start point coincides with the visual capture start point and the first audio capture end point precedes the visual capture end point.

13. The method of claim 11, wherein the first audio capture start point follows the visual capture start point and the first audio capture end point coincides with the visual capture end point.

14. The method of claim 11, wherein the first audio capture start point follows the visual 

15. The method of claim 11, wherein the audio content is captured further based on identification of a depiction of interest within the visual content.

16. The method of claim 11, wherein the audio content is captured further based on identification of a sound of interest within the audio content.

17. The method of claim 11, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio 

18. The method of claim 17, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first captured audio content portion and the second captured audio content portion not including speech.

19. The method of claim 17, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first capture audio content portion and the second captured audio content portion having been captured from a same type of location.

20. The method of claim 11, wherein the audio content is captured further during a second 


2. The image capture device of claim 1, wherein the first audio capture start point coincides with the visual capture start point and the first audio capture end point precedes the visual capture end point.

3. The image capture device of claim 1, wherein the first audio capture start point 

4. The image capture device of claim 1, wherein the first audio capture start point follows the visual capture start point and the first audio capture end point precedes the visual capture end point.

5. The image capture device of claim 1, wherein the audio content is captured further based on identification of a depiction of interest within the visual content.

6. The image capture device of claim 1, wherein the audio content is captured further based on identification of a sound of interest within the audio content.

7. The image capture device of claim 1, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from 

8. The image capture device of claim 7, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first captured audio content portion and the second captured audio content portion not including speech.

9. The image capture device of claim 7, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on 

10. The image capture device of claim 1, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion provides the audio for playback of a first subset of the video frames and the second captured audio content portion provides the audio for playback of a second subset of the video frames.



12. The method of claim 11, wherein the first audio capture start point coincides with the visual capture start point and the first audio 

13. The method of claim 11, wherein the first audio capture start point follows the visual capture start point and the first audio capture end point coincides with the visual capture end point.

14. The method of claim 11, wherein the first audio capture start point follows the visual capture start point and the first audio capture end point precedes the visual capture end point.

15. The method of claim 11, wherein the audio content is captured further based on identification of a depiction of interest within the visual content.

16. The method of claim 11, wherein the audio content is captured further based on identification of a sound of interest within the audio content.

17. The method of claim 11, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion is mixed with the second captured audio content portion to provide the audio for playback of at least some of the video frames.

18. The method of claim 17, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first captured audio content portion and the second captured audio content portion not including speech.

19. The method of claim 17, wherein mixing of the first captured audio content portion and the second captured audio content portion is performed based on both the first capture audio content portion and the second captured audio content portion having been captured from a same type of location.

20. The method of claim 11, wherein the audio content is captured further during a second audio capture duration, the second audio capture duration extending from a second audio capture start point to a second audio capture end point, the captured audio content includes a first captured audio content portion captured during the first audio capture duration and a second captured audio content portion captured during the second audio capture duration, and the first captured audio content portion provides the audio for playback of a first subset of the video frames and the second captured audio content 


Claims 1-20 are anticipated by claims 1-20 of U.S. Patent No. 10,827,157, as shown in the table above.
Allowable Subject Matter
Claims 9 and 19 would be allowable if rewritten to overcome the rejection(s) under nonstatutory obviousness-type double patenting set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUNGHYOUN PARK whose telephone number is (571)270-1333.  The examiner can normally be reached on M - Thur 6:00 am - 4 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI Q TRAN, can be reached on (571)272-7382.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/SUNGHYOUN PARK/Examiner, Art Unit 2484