DETAILED ACTION
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the amendments overcome the prior rejection.
With regard to the amendment “all of the video segment…played”, upon further review, Examiner notes that the limitation could be misleading: the specification does not disclose that “all of the video segment” is played when the received playback time period is less than the original duration. As Applicant’s specification shows, the original video is “compressed” to fit within the received playback time period. Compression, as would be evident to one skilled in the art, involves removal of content from the original video. Applicant’s specification provides an example of a content compression technique using a “factor” for both video and audio, at least in [0038] and Fig. 1B, as follows: “At time 150, during the compression technique, a media compression system compresses the video segment by a corresponding factor such that the compressed video segment 152 fits within the received playback time period. Each type of audio segment is adjusted by a distinct corresponding accelerated playback speed. For example, the dialogue audio segment type 154 is compressed by a factor of 0.9, and the silence 156 and background music 158 audio segment types are compressed by factors of 0.1 and 0.3 respectively. In this embodiment, each of the audio segment types are retained and individually compressed by their corresponding accelerated playback factor.”
Accordingly, Examiner interprets the limitation “such that all of the video segment and fewer than all of the plurality of audio segments are played during the playback time period
Applicant argues that “video segment, in its entirety, is played back at a different speed than a plurality of remaining audio segments”. Examiner respectfully notes that no such limitation, regarding the video segment in its entirety being played back at a different speed, is distinctly claimed. However, Examiner notes that the prior art of record appears to teach that the video segment is played back at a different speed.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7 and 11-17 are rejected under 35 U.S.C. 103 as being unpatentable over Agnihotri et al. (US 2008/0221942) in view of Basso et al. (US 2010/0023964) and Takahashi (US 2018/0268866).
Claim 1
Agnihotri teaches a method for selective audio segment compression for accelerated playback of a media asset by a service provider, wherein the media asset comprises a video segment and a plurality of audio segments (Fig. 1; [0121] In this case the superset comprises all data that is related to this move e.g. sounds, pictures, voices, metadata and so forth.), the method comprising: 
calculating a video playback speed of a video segment of a media asset based on a received playback time period (Examiner notes the video playback speed is interpreted to be associated with the duration of the media asset/video segment [0154] Duration requirements deal with the durations of the trailer and of its subparts (subsets).  Each segment chosen for the preview may have a minimum time required for comprehension depending on its type, 
an original duration of the video segment that is greater than the received playback time period ([0121] In this case the superset comprises all data that is related to this move e.g. sounds, pictures, voices, metadata and so forth. [0122] A sub set of the superset may only comprise selected segments of the pictures and sounds from the film. [0171], M depends on the original program duration and on the actual segmentation.  The desired trailer can be represented as a finite sequence of successive positions that can be taken by any video segment belonging to the original program); 
receiving a plurality of audio segments of the media asset, wherein each audio segment comprises one or more audio portions of similar type from the media asset (See Fig. 1 with multiple segments; [0121] In this case the superset comprises all data that is related to this move e.g. sounds, pictures, voices, metadata and so forth. [0122] A sub set of the superset may only comprise selected segments of the pictures and sounds from the film. [0197] Micro-segmentation: segment exceeding the maximum duration after the shot segmentation are further divided into sub-segments with durations bigger than d.sub.min and with boundaries possibly aligned with content-based clues such as: a change in the audio class, appearance or disappearance of a detected face, a change in camera motion or object motion.); 
receiving a corresponding priority weight for each of the plurality of audio segments ([0175] in which w preferably is a vector of weighting factors and A(s.sub.j) is a column vector of attributes associated to segment s.sub.j in the range [0 .  . . 1].  These attributes may be computed by applying several low- and mid-level content analysis algorithms such as: computation of contrast, audio loudness, detection of action, faces, dialogues, music/speech/noise/silence, and camera motion.  The relative importance of the various attributes can be linearly tuned using the weighting factors w. See also [0156]); 
modifying the plurality of audio segments by removing an audio segment assigned to a lowest priority weight from the plurality of audio segments ([0156] Priority requirements indicate which content should be included in the trailer to convey as much information on the program as possible in the shortest amount of time.  For example including close-ups on main actors as well as action segments and dialogues giving clues on the story line. [0157] Uniqueness requirements aim at maximizing the efficiency of the trailer by minimising redundancy. [0158] Exclusion requirements indicate which content should not be included in the trailer.  For example a trailer should not include how the film or program ends etc [0199] Pre-filtering: commercial detection may be performed over the entire video and the detected commercials may be discarded from the set of segments available for the generation of the trailer. However this commercial detection is different from product placement detection, since product placements preferably should be present in the created trailer.).
in response to determining that the duration of remaining audio segments does not exceed the received playback time period, generating, for playback, the video segment based on the video playback speed and the remaining audio segments ([0172] teaches multiple factors used in creating a trailer, which includes priority requirements for each segment as a function of differently weighted attributes, including audio attributes such as audio loudness, dialogues, music/speech/noise, silence ([0175]). Agnihotri then teaches that the creation of the trailer involves compensating segments, which include video, in creation of a trailer based on audio classifier [0196]-[0198]. Agnihotri then creates a trailer that maximizes the equation provided in [0172]). 
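For illustration only, the claim 1 limitations mapped above can be sketched in Python as follows. The sketch is not taken from the application or from Agnihotri, Basso, or Takahashi. The names AudioSegment, video_playback_speed, and fit_audio are hypothetical, and the sketch assumes that the video playback speed is the ratio of the original duration to the received playback time period and that a higher priority weight means a more important segment.

```python
from dataclasses import dataclass


@dataclass
class AudioSegment:
    kind: str        # hypothetical type label, e.g. "dialogue", "music", "silence"
    duration: float  # seconds
    priority: float  # assumed convention: higher means more important


def video_playback_speed(original_duration: float, playback_period: float) -> float:
    """Assumed rule: speed-up factor that fits the whole video into the period."""
    if playback_period <= 0:
        raise ValueError("playback_period must be positive")
    return original_duration / playback_period


def fit_audio(segments, playback_period):
    """Remove the lowest-priority segment until the remaining audio fits."""
    remaining = list(segments)
    removed = []
    while remaining and sum(s.duration for s in remaining) > playback_period:
        lowest = min(remaining, key=lambda s: s.priority)
        remaining.remove(lowest)
        removed.append(lowest)
    return remaining, removed
```

Under these assumptions, a 120-second video and a 60-second playback time period give a speed of 2.0, and audio segments are dropped in ascending order of priority weight until their total duration no longer exceeds the period.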

Basso teaches the calculation of the video playback speed ([0026] Second, the method selects media content to fill the estimated amount of time between the first event and the second event (204). One possible dynamic aspect of selecting media content is that content is chosen or altered in real time to adjust to the remaining estimated time. Media content playback order may change as events change. If more time is available, additional media content may be added or the currently playing media content may be played at a slower speed to stretch out the media to fill the entire estimated time or media content providing additional details or depth may be located and played. If less time is available, then less necessary portions of media content may be abbreviated or the media content may be played at a quicker speed to finish just before the second event.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate playback speed adjustment as taught by Basso with the automatic trailer generation system of Agnihotri, because doing so would have provided a way to play the selected media content possibly at a reasonably different speed to fit the time interval (abstract of Basso).
Agnihotri in view of Basso may not clearly detail that all of the video segment and fewer than all of the plurality of audio segments are played during the playback time period, and wherein the each of the remaining audio segments is played back at a speed based on the corresponding priority weight for each of the remaining audio segments.
Takahashi teaches all of the video segment and fewer than all of the plurality of audio segments are played during the playback time period, and wherein the each of the remaining audio segments is played back at a speed based on the corresponding priority weight for each of fast reproduction video from a video part of an input moving image; a sound generation unit configured to generate a shortened sound using a part of a sound part of the moving image; and a synthesizing unit configured to synthesize the fast reproduction video generated by the video generation unit and the shortened sound generated by the sound generation unit and generate a fast reproduction moving image.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate playback speed adjustment as taught by Takahashi with the automatic trailer generation system of Agnihotri and Basso, because doing so would have provided a way to enable sound to be naturally reproduced in fast reproduction moving images ([0006] of Takahashi).
Claim 2

Claim 3
Agnihotri in view of Basso and Takahashi further teaches the method of claim 2, wherein modifying the remaining audio segments by removing an audio segment assigned to a lowest remaining priority weight from the remaining audio segments comprises removing a plurality of audio segments assigned to one or more of the lowest remaining priority weights from the remaining audio segments ([0148] of Agnihotri, Trailer generation concerns the problem of selecting the best subset of segments of a given duration of the original program that satisfies a certain list of trailer requirements. See [0172]-[0175] regarding weighting factors used to prioritize segments when producing a trailer. [0210] However, these second level supersets may be weighted according to the importance score and only a few of them having a score above a predefined threshold value may be chosen. The importance score may be the value of the objective function (1). [0025] of Basso, The estimated time changes from 3 minutes to 8 minutes. 
Claim 4
Agnihotri in view of Basso and Takahashi further suggests the method of claim 1, wherein in response to determining that the duration of remaining audio segments does not exceed the received playback time period, generating, for playback, the video segment based on the video playback speed and the remaining audio segments further comprises: calculating a time period by the difference between the received playback time period and the sum of all remaining audio segments; retrieving the previously removed audio segment; trimming the previously removed audio segment to a playback period matching the time period; and adding the trimmed removed audio segment to the remaining audio segments ([0025] of Basso, The estimated time changes from 3 minutes to 8 minutes. The time estimation is updated to reflect the change in the second event from the first restaurant to the second restaurant, and any selected media content may be rearranged, added, deleted, squeezed, or stretched to fill the new time estimate.).  
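For illustration only, the claim 4 steps (computing the leftover time, trimming the previously removed audio segment to that time, and adding it back) can be sketched in Python as follows. The function name fill_gap and the representation of segments as plain durations in seconds are assumptions made for this sketch and do not come from the application or the cited references.

```python
def fill_gap(remaining_durations, removed_duration, playback_period):
    """Trim a previously removed segment to the leftover time and re-add it."""
    # Time period = received playback time period minus the sum of remaining audio.
    gap = playback_period - sum(remaining_durations)
    if gap <= 0:
        # No leftover time, so nothing is added back.
        return list(remaining_durations)
    # The removed segment is cut down so that it does not exceed the gap.
    trimmed = min(removed_duration, gap)
    return list(remaining_durations) + [trimmed]
```

For example, with 50 seconds of remaining audio, a 60-second playback time period, and a removed 30-second segment, the removed segment would be trimmed to 10 seconds before being added back.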
Claim 5
Agnihotri in view of Basso and Takahashi further teaches the method of claim 1, wherein determining a corresponding priority weight for each of the plurality of audio segments comprises: retrieving a predefined priority scheme comprising a plurality of audio portion types and corresponding priority weights; determining, for each of the plurality of audio segments, whether the type of the corresponding one or more audio portions of the audio segment match a predefined audio portion type from the predefined priority scheme; and in response to the determination that the type of the corresponding one or more audio portions of the audio segment music/speech/noise/silence, and camera motion. The relative importance of the various attributes can be linearly tuned using the weighting factors w. [0196] Audio classification: the synchronised audio stream is classified into coherent audio classes such as silence, speech, music, noise etc. [0058] of Agnihotri, In the case when fingerprint metadata is used in the method for finding a data segment the fingerprint metadata may be stored on a separate database. In this way the fingerprint metadata can be accessed online and used for creating subsets of data.).  
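For illustration only, matching audio portion types against a predefined priority scheme, as recited in claim 5, can be sketched in Python as follows. The scheme contents, the name assign_priorities, and the default weight given to unmatched types are all assumptions made for this sketch; none of them is drawn from the application or from Agnihotri.

```python
# Hypothetical predefined priority scheme: audio portion type -> priority weight.
PRIORITY_SCHEME = {"dialogue": 0.9, "music": 0.3, "silence": 0.1}


def assign_priorities(segment_types, scheme=PRIORITY_SCHEME, default=0.0):
    """Return (type, weight) pairs; unmatched types get the assumed default."""
    return [(t, scheme.get(t, default)) for t in segment_types]
```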
Claim 6
Agnihotri in view of Basso and Takahashi further teaches the method of claim 5, wherein the predefined priority scheme is received from a content producer of the media asset ([0105] of Agnihotri, The tagging of the content data is preferably done manually by one or more persons that reviews e.g. a film or game and finds the positions of product placements. At this position he/she creates a tag that may comprise information such as: what product, product qualities, colours, context of product usage, who is the provider, to what kind of customer is the product suitable for, both professional and private, to extraverts but not to introverts, and so forth. [0115] The tagged content data in the chosen segments 3 may be in the beginning of the segment in the middle of the segment or in the end of the segment. Thus, how a segment 3 is extracted 
Claim 7
Agnihotri in view of Basso and Takahashi further teaches the method of claim 1 further comprising: receiving real-time locational information of an electronic device for media asset playback, wherein the real-time locational information indicates movement of the electronic device; determining, based on historical locational information for the electronic device, whether a subset of the real-time locational information matches a subset of historical locational information; responsive to the determination that the subset of the current real-time locational information matches the subset of historical locational information, determining an estimated playback time period based on the subset of historical locational information; and assigning the estimated playback time period to the received playback time period ([0025] of Basso, The estimation of time between the two events is based on many things, such as traffic reports, comparative analysis of repeated trip data (such as a daily commute), historical usage trends (such as if the user listens to the radio for 25 minutes every day from 8:20 a.m. to 8:45 a.m.), GPS or trip-planning software … The second event can be dynamic, fluid, or subject to change at any moment. The second event can be altered slightly or may be changed dramatically. In an example of a user listening to the car radio while on his lunch break, the user may have originally intended to eat at a first restaurant, but a few blocks before the first restaurant decided to eat at a second restaurant instead. The estimated time changes from 3 minutes to 8 minutes. The time estimation is updated to reflect the change in the second event from the first restaurant to the second restaurant, and any selected media content may be rearranged, added, deleted, squeezed, or stretched to fill the new time estimate.
New estimated times are calculated on a 
Claims 11-12, and 14-17
These claims recite substantially the same limitations as those provided in claims 1-2, and 4-7 respectively, and therefore they are rejected for the same reasons.
Claim 13
This claim recites substantially the same limitations as those provided in claim 3, and therefore it is rejected for the same reasons.

Claims 8-10 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Agnihotri et al. (US 2008/0221942) in view of Basso et al. (US 2010/0023964), Takahashi (US 2018/0268866), and Chung (US 2016/0323482).
Claim 8
Agnihotri in view of Basso and Takahashi further teaches the method of claim 1, further comprising: determining a dialogue audio segment from the plurality of audio segments ([0175] of Agnihotri, in which w preferably is a vector of weighting factors and A(s.sub.j) is a column vector of attributes associated to segment s.sub.j in the range [0 . . . 1]. These attributes may be computed by applying several low- and mid-level content analysis algorithms such as: computation of contrast, audio loudness, detection of action, faces, dialogues, music/speech/noise/silence, and camera motion. The relative importance of the various attributes can be linearly tuned using the weighting factors w.).
Agnihotri in view of Basso and Takahashi may not clearly detail at a particular time during playback: determining an offset audio value based on the difference between the 
Chung teaches in [0065], If the delay time is noticeable, the user may be able to fine-tune the playback of the supplementary content manually, or the media guidance application may attempt to correct for the delay automatically. If the delay time between detecting an indicium and the user device responding is known, the media guidance application may account for that by instructing the playback of the audio asset to resume at a slightly later time to compensate. [0128] In step 612, the media guidance application, in response to an indicium that the video asset has ceased, transmits (e.g., via control circuitry 304 (FIG. 3)) an instruction to pause the playback of the audio asset on the first device. For example, if the media guidance application detects (e.g., by monitoring the data stream via control circuitry 304 (FIG. 3) or detection module 316 (FIG. 3)) that the video has been interrupted for commercials, or that the user has paused playback, the media guidance application may transmit (e.g., via control circuitry 304 (FIG. 3)) an instruction via Internet to the first device to pause playback of the alternate language track.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate playback synching as taught by Chung with the automatic trailer generation system of Agnihotri, Basso, and Takahashi, because doing so would have provided a way to preserve the synchronicity of the multiple content sources (abstract of Chung).
Claim 9

modifying the remaining audio segments by removing an audio segment assigned to a lowest remaining priority weight from the remaining audio segments ([0156] of Agnihotri, Priority requirements indicate which content should be included in the trailer to convey as much information on the program as possible in the shortest amount of time.  For example including close-ups on main actors as well as action segments and dialogues giving clues on the story line. [0157] Uniqueness requirements aim at maximizing the efficiency of the trailer by minimising redundancy. [0158] Exclusion requirements indicate which content should not be included in the trailer.  For example a trailer should not include how the film or program ends etc) comprises determining whether the audio segment being removed is one of the one or more high priority audio segments; and in response to the determination that the audio segment being removed is one of the one or more high priority audio segments, stopping generation for playback of the video segment and the remaining audio segments ([0025] of Basso, In an example of a user listening to the car radio while on his lunch break, the user may have originally intended to eat at rearranged, added, deleted, squeezed, or stretched to fill the new time estimate. New estimated times are calculated on a regular, scheduled basis,). Further Examiner notes one of ordinary skill in the art would find it obvious to stop generation for playback when desired media segments are not available. 
Agnihotri in view of Basso may not specifically detail stopping of generation for playback of different segments.
Chung teaches in abstract, “If the broadcast is paused or interrupted, either for an advertisement or in response to a user request to pause the broadcast, the media guidance application will pause the language track without any additional user input.”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate playback synching as taught by Chung with the automatic trailer generation system of Agnihotri and Basso, because doing so would have provided a way to preserve the synchronicity of the multiple content sources (abstract of Chung).
Claim 10
Agnihotri in view of Basso and Takahashi and Chung further teaches the method of claim 9, wherein determinations of one or more audio segments as high priority audio segments are received from a content producer of the media asset ([0052] of Agnihotri,  Preferably the tagged content data is product placement data. The product placement is tagged so that it is possible to classify it, compare it with a user profile and so that it is possible to find it in a superset of data. [0053] Hence by classifying the product placement data it is possible to match the product 
Claims 18-20
These claims recite substantially the same limitations as those provided in claims 8-10 respectively, and therefore they are rejected for the same reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THOMAS H MAUNG whose telephone number is (571)270-5690.  The examiner can normally be reached on Monday-Friday, 9am-6pm, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached at 1-(571) 272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to 






/THOMAS H MAUNG/
Primary Examiner, Art Unit 2654