Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This is in response to Applicant’s Remarks and Amendments filed on 11/22/2021 regarding application 16/398,828, filed on 04/30/2019.
Claims 22, 31, and 40 are hereby amended. Claims 22, 25-31, and 34-40 are currently pending for consideration. 

Continued Examination Under 37 CFR 1.114
Receipt is acknowledged of a request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e) and a submission, filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office Action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/22/2021 has been entered.

Response to Amendment and Remark
The applicant's amendments and remarks filed on 11/22/2021 have been fully and carefully considered, with Examiner's response set forth below.
Applicant’s arguments (see Arg. Page 1) with respect to the rejection of claims 22, 31, and 40 under “Rejected under 35 U.S.C. §103… with the amendments in claims 22, 31, and 40” have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, because the amendments change the scope of the claims, a new ground of rejection is made over Hidaka in view of Divakaran and Olstad.

Examiner Note
The Examiner cites particular columns, line numbers and/or paragraph numbers in the references as applied to the claims below for the convenience of the Applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 22, 25-28, 30, 31, 34-37, and 39-40 are rejected under 35 U.S.C. 103 as being unpatentable over Hidaka et al. (US 20030055634 A1, “Hidaka”) in view of Divakaran et al. (US 20040085339 A1, “Divakaran”) and further in view of Olstad et al. (US 20130132374 A1, “Olstad”).
As to the claim 22, Hidaka discloses A method comprising: 
identifying, by a computing device, a multimedia content item comprising digital content and an audio track; (Hidaka discloses [Abstract, 0220] A scheme to judge (i.e. identifying) emphasized speech portions (i.e. multimedia content item) by a statistical processing in terms of a set of speech parameters… the speech block (i.e. multimedia content item) is extracted based on the number of the speech sub-block decided as being emphasized in the speech sub-block extracting part, and by designating the starting time and finishing time of the extracted speech block comprising audio (i.e. audio track) and video data (i.e. digital content) of each content is read out and sent out as summarized speech or summarized video data).
analyzing, via the computing device, the multimedia content item, and based on said analysis, identifying a compression budget, said compression budget comprising information indicating a runtime value of the multimedia content item and a maximum percentage of a total of the runtime value of the multimedia content item; (Hidaka discloses [Abstract, 0184, 0216] A scheme to judge (i.e. analyzing) emphasized speech portions (i.e. multimedia content item) by a statistical processing (i.e. analyzing/identifying) in terms of a set of speech parameters including… an input condition for summarization (i.e. a compression budget), where the user inputs his desired one of a plurality of preset values of the time length (i.e. a runtime value) of the ultimate summary of the media content and the summarization rate (i.e. maximum percentage), such as reducing the content to 10% in terms of length or time).
The examiner notes that the specification describes the “compression budget” in more than one passage. Paragraph [0037] of the specification recites “The compression budget 110 can be received in terms of different criteria. In an embodiment, the compression budget 110 can be expressed as a percentage of the total running time of the received content item 108. The compression budget 110 can also be expressed in terms of the actual running time of the summary 150. Thus, if the compression budget 110 is received as a percentage, the input component 102 can be configured to calculate the total running time for the summary 150”, and paragraph [0063] of the specification discloses “The compression budget may be expressed either in terms of the fraction of the total running time of the actual content item 107 or in terms of the absolute running time (in terms of seconds, minutes and the like) for the summary”. Thus, the desired “compression budget” can be received as either “a percentage (fraction) of the total running time” or “the actual (absolute) running time of the summary”.
Hidaka discloses “input condition (i.e. compression budget) for summarization… the user may input his desired one of a plurality of preset values of the time length (i.e. a runtime value) of the ultimate summary of the media content and the summarization rate (i.e. maximum percentage)”. The input condition, as a desired preset value, is the “compression budget” comprising the time length (i.e. a runtime value) of the ultimate summary of the media content and the summarization rate (i.e. a maximum percentage), such as reducing the content to 10% in terms of length or time. (See [0184, 0216].)
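For illustration only, the two forms of the “compression budget” discussed in this note (a percentage of the total running time, or an absolute running time of the summary) can be related by simple arithmetic. The following is a minimal sketch under stated assumptions: the function name summary_runtime_seconds and its parameters are hypothetical and do not appear in the specification or in Hidaka, and the sketch is not presented as either document's implementation.

```python
def summary_runtime_seconds(total_runtime_s, budget, as_percentage):
    """Return the target summary runtime in seconds.

    Hypothetical helper: `budget` is read either as a percentage of the
    total running time (as_percentage=True) or as an absolute running
    time in seconds (as_percentage=False).
    """
    if total_runtime_s <= 0:
        raise ValueError("total runtime must be positive")
    if as_percentage:
        # Percentage form: e.g. 10 means the summary is 10% of the content.
        if not 0 < budget <= 100:
            raise ValueError("percentage budget must be in (0, 100]")
        return total_runtime_s * budget / 100.0
    # Absolute form: the budget is already the summary running time.
    if not 0 < budget <= total_runtime_s:
        raise ValueError("absolute budget must not exceed the total runtime")
    return float(budget)


# A one-hour content item reduced to 10% of its length.
ten_percent = summary_runtime_seconds(3600, 10, as_percentage=True)
# The same content item with an absolute five-minute budget.
five_minutes = summary_runtime_seconds(3600, 300, as_percentage=False)
```

Under these assumptions, a 10% budget on a 3600-second content item and an absolute budget of 360 seconds describe the same target summary length.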
partitioning, via the computing device, upon said determination, said multimedia content item based on the runtime value and the maximum percentage of the total of the runtime value indicated in said compression budget; (Hidaka discloses [0181, 0184] the speech summarizing method partitions the speech blocks (i.e. content item), decides which are to be summarized together with their sub-blocks, and permits automatic speech summarization at a desired preset value as the input condition (i.e. compression budget)… determine the speech blocks targeted for summarization by use of the desired preset value for the input condition (i.e. compression budget) in step S13 and calculate the gross time (i.e. runtime value) of the speech blocks targeted for summarization, that is, the time length (i.e. runtime value) of the speech blocks to be summarized… inputting a condition (i.e. compression budget) for summarization as a desired preset value of the time length (i.e. a runtime value) of the ultimate summary and the summarization rate (i.e. maximum percentage) of the gross time).
generating, via the computing device, a summary content item based on said partitioning, said summary content item comprising said identified segments of the digital content and audio information. (Hidaka discloses [0063, 0448] the summarizing method analyzes an input speech signal to calculate its speech parameters… determine (i.e. partition) speech blocks each composed of a plurality of speech sub-blocks and speech sub-blocks of the input speech signal… determine a frame forming each speech sub-block… summarize speech blocks, providing (i.e. generating) summarized speech… storing real-time image (i.e. identified digital content segment) and speech signals (i.e. identified audio information segment) in correspondence with a playback time, inputting a summarization start time, and inputting the time of summary that is the overall time of summarized portions (i.e. a summary content item), or summarization rate that is the ratio between the overall time of the summarized and the entire summarization target portion).
However, Hidaka may not explicitly disclose all aspects of determining, via the computing device, that the maximum percentage within the compression budget is at least a threshold fraction less than the runtime value;
Divakaran discloses determining, via the computing device, that the maximum percentage within the compression budget is at least a threshold fraction less than the runtime value; (Divakaran: [0047, 0074] Dynamic time warping (DTW) is used to "stretch" and "compress" time within certain limits (i.e. compression budget as a threshold), to allow for a good alignment between similar segments of the video having different lengths of time to find the time warping of the segments that give a best match with an optimum path… a summarization cost function can trade-off an absolute difference between a required summary length (i.e. runtime value), a total length of a set R of selected segment to be included (i.e. less than) in the summary, (i.e., R ⊂ S), a closest segment in a set R… indicates how well the set R (summary 171) represents the set S).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that Hidaka and Divakaran both disclose summarizing multimedia with a compression process and are therefore analogous art from the “same field of endeavor”. When Divakaran’s use of dynamic time warping (DTW) to "compress" time within certain limits, together with its summarization cost function that trades off a required summary length and a total length of a set R of selected segments to be included, is combined with Hidaka’s summarization of both audio and video content with speech, the claimed limitation of determining, via the computing device, that the maximum percentage within the compression budget is at least a threshold fraction less than the runtime value would have been obvious. The motivation to combine Hidaka and Divakaran is to provide a method for summarizing video content with a dynamic programming process on the symmetric cross-distance matrix that reduces the processing steps and increases efficiency. (See Divakaran [0054].)
However, Hidaka in view of Divakaran may not explicitly disclose all aspects of identifying, via the computing device, based on said partitioning, segments of digital content and audio information, the segments corresponding to a portion of the multimedia content item; and
Olstad discloses identifying, via the computing device, based on said partitioning, segments of digital content and audio information, the segments corresponding to a portion of the multimedia content item; and (Olstad discloses [0058] FIG. 16 shows that the key video frames (i.e. segments) and associated time sequences (i.e. partitioning) are selected (i.e. identifying) and pieced (i.e. segments) together to form a video summary (i.e. digital content) together with the audio track (i.e. audio information)).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that Hidaka in view of Divakaran, and Olstad, both disclose summarizing multimedia with a compression process and are therefore analogous art from the “same field of endeavor”. When Olstad’s identification of the partitioned digital content with the audio track is combined with the summarization of both audio and video content with speech in Hidaka in view of Divakaran, the claimed limitation of identifying, via the computing device, based on said partitioning, segments of digital content and audio information, the segments corresponding to a portion of the multimedia content item would have been obvious. The motivation to combine Hidaka in view of Divakaran and Olstad is to provide a method for displaying a set of intelligent video summaries in a page with thumbnails to allow quick browsing. (See Olstad [0002].)
As to the claim 25, Hidaka in view of Divakaran and Olstad discloses The method of claim 22, wherein said compression budget indicates a value for a runtime of the summary content item to be generated, said runtime of the summary content item to be generated being a target length of the summary content item. (Hidaka discloses [0186] determine the speech blocks targeted for summarization by use of the condition set and calculate the gross time of the speech blocks targeted for summarization, that is, the time length (i.e. target length) of the speech blocks to be summarized).
As to the claim 26, Hidaka in view of Divakaran and Olstad discloses The method of claim 25, further comprising:
identifying a subset of said segments of digital content and audio information, said subset identification based on the runtime of the summary content item indicated by the compression budget. (Hidaka discloses [0190, 0184] decide a speech block to be separated from the sequence of speech sub-blocks (i.e. subset segments) divided… The speech sub-block is determined when the aforementioned duration as Sj (i.e. subset identification) in FIG. 3… with the preset playback values of the time length of the ultimate summary, summarization rate, or compression rate).
As to the claim 27, Hidaka in view of Divakaran and Olstad discloses The method of claim 22, wherein said analysis further comprises: 
analyzing said multimedia content item by applying a linear regression algorithm, and determining, based on said application, whether said multimedia content item can be summarized, wherein said generation is based on said determination. (Hidaka discloses [0070, 0222] analyzing the speech signal is already coded for each frame by a coding scheme based on CELP (Code-Excited Linear Prediction) model (i.e. linear regression algorithm)… decide the speech blocks including the speech sub-blocks decided as being emphasized. The gross time of the thus determined speech blocks is calculated, and the summarized portion deciding part decides (i.e. determination) whether the result of calculation meets the condition for summarization (i.e. can be summarized)).
The examiner notes that paragraph [00038] of the specification recites “It is based on a simple linear regression model that takes the length of the content item, source and various coarse-grained features for video and audio quality into account” without detailing the linear regression model. The examiner submits that the CELP (Code-Excited Linear Prediction) model is a linear regression algorithm.
As to the claim 28, Hidaka in view of Divakaran and Olstad discloses The method of claim 27, wherein said determination is further based on a compression rate of said multimedia content item. (Hidaka discloses [0181] a speech processing method permits (i.e. determines) the automatic speech summarization at a desired rate (i.e. compression rate)).
As to the claim 30, Hidaka in view of Divakaran and Olstad discloses The method of claim 22, wherein said each segment of the digital content includes information indicating time intervals within the multimedia content item, wherein the time intervals are a sequence of non-contiguous, non-overlapping sub-intervals of equal lengths. (Hidaka discloses [0065, 0257, 0350] the number of frames preceding and succeeding the current frame, which can be an integral number of frames (i.e. a sequence), may be a fixed time interval… receive a content via the user terminal, if a summary of the content desired to receive is available (i.e. non-contiguous) in the case of a content that continues as long as several hours, and a summary compressed into a desired time length… For each speech sub-block (i.e. sub-interval) of a y-sec duration (i.e. equal lengths), y/t representative still pictures (where y/t represents the normalization of y by a fixed time length t) are extracted (i.e. non-overlapping) in synchronization with speech signals of high emphasized state probability).
Regarding claims 31, 34-37, and 39, these claims recite a storage medium corresponding to the method of claims 22, 25-28, and 30, respectively; therefore, the same rationale of rejection is applicable.
Regarding claim 40, this claim recites a device corresponding to the method of claim 22; therefore, the same rationale of rejection is applicable.

Claims 29 and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Hidaka in view of Divakaran and Olstad and further in view of Reed et al. (US 20050276570 A1, “Reed”).
As to the claim 29, Hidaka in view of Divakaran and Olstad discloses The method of claim 22, wherein said analysis further comprises: 
However, Hidaka in view of Divakaran and Olstad may not explicitly disclose all aspects of identifying a length of the digital content, a source of the digital content and information indicating coarse-grained features of the digital content and the audio track.
Reed discloses identifying a length of the digital content, a source of the digital content and information indicating coarse-grained features of the digital content and the audio track. (Reed discloses [0423, 0382, 0482] identifying the Audiobook length approximately 3 hours… Digital Content that is dynamically based on punctuated or ongoing network interaction with data sources, other users or customers, or telemetry from the local or remote devices… using a higher-density paper-based solution such as Xerox's Glyph solution (i.e. coarse-grained); provided higher compression together with a minimally distracting appearance; placed on images, in the background of text on the Audiobook which includes digital content and audio content).
The examiner notes that paragraph [00038] of the specification recites “It is based on a simple linear regression model that takes the length of the content item, source and various coarse-grained features for video and audio quality into account” without any detailed description of the “various coarse-grained features”. The examiner interprets Reed’s Xerox's Glyph solution, a higher-density paper-based solution that provides higher compression and is placed on images or in the background of text, as the coarse-grained features for the quality of the digital content and audio content of the Audiobook.
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that Hidaka in view of Divakaran and Olstad, and Reed, both disclose summarizing multimedia with a compression process and are therefore analogous art from the “same field of endeavor”. When Reed’s Xerox's Glyph solution for the Audiobook digital content with an audio track is combined with the summarization of both audio and video content with speech in Hidaka in view of Divakaran and Olstad, the claimed limitation of identifying a length of the digital content, a source of the digital content and information indicating coarse-grained features of the digital content and the audio track would have been obvious. The motivation to combine Hidaka in view of Divakaran and Olstad with Reed is to provide a method for producing and playing the audio track with the audiobook on the player and data-streaming platforms effectively. (See Reed [0007].)
Regarding claim 38, this claim recites a storage medium corresponding to the method of claim 29; therefore, the same rationale of rejection is applicable.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JENQ-KANG (Kang) CHU whose telephone number is (571)270-7396. The examiner can normally be reached M-F 8-6 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Padmanabhan, can be reached at (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JENQ-KANG CHU/Examiner, Art Unit 2176
/ARIEL MERCADO/Primary Examiner, Art Unit 2176