Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 05/27/2022 have been fully considered but are not persuasive. Regarding claim 26, Applicant argues (pg. 9 of the Remarks) that Kritt does not cure Tesch’s deficiencies of teaching “outputting an emotion descriptor icon selected from an emotion descriptor icon set comprising a plurality of emotion descriptor icons, the outputted emotion descriptor icon being based on the selected emotion state.” Examiner respectfully disagrees. Kritt teaches (¶0038) emotion, mood, or theme denoting icons 416 may be rendered in the second window 406. An emotion denoting icon 416 may be associated with and representative of an emotion tag. An emotion or mood denoting icon 416 may be an "emoticon." Icon 416 associated with amusement or a funny mood may be relatively large if the mood or emotion would be expected to be perceived intensely, but the same icon may be relatively small if the mood or emotion would be expected to be perceived mildly (i.e., selected from a set); (¶0047) emotions determined may include amusement, joy, anger, disgust, embarrassment, fear, sadness, surprise, and a neutral state. Therefore, Kritt teaches “outputting an emotion descriptor icon selected from an emotion descriptor icon set comprising a plurality of emotion descriptor icons, the outputted emotion descriptor icon being based on the selected emotion state.”

Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 26-34, 36-42, and 44-45 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kritt et al. (US 20140181668, hereinafter Kritt).
Regarding Claim 26, “A method of generating an emotion descriptor icon and adding the emotion descriptor icon to multimedia content, the method comprising:” Kritt teaches (¶0038, ¶0035, and Fig. 4) emotion, mood, or theme denoting icons 416 may be rendered (i.e., generated) alongside video 404, the emotion denoting icon 416 may be associated with and representative of an emotion tag; (¶0034) emotion tag(s) are created.
As to “receiving input multimedia content comprising at least video information” Kritt teaches (¶0019-¶0020) computer implementation; (¶0016 and ¶0002) content is a multimedia video, television show, movie, or internet video stored using a container; (¶0031, ¶0002, and Fig. 2) audio-visual file container 202 contains video file(s) 204, audio file(s) 206, subtitle file(s) 208, and metadata 210; (¶0022 and ¶0027) a memory may store an audio visual file container 202 or it may be on different computer systems and may be accessed remotely, e.g., via a network.
As to “performing analysis on the input multimedia content to produce information representing the video information with respect to a plurality of characteristics” Kritt teaches (Fig. 5-6, ¶0039, ¶0056) in a process for generating visual tags, audio tags, and key word tags, a video file 204 is parsed; (¶0044, ¶0047) employing an image recognition process to identify facial features/facial objects in video and generate an emotion tag associated with facial emotion; (¶0054) an attribute tag may correspond to cinematic techniques for evoking a particular mood (e.g., fear, sadness, suspense).
As to “determining, based on a comparison of the information representing the video information at a temporal position in the video information and a set of information items respectively representing an emotion state, a relative likelihood of association between the input multimedia content and at least some of a plurality of emotion states; selecting an emotion state based on the outcome of the relative likelihood of association between the input multimedia content and at least some of the plurality of emotion states” Kritt teaches (¶0034 and ¶0047) an emotion tag file 314 may be created from the attribute tag file 308 and the consistency-corrected visual tag 302, audio tag 304, key word tag 306, and metadata 210 files. The emotion tag file 314 includes tags that are associated with emotion objects. An emotion object may be associated with an emotion, mood, or theme that a typical viewer might be expected to perceive or that the creators of a video intended the audience to perceive. In addition, an emotion object may be generated directly from the visual tag 302, such as where the tag identifies a human object displaying a particular emotion. Each emotion object may be of a predefined type and associated with a time stamp. An emotion object may be generated in operation 312 by identifying patterns of visual, audio, key word, and attribute tags that correspond or correlate with an emotion object. An emotion object may include parameters corresponding with intensity of the perceived emotion or a confidence level that the perceived emotion accurately represents a ground truth emotion; (¶0047) emotions determined may include amusement, joy, anger, disgust, embarrassment, fear, sadness, surprise, and a neutral state.
As to “outputting an emotion descriptor icon selected from an emotion descriptor icon set comprising a plurality of emotion descriptor icons, the outputted emotion descriptor icon being based on the selected emotion state” Kritt teaches (¶0038) emotion, mood, or theme denoting icons 416 may be rendered in the second window 406. An emotion denoting icon 416 may be associated with and representative of an emotion tag. An emotion or mood denoting icon 416 may be an "emoticon." Icon 416 associated with amusement or a funny mood may be relatively large if the mood or emotion would be expected to be perceived intensely, but the same icon may be relatively small if the mood or emotion would be expected to be perceived mildly (i.e., selected from a set); (¶0047) emotions determined may include amusement, joy, anger, disgust, embarrassment, fear, sadness, surprise, and a neutral state.
As to “and outputting timing information associating the output emotion descriptor icon with a temporal position in the video information.” Kritt teaches (¶0038, ¶0034, and Fig. 4) An emotion denoting icon 416 may be rendered (i.e., output) at horizontal locations corresponding with the temporal location of the particular emotion tag in the video.

Regarding Claim 27, “The method according to Claim 26, wherein the video information comprises one or more of a scene, body language of one or more people in the scene and facial expressions of the one or more people in the scene.” Kritt teaches (¶0050) tags are associated with a scene; (¶0044, ¶0047) employing image recognition process to identify facial features/facial object in video and generate emotion tag associated with facial emotion; (¶0036) objects are actors in a video.

Regarding Claim 28, “The method according to Claim 26, wherein the input multimedia content further comprises audio information comprises one or more of music, speech and sound effects.” Kritt teaches (¶0016 and ¶0065-¶0066) multimedia video includes spoken words, music, and other sounds, which may be referred to herein as audio objects; (Fig. 2 and ¶0031) audio-visual file container 202 includes an audio file.

Regarding Claim 29, “The method according to Claim 26, wherein the input multimedia content further comprises textual information comprises one or more of a subtitle, a description of the input multimedia content and a closed caption.” Kritt teaches (¶0065) an audio transcript may be provided with the video in the form of a closed caption file included in the AV file container; (Fig. 2 and ¶0031) audio-visual file container 202 includes a subtitle file.

Regarding Claim 30, “The method according to Claim 26, wherein the steps of performing the analysis, determining the relative likelihood of association, selecting the emotion state and outputting the emotion descriptor icon are performed each time there is a change in the video information, or audio information of the input multimedia content or textual information of the input multimedia content.” Kritt teaches (¶0055) visual tags may be associated or set for each scene; (Fig. 4) analysis performed throughout the length of the video.

Regarding Claim 31, “The method according to Claim 26, wherein the steps of performing the analysis, determining the relative likelihood of association, selecting the emotion state and outputting the emotion descriptor icon are performed on the input multimedia content once for each of one or more windows of time in which the input multimedia content is received.” Kritt teaches (¶0038 and Fig. 4) emotion, mood, or theme denoting icons 416 may be rendered in the second window 406. An emotion denoting icon 416 may be associated with and representative of an emotion tag. An emotion denoting icon 416 may be rendered (i.e., output) at horizontal locations (i.e., one or more windows of time) corresponding with the temporal location of the particular emotion tag in the video.

Regarding Claim 32, “The method according to Claim 26, wherein the relative likelihood of association between the input multimedia content and the at least some of the emotion states is determined in accordance with a determined genre of the input multimedia content.” Kritt teaches (¶0034) an emotion object may be generated in operation 312 using contextual data provided in the metadata file 210, such as metadata designating that the video is of a particular genre, e.g., comedy, horror, drama, or action; (¶0054, ¶0062, ¶0064) an attribute tag designating an action theme may be associated with a scene with a relatively large number of shots of short duration.

Regarding Claim 33, “The method according to Claim 26, wherein the relative likelihood of association between the input multimedia content and the at least some of the emotion states is determined in accordance with a determination of the identity or location of a user who is viewing the output content.” Kritt teaches (¶0066) A key word may be a word that is predefined to be objectionable or liked by a viewer; (¶0067) a viewing pattern of a viewer may be gathered during the viewing of various videos. Using the viewing pattern, a viewing profile for a viewer may be generated. The viewing profile may identify categories of objects the viewer prefers; (¶0034 and ¶0056) emotion tag is created from key word tags.

Regarding Claim 34, “The method according to Claim 26, wherein the plurality of emotion states are stored in a dynamic emotion state codebook, the method comprising filtering the dynamic emotion state codebook in accordance with a determined genre of the input multimedia content or in accordance with a determination of the identity of a user who is viewing the output content, wherein the selected emotion state is selected from the filtered dynamic emotion state codebook.” Kritt teaches (¶0021-¶0022 and ¶0030) memory 104 (i.e., a codebook/data collection/data structure) is a hierarchy of caches and other memory devices. For example, memory may exist in multiple levels of caches, and these caches may be further divided by function, so that one cache holds instructions while another holds non-instruction data, which is used by the processor or processors; (¶0047) a category of emotion may be determined by comparing motion vectors with a template. The motion vectors may be based on deformation of facial features as reflected in an optical flow that occurs in a sequence of frames. Optical flow may be determined using differential, matching, energy-, or phase-based techniques. In various embodiments, emotions that may be determined may include amusement, joy, anger, disgust, embarrassment, fear, sadness, surprise, and a neutral state.

Regarding Claim 36, “A data processing apparatus that generates and adds emotion descriptor icons to multimedia content, the data processing apparatus comprising circuitry configured to” Kritt teaches (¶0075-¶0076) computer implementation (i.e., processor, computer readable instructions/data, memory); (¶0023, ¶0025, ¶0070) circuitry; (¶0038, ¶0035, and Fig. 4) emotion, mood, or theme denoting icons 416 may be rendered (i.e., generated) alongside video 404, the emotion denoting icon 416 may be associated with and representative of an emotion tag; (¶0034) emotion tag(s) are created.
The remainder of claim 36 recites features similar to those of claim 26; therefore, its rejection is similar to that of claim 26.

Regarding Claim 37, its rejection is similar to Claim 27.

Regarding Claim 38, its rejection is similar to Claim 28.

Regarding Claim 39, its rejection is similar to Claim 29.

Regarding Claim 40, its rejection is similar to Claim 30.

Regarding Claim 41, its rejection is similar to Claim 31.

Regarding Claim 42, its rejection is similar to Claim 32.

Regarding Claim 44, “A non-transitory storage medium comprising executable code components which, when executed on a computer, cause the computer to perform the method according to Claim 26.” Kritt further teaches (¶0075-¶0076) computer implementation (i.e., processor, computer readable instructions/data, memory); (¶0070-¶0073) computer readable medium.

Regarding Claim 45, “Circuitry for a data processing apparatus that generates and adds emotion descriptor icons to multimedia content, the circuitry comprising: receiver circuitry”, “analyzing circuitry”, and “output circuitry.” Kritt teaches (¶0075-¶0076) computer implementation (i.e., processor, computer readable instructions/data, memory); (¶0023, ¶0025, ¶0070) circuitry; (¶0038, ¶0035, and Fig. 4) emotion, mood, or theme denoting icons 416 may be rendered (i.e., generated) alongside video 404, the emotion denoting icon 416 may be associated with and representative of an emotion tag; (¶0034) emotion tag(s) are created.
The remainder of claim 45 recites features similar to those of claim 26; therefore, its rejection is similar to that of claim 26.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 35 is rejected under 35 U.S.C. 103 as being unpatentable over Kritt in view of el Kaliouby et al. (US 20170098122, hereinafter el Kaliouby).
Regarding Claim 35, “The method according to Claim 26, wherein the information representing the video information is a vector signal which aggregates the video information with audio information of the input multimedia content and textual information of the input multimedia content” Kritt further teaches (¶0032) the process 300 may receive as input a visual tag file 302, an audio tag file 304, a key word tag file 306, an attribute tag file 308, and a metadata file 210; (¶0045, ¶0047, ¶0063) vector based analysis; (¶0033) comparing a tag with one or more other tags associated with the same shot or scene for consistency and, where a probability or confidence parameter associated with the particular tag is above a threshold, determining that the object was correctly identified and that the shot or scene includes multiple objects; (¶0034) identifying patterns of visual, audio, key word, and attribute tags that correspond or correlate with an emotion object.
Kritt alone does not teach “in accordance with individual weighting values applied to each of the one or more of the video information, the audio information and the textual information.” However, el Kaliouby teaches (¶0048-¶0049) computing weighted sums 164. The weighted sums can be used to give certain action units more importance in identifying a particular expression. For example, for detecting a smile, action unit AU12 (Lip Corner Puller) and an absence of AU16 (Lower Lip Depress) may be important in detecting a smile. AU25 (lip part) may also be present in many smiles, but it may still be possible to smile without the presence of that action unit; (¶0075) vectors and features can include known classifications and can be used to train the support vector machine (SVM) to categorize new data into a known classification or classifications. The classification can include classifying a face to determine facial content. The SVM 1030 can analyze the HoG 1010 and can generate confidence values for a plurality of action units (AU). As discussed elsewhere, the AUs can include AUs from the facial actions classification system (FACS). The confidence values can include weights. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the computer implemented method that compares video, audio, and textual information/tags for consistency as taught by Kritt to use weighted values as taught by el Kaliouby for the benefit of giving certain data/evidence more importance and more accurately categorizing emotions.

Claim 43 is rejected under 35 U.S.C. 103 as being unpatentable over Kritt in view of Mallik et al. (US 20150221112, hereinafter Mallik).
Regarding Claim 43, Kritt teaches (¶0029) computer system is a user device. Kritt does not teach “A television receiver comprising a data processing apparatus according to Claim 38.” However, Mallik teaches (¶0024, ¶0028, and Fig. 4) embodiments enable visual representations associated with one or more emotions to be associated with content, such as videos or photos. A visual representation serves as a reference point to a particular content segment and conveys an emotion associated with the content segment; (¶0030) implemented using a computing device configured as a set-top box communicatively coupled to a television. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the computer implemented method for rendering emotion or mood icons alongside media content as taught by Kritt to be implemented in a set-top box as taught by Mallik for the benefit of providing the service (e.g., visual summaries) to a wider audience of users.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Lehtiniemie et al. (US 20140366049) – Relevant to claim 26: (¶0046) FIG. 6 illustrates the emotional response timeline 250 of FIG. 5 shown on a plot to illustrate the emotional response collected, such as through sensors 82 of the emotion interface 80 of FIG. 2. An event such as a movie may include an emotional response timeline which may factor in one or more user's emotions collected while the user was watching the movie.
Bishop (US 20130095460) – Relevant to claim 26: (¶0043) capturing video of another person 110 interacting with user 100; (¶0046-¶0048) processing the video to detect the other user's emotion/expression; (¶0056) The system can then display an emoticon (or any other symbol/text) so that the user can identify the "type" of the other person; (¶0018) type is personality type.
Srivastava et al. (US 20190090020) – Relevant to claim 26: (¶0114 and Fig. 3) a timeline of a media item associated with expected-emotions-tagging-metadata.
Gilson (US 20200051582) – Relevant to claim 28: (¶0061 and ¶0026) The computing device may generate sentiment information 503f indicating a sentiment in a portion of the audio segment. The computing device may determine a beginning time code 502c and an ending time code 502z for sentiment 503f. The time codes may be determined based on audio analysis (e.g., volume and/or frequency characteristics of the audio) and/or based on the time-coded textual information, which indicates when speech occurs; (¶0053) the computing device may insert characters in the caption to indicate a corresponding sentiment (e.g., exclamation points, emojis, words describing the sentiment, and the like).
                                                                                                                                                                                      
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FRANK J JOHNSON whose telephone number is (571)272-9629. The examiner can normally be reached 10:00AM-4:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian T. Pendleton, can be reached on 571-272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Frank Johnson/Examiner, Art Unit 2425