DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Univ Chongqing Posts & Telecom (CN 109 508 375).
With respect to claim 20 (similarly claims 1 and 12), Univ. Chongqing Posts & Telecom (UCPT) teaches a data processing tool (e.g. a model frame, see Fig 1 [0035]) comprising a control unit and a memory, wherein the control unit is configured to process speech data for a speech event (e.g. the model frame of Fig 1 [0035] inherently has a processing unit and a memory to process speech data, i.e. audio, for a speech event, as suggested in Fig 1 [0035]), wherein the speech data comprises a visible component and an audible component (e.g. the 
identify a first visible feature within the visible component that corresponds to a predetermined visible speech feature (e.g. identify/classify a first visible feature i.e. emotional features, spatio-temporal features, see [0011]-[0013] [0018]-[0019], [0036]-[0037]); 
determine a first time corresponding to the occurrence of the first visible feature during the speech event (e.g. the video is divided into T segments [0020] and processed in a time-synchronous manner, which suggests determining a first time corresponding to the occurrence of the first visible feature during the speech event); 
determine a measurement of a characteristic of the audible component, at a second time, during the speech event (e.g. extracting video features, at a second time, during the video event, see [0023], [0038]), which has a predefined temporal relationship to the first time at which the first visible feature occurred (e.g. setting a sliding window to 100 ms [0023] suggests a predefined temporal relationship to the first time at which the first visible feature occurred); and 
use the determined measurement of the characteristic at the second time to output an evaluation of an attribute with which the predetermined visible speech feature is associated (e.g. see the decision fusion in [0024]-[0027], [0039], which uses a separate classifier for each mode; a probability score for each sentiment category is obtained from each classifier, and a weighting rule is used to select the label with the largest score).
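For illustration only, the decision fusion relied upon above (a separate classifier per mode, a probability score per sentiment category, and weighted selection of the largest label) can be sketched as follows. This is a minimal sketch and not the implementation of the reference: the weighted-sum rule, the mode names, the weights, and the probability values are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple


def fuse_decisions(
    scores: Dict[str, List[float]],
    weights: Dict[str, float],
    labels: List[str],
) -> Tuple[str, List[float]]:
    """Weighted-sum decision fusion over per-mode class probabilities.

    scores  -- hypothetical per-mode probability vectors, one entry per label
    weights -- hypothetical per-mode weights (assumed to sum to 1)
    labels  -- sentiment category names, in the same order as the vectors
    """
    fused = [0.0] * len(labels)
    for mode, probs in scores.items():
        if len(probs) != len(labels):
            raise ValueError(f"mode {mode!r} has {len(probs)} scores, expected {len(labels)}")
        w = weights[mode]
        for i, p in enumerate(probs):
            fused[i] += w * p  # accumulate the weighted score for category i
    # Select the category whose fused score is largest.
    best = max(range(len(labels)), key=lambda i: fused[i])
    return labels[best], fused


# Hypothetical inputs: three modes, three sentiment categories.
LABELS = ["negative", "neutral", "positive"]
SCORES = {
    "text": [0.2, 0.3, 0.5],
    "visual": [0.1, 0.2, 0.7],
    "audio": [0.5, 0.3, 0.2],
}
WEIGHTS = {"text": 0.4, "visual": 0.4, "audio": 0.2}

label, fused = fuse_decisions(SCORES, WEIGHTS, LABELS)
print(label, [round(x, 2) for x in fused])
```

With these assumed values, the audio classifier alone would favor "negative", but the weighted combination selects "positive", which shows how the fused output can differ from any single mode's decision.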
With respect to claim 2 (similarly claim 13), UCPT teaches the method of claim 1, wherein the speech event comprises either a live speech or a recorded speech (e.g. the speech event comprises either a live speech or recorded speech, i.e. consumers use webcams to record their opinions on products and upload them on Youtube or Facebook [0005]).
With respect to claim 3 (similarly claim 14), UCPT teaches the method of claim 1, wherein the speech data further comprises characteristic data that comprises information relating to one 
With respect to claim 4 (similarly claim 15), UCPT teaches the method of claim 3, wherein the characteristics of the audible component includes at least one of: volume, pitch, speed, length of pauses between words, tonnetz, ferments, Mel Frequency Cepstral Coefficients, Energy Entropy, Short Time Energy, Zero-Crossing Rate, Spectral Roll-Off, Spectral Centroid, Spectral Flux, Pitch Spectral autocorrelation function (ACF), or Pitch Spectral Harmonic Product Spectrum (HPS) (e.g. low-level descriptors and their statistical functions, see [0038]).
With respect to claim 5 (similarly claim 16), UCPT teaches the method of claim 1, wherein the first visible feature within the visible component comprises at least a portion of a word or phrase spoken during the speech event (see the sentiment classification step [0012]-[0013], which suggests that the first visible feature within the visible component comprises at least a portion of a word or phrase spoken during the speech event).
With respect to claim 6 (similarly claim 17), UCPT teaches the method of claim 1, wherein the first visible feature within the visible component comprises a facial expression captured during the speech event (e.g. the visual emotion classification step [0018]-[0019], [0037] suggests the first visible feature within the visible component comprises a facial expression captured during the speech event, see also [0006]).
With respect to claim 7 (similarly claim 18), UCPT teaches the method of claim 1, wherein identifying the first visible feature that corresponds to the predetermined visible speech feature comprises matching the first visible feature with the predetermined visible speech feature within a predetermined degree of tolerance or error (e.g. the CNN-RNN hybrid model and 3DCLS model of [0036]-[0037] are used to match the extracted features to those of the neural networks, i.e. the identifying comprises matching the first visible feature with the predetermined visible speech feature within a predetermined degree of tolerance or error).
With respect to claim 8 (similarly claim 19), UCPT teaches the method of claim 1, further comprising: identifying a second visible feature within the visible component, wherein the second visible feature corresponds to a second predetermined visible speech feature different than the predetermined visible speech feature to which the first visible feature corresponds (e.g. the text sentiment classification step of [0012]-[0013] and the visual emotion classification step of [0018]-[0019], [0036]-[0037] disclose identifying a second visible feature within the visible component, wherein the second visible feature corresponds to a second predetermined visible speech feature different than the predetermined visible speech feature to which the first visible feature corresponds); and identifying a third time, during the speech event, at which the second visible feature occurs (e.g. [0020] discloses identifying a third time, during the speech event, at which the second visible feature occurs).
With respect to claim 9, UCPT teaches the method of claim 1, wherein the predefined temporal relationship dictates the first time is equal to the second time (e.g. [0020]-[0023] suggest that the video features and text/visual features are extracted in a synchronous manner).
With respect to claim 10, UCPT teaches the method of claim 1, wherein the predefined temporal relationship dictates there is a predetermined time difference between the first time and the second time (e.g. [0020]-[0023] suggest that the video features and text/visual features are extracted in a time sequence wherein v1 is different from v2).
With respect to claim 11, UCPT teaches the method of claim 1, wherein determining the measurement of the characteristic of the audible component comprises: selecting the characteristic of the audible component from a plurality of characteristics (e.g. selecting the characteristic of the audible component from emotional features, spatio-temporal features as suggested in [0011]-[0013], [0036]-[0038]), wherein the predefined temporal relationship between the first time and the second time is dependent on the selected characteristic (e.g. [0020]-[0023] suggest that the predefined temporal relationship between the first time and the second time is dependent on the selected characteristic).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO, whose telephone number is (571) 272-4508. The examiner can normally be reached from 9:00 AM to 5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Poon, can be reached at 571-272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IBRAHIM SIDDO/
Primary Examiner, Art Unit 2675