DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification, as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Because these claim limitation(s) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification and drawings as performing the claimed function, and equivalents thereof.
Claim Rejections - 35 USC § 103

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-17 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200364464 A1 (Vijayanarasimhan), in view of US 20080175556 A1 (Dorai).
Regarding Claims 1 and 16-17, Vijayanarasimhan teaches:
An information processing apparatus comprising: a feature extraction unit that analyzes a video to extract a feature element; a discriminating unit that, based on a difference in the feature element for each of a plurality of portions of the video, performs discrimination that discriminates between an explaining scene and an explained scene, the explaining scene being a scene providing explanation, the explained scene being a captured scene of what is explained in the explaining scene; and a categorizing unit that categorizes each portion of the video based on a result of the discrimination (Vijayanarasimhan: Figs. 1-2, a system, and Figs. 6-7, a method, to segment video based on action; a machine learning algorithm uses audio (locations of audio correspond to an action and a type of action), RGB (extract features), and motion (types of movement) classifiers to determine the scene contexts and motion patterns).
Vijayanarasimhan does not explicitly teach discriminating between an explaining scene and an explained scene. However, Dorai teaches this feature (Dorai: Figs. 2-3, a system and method for video segmentation based on audiovisual and text information; Figs. 4-7 and [0035]-[0049], video is segmented into micro-segments based on color histograms in Fig. 4; audio and speech associated with the video are classified into distinct sound labels such as speech, silence, music, and environmental sound (step 504); the visual content of each micro-segment is further classified through visual labels (step 512) as narrator (narrator scenes corresponding to explaining scenes, and the scenes between narrators corresponding to explained scenes), informative text, and linkage scene, and consequently, micro-segments with semantic audiovisual labels are generated; Fig. 6, speech recognition is used to recognize the speech content of the video to create time-stamped speech content; Fig. 7, based on all of the semantic labels and micro-segments and further analysis, boundaries of macro-segments are identified and macro-segments are created; Fig. 9, a macro-segment with a scene of “Anthrax” and a macro-segment with a scene of “Bubonic Plague” are created).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Vijayanarasimhan to discriminate between an explaining scene and an explained scene, as taught by Dorai. The advantage of doing so is to provide automatic video segmentation that leverages audio, visual, and text cues to improve system performance (Dorai: [0001]-[0008]).
Regarding Claim 2, Vijayanarasimhan as modified teaches all elements of Claim 1. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 1, wherein the feature extraction unit extracts, as the feature element, a feature related to a behavior of a RGB classifier for feature identification).
Regarding Claim 3, Vijayanarasimhan as modified teaches all elements of Claims 1/2. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 2, wherein the feature related to a behavior of a person extracted by the feature extraction unit comprises a movement pattern of a specific body part of the person, and wherein the discriminating unit discriminates between the explaining scene and the explained scene by using the movement pattern as a discriminating condition (Vijayanarasimhan: [0022], action scene may include people jumping, eating, etc.).
Regarding Claim 4, Vijayanarasimhan as modified teaches all elements of Claims 1/2. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 2, wherein the feature related to a behavior of a person extracted by the feature extraction unit comprises a speech pattern uttered by the person, and wherein the discriminating unit discriminates between the explaining scene and the explained scene by using the speech pattern as a discriminating condition (Dorai: Fig. 6).
Regarding Claim 5, Vijayanarasimhan as modified teaches all elements of Claim 1. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 1, wherein the feature extraction unit extracts, as the feature element, a feature related to image structure, the feature being obtained by analyzing the video (Dorai: Fig. 9).
Regarding Claim 6, Vijayanarasimhan as modified teaches all elements of Claims 1/5. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 5, wherein the video comprises a video of an operation and a video of explanation of the operation, wherein the feature extraction unit extracts, as the feature related to image structure, a feature that enables differentiation between an image whose subject is an operator and an image whose subject is an operated area where the operation is being performed, and wherein the discriminating unit discriminates between the explaining scene and the explained scene by using, as a discriminating condition, a determination of whether a portion of the video is the image whose subject is the operator or the image whose subject is the operated area (Dorai: Fig. 9).
Regarding Claim 7, Vijayanarasimhan as modified teaches all elements of Claim 1. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 1, wherein the feature extraction unit extracts, as the feature element, a feature related to changes in a specific displayed object displayed on an operating screen, the feature being obtained by analyzing the video that records a manipulation being performed on the operating screen, and wherein the discriminating unit discriminates between the explaining scene and the explained scene by using, as a discriminating condition, a pattern of the changes in the displayed object (Dorai: Figs. 4-7, color histogram based micro-segments, and audio/speech and text assisted macro-segments).
Regarding Claim 8, Vijayanarasimhan as modified teaches all elements of Claim 1. Vijayanarasimhan as modified further teaches:
display 241).
Regarding Claim 9, Vijayanarasimhan as modified teaches all elements of Claims 1/8. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 8, wherein the output screen generated by the screen generating unit associates each portion of the video with the text obtained from audio corresponding to the portion of the video, such that specifying a part of the text causes playback of a portion of the video corresponding to the specified part of the text (Vijayanarasimhan: Figs. 3-5).
Regarding Claim 10, Vijayanarasimhan as modified teaches all elements of Claims 1/8-9. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 9, wherein the screen generating unit deletes, from among the plurality of portions of the video, a video of the explaining scene (Vijayanarasimhan: Figs. 3-5, various scenes are displayed, where deleting content from a display screen is a known technique in the field; Dorai: Fig. 9, a sequence of scenes is listed).
Regarding Claim 11, Vijayanarasimhan as modified teaches all elements of Claims 1/8-10. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 10, wherein the screen generating unit associates a text corresponding to the deleted video of the explaining scene, with a video of the explained scene immediately following the explaining scene (Dorai: Fig. 9, macro-segments are between narrators (explaining scenes)).
Regarding Claim 12, Vijayanarasimhan as modified teaches all elements of Claims 1/8. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 8, wherein the text obtained from audio corresponding to each portion of the video comprises a first text and a second text, the first text corresponding to a video of the explaining scene, the second text corresponding to a video of the explained scene, and wherein the output screen generated by the screen generating unit displays the text in a manner that enables discrimination between the first text and the second text (Dorai: Fig. 9).
Regarding Claim 13, Vijayanarasimhan as modified teaches all elements of Claims 1/8-9. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 9, wherein the text obtained from audio corresponding to each portion of the video comprises a first text and a second text, the first text corresponding to a video of the explaining scene, the second text corresponding to a video of the explained scene, and wherein the output screen generated by the screen generating unit displays the text in a manner that enables discrimination between the first text and the second text (Dorai: Fig. 9).
Regarding Claim 14, Vijayanarasimhan as modified teaches all elements of Claims 1/8. Vijayanarasimhan as modified further teaches:
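The information processing apparatus according to Claim 8, wherein the output screen generated by the screen generating unit displays the text corresponding to each portion of the video in a manner that enables discrimination of each portion of the video categorized by the categorizing unit.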
Regarding Claim 15, Vijayanarasimhan as modified teaches all elements of Claims 1/8-9. Vijayanarasimhan as modified further teaches:
The information processing apparatus according to Claim 9, wherein the output screen generated by the screen generating unit displays the text corresponding to each portion of the video in a manner that enables discrimination of each portion of the video categorized by the categorizing unit (Dorai: Fig. 9).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHITONG CHEN, whose telephone number is (571) 270-1936. The examiner can normally be reached M-F, 9:30am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Yuwen Pan, can be reached at 571-272-7855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/ZHITONG CHEN/
Primary Examiner, Art Unit 2649