Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments filed on 10/22/2021 with respect to amended claims 2-15 and 19-21 have been considered but are moot, because the new ground of rejection relies upon newly applied prior art and does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
	

The examiner suggests that applicant request an interview to discuss potentially distinguishing subject matter, in an effort to advance compact prosecution and improve the clarity of the record.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the

Claims 2-3, 6-8, 12-13, 15 and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Mei (US 2011/0264700) in view of Burges et al. (US 2005/0086682) and Gokturk et al. (US 2006/0253491).
Regarding claim 2, Mei is deemed to teach, as claimed, a method comprising:
receiving, via an input component of a computing device, a plurality of time slices of video content 

SEE Fig. 1, steps 102, 104

extracting audio-video descriptors for the time slices of video content, to obtain aural and visual characteristics of the video content corresponding to the time slice

SEE steps 106 to 108

providing the audio-video signature
(Abstract, 0003-0005, 0023 w/text, visual and audio features, which read on signatures, 0029, 0032, 0033)

associated with the one or more time slices (see 0005, 0026, 0027, 0030, sequence or segments) of video content (302, 304 w/key frames, 306), as a query toward a dataset (database 320 or “from outside sources”)

SEE the search of DB 320 or outside sources w/features (or signatures) associated with audio and/or video and/or text signatures

SEE visual and text features (0032) and/or audio features
receiving candidate results of the query before reaching an end of the time slices of video content (see consuming, 0032; “...additional information to users consuming...” or during), based on a time window allowed for the query lapsing (see 0021, since features are
presenting (314) at least some of the candidate results before reaching the end of the time slices of video content (SEE Fig. 3, browser 318, with video 302 and additional information 314 shown at the same time; SEE also the Abstract)

Mei, Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.

SEE details of 314 and presenting during consuming and/or in response thereto (or media), 0029-, 0032, 0033-

[0029] FIG. 3 illustrates an example of a system 300 configured for determining a set of additional information 314 relevant to an online video 302. The system 300 may comprise a parsing

[0032] The information extraction component 312 may be configured to execute a multimodal relevance matching algorithm against the database 320 using the set of features 310 to determine the set of additional information 314. In one example, the multimodal relevance matching algorithm may be configured to perform a text based search algorithm upon the database 320 using textual features to determine a first list of candidate additional information. The multimodal relevance matching algorithm may be configured to perform a visual based search algorithm upon the database 320 using visual features to determine a second list of candidate additional information. The multimodal relevance matching algorithm may perform a linear combination of the first list and second list to generate the set of additional information 314. The information extraction component 312 may be configured to aggregate the set of additional information 314 into a video.

[0033] In one example, the presentation component 316 may be configured to present the set of additional information 314. In another example, the presentation component 316 may be configured to present the video to a user consuming the online video 302 within a web browsing environment 318.
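For illustration only, the linear combination of a text-based candidate list and a visual-based candidate list described in Mei [0032] can be sketched as follows. This is a minimal sketch under stated assumptions, not Mei's implementation: the function name `combine_candidates`, the score dictionaries, the candidate ids and the equal weighting are all hypothetical.

```python
def combine_candidates(text_scores, visual_scores, text_weight=0.5):
    """Linearly combine two {candidate_id: score} lists into one ranking.

    Assumption: both lists use comparable score scales; a candidate missing
    from one list contributes a score of 0.0 for that modality.
    """
    visual_weight = 1.0 - text_weight
    combined = {}
    for cid in set(text_scores) | set(visual_scores):
        combined[cid] = (text_weight * text_scores.get(cid, 0.0)
                         + visual_weight * visual_scores.get(cid, 0.0))
    # Highest combined score first; ties are broken by candidate id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical first (text) and second (visual) candidate lists.
text_list = {"ad_1": 0.8, "ad_2": 0.4}
visual_list = {"ad_2": 1.0, "ad_3": 0.6}
ranking = combine_candidates(text_list, visual_list)
```

With equal weights, a candidate that appears in both lists (here `ad_2`) can outrank a candidate that scores well in only one modality.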



Burges teaches (in Fig. 4) extracting audio-video descriptors within a stream as audio and video signatures (w/audio and video fingerprints), triggering a request based on the media object, and inferring and returning information or candidate results based on the video-audio content (media stream) and on the nature of the information.
SEE Abstract, 0012, 0023, 0040, Fig. 4 and 0080-0084.
Burges is deemed to render obvious audio and video fingerprints (of visual and audio content), but fails to mention the use of visual hash bits.
Gokturk is deemed to teach and render obvious visual hash bits, associated with the audio-video signatures, and a video hash bit associated with the time slice of video content.

SEE Gokturk w/visual Index (w/Signatures, 0096, 0115, 0276, 0301) and visual hash bits (0331, 0335)
Therefore, since the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art, it would have been obvious to infer and return information as candidate results based on the video-audio content (media stream) and on the nature of the information, as taught by Burges, and to utilize a hash representation, as taught by Gokturk, to facilitate search and retrieval using visual hash bits, as is deemed conventional in the art.

Regarding claim 3 (previously presented), the combination as applied with Mei is further deemed to render obvious, as claimed, wherein the time slices of video content are received from a video output device not associated with the computing device (in Fig. 3).
Note the source is “online video 302”; therefore the device that generated the video output, as online video 302, is not associated with the computing device (or a camera).
Mei does not teach a local camera in the local machine (in Fig. 3) that generated the uploaded video; the video is obtained from the internet (downloaded) and is not locally generated video content (such as: no local camera), therefore,

Regarding claim 6 (previously presented), the combination with Mei, as applied above, fails to particularly address or mention wherein a length of individual ones of the plurality of time slices (clips) includes at least about 0.1 second and at most about 10.0 seconds, as a range of time. Note, with video encoded at 30 frames/sec, a clip of 0.1 second is 3 frames and a clip of 10 seconds is 300 frames, as the content sample. The examiner deems it obvious to merely set a range for time slice selection, grabbing a range of frames of interest about a point, since setting the range (to locate desired content within) is deemed within routine experimentation. Therefore, since the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art, it would have been obvious to modify Mei to select a range of time slices, as an alternative to selecting a single key frame, with durations between 0.1 and 10 seconds, to obtain samples of media (being fingerprint representations) used as the basis for a query (or search). Selecting such a range is a design choice, obvious to those skilled in the art, made to locate and consider usable video content within the range.
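The frame counts corresponding to the claimed time slice range can be checked with a short calculation. The sketch below assumes the 30 frames/sec encoding rate used in the note above; the function name `frames_in_slice` is hypothetical, and other frame rates would give different counts.

```python
def frames_in_slice(duration_seconds, frames_per_second=30):
    """Number of whole frames in a time slice of the given duration.

    Assumption: a constant frame rate; the product is rounded to the
    nearest whole frame to absorb floating-point error.
    """
    return round(duration_seconds * frames_per_second)


shortest = frames_in_slice(0.1)   # lower bound of the claimed range
longest = frames_in_slice(10.0)   # upper bound of the claimed range
```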

Regarding claims 7-8, the combination as applied is deemed to render obvious, as claimed, in view of the teachings of Gokturk, wherein the audio-video signature includes an audio fingerprint and a video hash bit associated with the time slice of video content, as well as wherein the dataset includes a layered audio-video indexed dataset.
SEE Gokturk w/visual index (w/signatures, 0096, 0115, 0276, 0301), associated with visual hash bits (0331, 0335), from a video multi-index in layers w/time (0040, 0067, 0092, 0094) and audio (0057), maintaining a time relation. The generated index (or indexes) has advantages: the index data is based on the recognized information, and the index functionality enables search and retrieval. Various recognition techniques, including those that use the face, clothing, apparel, and combinations of characteristics, may be utilized (or index layers), as taught by Gokturk.

SEE Gokturk, index information in layers (from image 0032 and index) and signatures (from clothing, text, audio and video), with indexes based on similarity (0057).

Also see 0059, 0152, 0179, 0180 and Fig. 12 w/details in (0192, 0193, creating types of indexes), 0195, 0196-

[0202] According to an implementation shown by FIG. 12, each of the indexers supply their own respective index. The ID Information Indexer 1240 submits ID index data 1245 to the ID Information Index 1242. The Signature Indexer 1250 supplied Signature Index Data 1255 to the Signature Index 1252. Each of ID index data 1245 and Signature Index 1252 enable specific types of search and retrieval operations. 


Therefore, since the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art, it would have been obvious to modify the combination as applied with Mei in view of the further teachings of Gokturk, including the visual index associated with signatures, with associated visual hash bits, from a video multi-index in layers with time and audio, maintaining a time relation, as taught by Gokturk, having advantages, using the index

Claim 9 (an independent claim, previously presented) is deemed analyzed and discussed with respect to method claim 2, rendered obvious as taught by Mei, which further teaches a system (w/device having hardware to search; see applicant at 0061) configured to perform a method as recited in claim 2.

Claim 10 (being an independent claim, previously presented), is also analyzed and discussed with respect to the claims 2 and 9, teaching a computer-readable medium having computer-executable instructions encoded thereon, the computer-executable instructions configured to, upon execution, program a device to perform a method as recited in claim 2.
Note, the medium is deemed limited to statutory scope based on applicant's disclosure at 0051 of 2020/0142928.
The phrase “…or any other non-transmission medium...” appears there and has been read to limit the medium to the statutory scope defined by applicant's specification.



Regarding claim 12 (previously presented), the combination as applied above is deemed to render obvious, as claimed, and provides for:
receiving a query audio-video signature related to video content at a layered audio-video engine;
searching a layered audio-video index associated with the layered audio-video engine to identify entries in the layered audio-video index having a similarity to the query audio-video signature above a threshold;
performing geometric verification of respective key frames from the query audio-video signature and entries from the layered audio-video index having the similarity; and
sending candidate results identified via the geometric verification until a window of time allowed for a query using the query audio-video signature has elapsed.
Note the combination as applied above fails to address these limitations, but Gokturk is deemed to further teach and render further obvious the differences, associated with the above applied teachings, including considering similarity and threshold (or a Test) being a
SEE abstract, 0010, 0030 and similarity 0063, 0066, 0076, 0080, 0082, 0085 AND 0111- with respect to a threshold considerations (0119, 0125, 0132, 0143, 0147 and 0161- ).
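As a rough illustration of the claim 12 limitations on threshold similarity and a query time window, the sketch below scans an index and collects entries whose similarity to the query exceeds a threshold until the allowed window elapses. It is a sketch under stated assumptions, not the method of the application or of any cited reference: signatures are modeled as equal-length bit tuples, similarity is the fraction of agreeing bits, the geometric verification step is omitted, and all names and values are hypothetical.

```python
import time


def bit_similarity(a, b):
    """Fraction of positions at which two equal-length bit sequences agree."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def search_until_deadline(query, index, threshold, window_seconds):
    """Collect (entry_id, score) pairs above threshold until the window elapses."""
    deadline = time.monotonic() + window_seconds
    results = []
    for entry_id, signature in index.items():
        if time.monotonic() >= deadline:
            break  # the time window allowed for the query has elapsed
        score = bit_similarity(query, signature)
        if score > threshold:
            results.append((entry_id, score))
    return results


# Hypothetical index of bit signatures keyed by entry id.
index = {"clip_a": (1, 0, 1, 1), "clip_b": (0, 1, 0, 0), "clip_c": (1, 0, 1, 0)}
hits = search_until_deadline((1, 0, 1, 1), index, threshold=0.7, window_seconds=5.0)
```

A window of zero seconds returns no candidates, since the deadline has already passed when the first entry is examined.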

Regarding claim 13 (of claim 12, previously presented), the combination as applied is deemed to provide for progressively processing entries (300, 304, 306-, to 314) having respective audio-video signatures.

SEE Mei: the process is deemed a progressive process, as described above (see during consuming, generates additional information 314), with structures supporting the claimed functions in view of Fig. 3, based on progressively processed video content of the online content 302 w/314, in browser 318.

Regarding claim 15 (previously presented), Mei as applied fails to address, as claimed, determining whether the candidate results are stable and determining whether to update results, in the process of updating candidate results.



Claim 19 (previously presented) is deemed analyzed and discussed with respect to the claims 12- above.

Regarding claim 20 (amended), of claim 19, the combination with Mei as applied, further in view of the associated teachings of Gokturk, further renders obvious wherein the at least a part of the visual index for creating the first layer includes random selection of hash bits from another layer.

Gokturk further teaches random hash functions, sampled randomly and greedily picked (0331), and visual signatures (0336); therefore, based on the combination, it would be further obvious to comprise, as claimed, performing at least a part of

Regarding claim 21 (amended), of claim 19, the combination as applied is deemed to render obvious allowing for refining the visual points to be searched in the second layer, based on the audio index.

SEE Abstract: visual, textual, audio and other features are indexed and applied as claimed, allowing for refining the visual points to be searched in the second layer, based on the audio index or features (a layer).
SEE Mei 0029, 0031, 0036

Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Mei (US 2011/0264700), Burges et al. (US 2005/0086682) and Gokturk et al. (US 2006/0253491), as applied, and further in view of Desai et al. (US 2007/0162566).


Desai teaches user generated multimedia and posting (in Fig. 3), including content from a camera (0055), and creation of searchable content (Abstract), the content being, directly or indirectly, associated with a device that can receive the same through a browser.

Therefore, since the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art, it would have been obvious to modify Mei with the teachings of Desai, wherein users at a device can also generate local video content and post it to the internet, and at least perform the indirect processing mode in Mei with content previously uploaded from the same device in view of Desai, as being obvious in view of generating content with the device that can render the content, as taught by Mei.

Regarding claim 5 (of claim 4, previously presented), the combination as applied to claim 4 is further deemed to render obvious, as recited, wherein the time slices of video content are received from a video output device not associated with the computing device.

Note, content uploaded from another device renders obvious content that is not associated with the computing device that accessed the content online.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Mei et al. (US 2011/0264700), Burges et al. (US 2005/0086682) and Gokturk et al. (US 2006/0253491), as applied, and further in view of Bhatia et al. (US 2013/0014136).
Regarding claim 14 (of claim 13), the combination fails to teach, but Bhatia is deemed to teach and render obvious, the difference as claimed, wherein the progressively processing (in claim 13) includes employing two part graph based transformation and matching.


SEE Bhatia: signatures (0087), graphing (see Taxonomy and Graph) and similarity (0210), based on transformation and matching, as part of graphic analysis (0079), including use of a social graph, being a type of taxonomy graph, as taught by Bhatia. SEE also 0175 and 0291.

Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the combination with the teachings of Bhatia to utilize, as part of a progressive search process, respective audio-video signatures employing two part graph (audio and video) based transformation and matching, as taught by Bhatia, in order to track and determine, based on profile, whether they are influencers in certain categories, disseminators of information, or information consumers (0151, 0153).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).




Contact Information
Any inquiry concerning this communication or earlier communications should be directed to the examiner of record, Vincent F. Boccio, whose telephone number is (571) 272-7373.
The examiner can normally be reached Monday through Friday, 8:00 AM to 4:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at (571) 272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.

Status information for unpublished applications is available through Private PAIR only.

For more information about the PAIR system:
"http://portal.uspto.gov/external/portal/pair"



If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/VINCENT F BOCCIO/Primary Examiner, Art Unit 2162