DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 5 and 18-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Regarding claim 5, the claim recites “querying the spatio-temporal index using an event identifier in the particular information of the indexed event relevant to the pixel location”. However, claim 1 recites “querying the spatio-temporal index using the pixel location to determine particular information of an indexed event or an indexed object”. These limitations are circular and conflict with each other. Please clarify.
Claims 18-20 recite “wherein the spatio-temporal index relates respective second pixel locations of the indexed object or an indexed event a second video feed with respective pixel locations of the indexed object or an indexed event in the video feed and respective spatial locations of the indexed object of the indexed event as determined from a tracking video feed”. This limitation is unclear. Please clarify.    

Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art. 
2. Ascertaining the differences between the prior art and the claims at issue. 
3. Resolving the level of ordinary skill in the pertinent art. 
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

6.	Claims 1-6 and 18-20 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Wakim (US Patent 9,760,778) in view of Begeja et al. (US Publication 2020/0169718, hereinafter Begeja).
Regarding claim 1, Wakim discloses a method for accessing content related to location in a video, the method comprising: 
receiving a reference to a pixel location in a frame of a video feed of a filmed occurrence (Wakim, claim 3, determining that the first video frame comprises a first object associated with a first coordinate within the first video frame, the first coordinate comprising a pixel location within the first video frame overlapping with the first object, col. 3, line 46 through col. 4, line 6, while viewing a particular frame of video content, user sends an indication to identify an object captured/depicted in the video frame); 
accessing a spatio-temporal index corresponding to the filmed occurrence, wherein the spatio-temporal index indexes information relating to events or objects of the filmed occurrence and corresponding pixel locations at which the events or the objects are detected in the video feed (Wakim, col. 14, line 50 through col. 15, line 5, the object may be associated with a unique identification (ID) number or other identifier that may be sent to the device 110 from the object recognition server 140b as part of the object data. The device may store that unique ID and use the unique ID to request further information about the object from the data source “a spatio-temporal index structure for providing additional information of object as recognized as corresponding to a pixel location on the video”. The device may then receive (1020) further information about the selected object and display (1022) that further information, for example as shown in FIG. 10F. As shown in FIG. 10F, after the further information is obtained, the television may show screen 1012, which may include a portion of the original image including the recognized object (shown on the left of screen 1012) as well as information about the object, including the ribbon, manufacturer, price, etc. on another portion of the screen. The information displayed as part of screen 1012 may have come from the recognition server 140b as part of the object data and/or may have come from another information source. For example, the name and image of the produce may have been included in the original object data, but the manufacturer and price may have come from a shopping data source);
querying the spatio-temporal index using the pixel location to determine particular information of an indexed event or an indexed object (Wakim, col. 14, line 50 through col. 15, line 5, the device may use the unique ID to request further information about the object from the data source “index structure”;  the device may then receive (1020) further information about the selected object and display (1022) that further information, for example as shown in FIG. 10F. As shown in FIG. 10F, after the further information is obtained, the television may show screen 1012, which may include a portion of the original image including the recognized object (shown on the left of screen 1012) as well as information about the object, including the ribbon, manufacturer, price, etc. on another portion of the screen. The information displayed as part of screen 1012 may have come from the recognition server 140b as part of the object data and/or may have come from another information source. For example, the name and image of the produce may have been included in the original object data, but the manufacturer and price may have come from a shopping data source).
Wakim does not explicitly disclose but Begeja discloses receiving the particular information wherein the particular information indicates at least one of spatial and temporal alignment parameters for aligning the indexed event with a corresponding event in at least one other video feed of the filmed occurrence (Begeja, para’s 0008 and 0015, multiple source videos may contain images captured at a same location and/or a same object or set of objects at an art exhibit or a sporting event; para’s 0053-0054, discloses a spatial alignment where the processing system may detect key points (e.g., of at least one object, such as the first object) in the multiple source videos and a time alignment of the multiple source videos. For instance, time alignment may be performed when the first source video and the second source video are from overlapping times; the time of a frame of one of the source videos may be determined in accordance with one or both of a start time and an end time of the source video, and the frame rate of the source video. In another example, each frame may be tagged with timing information. In addition, both source videos may similarly have a start time, end time and/or duration, frame rate, and similar information stored as metadata along with the visual information. As such, frames of the source videos having corresponding times may be paired. Alternatively, or in addition, alignment may be achieved with reference to one or more objects in motion. For instance, a ball may be bouncing and then come to rest in both of the source videos. Thus, the processing system may determine that a frame in the first source video and a frame in the second source video where the ball comes to rest are at a same time).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Begeja’s features into Wakim’s invention for enhancing a user’s playback of video content by effectively switching between multiple aligned video sources.
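
For illustration only, the time alignment described in the cited portion of Begeja (pairing frames of two source videos whose times, computed from each video's start time and frame rate, correspond) can be sketched as follows. This is a minimal sketch and is not taken from Begeja or Wakim; the names SourceVideo, frame_time, and pair_frames, the example start times and frame rates, and the half-frame tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SourceVideo:
    """Hypothetical metadata for one source video (names are assumptions)."""
    start_time: float   # seconds on a clock shared by both videos
    frame_rate: float   # frames per second
    frame_count: int


def frame_time(video: SourceVideo, index: int) -> float:
    """Time of a frame, derived from the start time and the frame rate."""
    return video.start_time + index / video.frame_rate


def pair_frames(first: SourceVideo, second: SourceVideo) -> List[Tuple[int, int]]:
    """Pair each frame of `first` with the nearest-in-time frame of `second`.

    A pair is kept only when the two times differ by at most half a frame
    period of `second` (an assumed tolerance), so frames outside the
    overlapping interval are left unpaired.
    """
    tolerance = 0.5 / second.frame_rate
    pairs = []
    for i in range(first.frame_count):
        t = frame_time(first, i)
        # Index of the frame of `second` closest to time t.
        j = round((t - second.start_time) * second.frame_rate)
        if 0 <= j < second.frame_count and abs(frame_time(second, j) - t) <= tolerance:
            pairs.append((i, j))
    return pairs


# Assumed example: a 30 fps video and a 60 fps video that starts one second later.
video_a = SourceVideo(start_time=0.0, frame_rate=30.0, frame_count=90)
video_b = SourceVideo(start_time=1.0, frame_rate=60.0, frame_count=120)
pairs = pair_frames(video_a, video_b)
```

In this sketch, only frames falling within the overlapping time interval of the two videos are paired, which corresponds to the cited statement that time alignment may be performed when the source videos are from overlapping times.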

Regarding claim 2, Wakim-Begeja discloses the method of claim 1, further comprising displaying the aligned event and corresponding event based on the particular information on a user interface (Wakim, fig 1, displaying video content; Begeja, para. 0039, displaying video content).
The motivation to combine the references and obviousness arguments are the same as claim 1.

Regarding claim 3, Wakim-Begeja discloses the method of claim 1, wherein receiving the particular information further comprises receiving statistics for a participant in the event, wherein event in the filmed occurrence is of a sporting event associated with the pixel location (Begeja, para. 0016, providing player’s statistic and/or score in a sporting event is well known in the art).
The motivation to combine the references and obviousness arguments are the same as claim 1.

Regarding claim 4, Wakim-Begeja discloses the method of claim 1, wherein receiving the particular information further comprises receiving statistics relating to a matchup of a sporting event associated with the pixel location (Begeja, para. 0016, providing statistic and/or score related to a matchup in a sporting event is well known in the art).
The motivation to combine the references and obviousness arguments are the same as claim 1.

Regarding claim 5, Wakim-Begeja discloses the method of claim 1, further comprising querying the spatio-temporal index using an event identifier in the particular information of the indexed event relevant to the pixel location (Wakim, col. 14, line 50 through col. 15, line 5, the object may be associated with a unique identification (ID) number or other identifier that may be sent to the device 110 from the object recognition server 140b as part of the object data. The device may store that unique ID and use the unique ID to request further information about the object from the data source).
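
For illustration only, the lookups described in the cited portion of Wakim (a pixel location within a frame resolving to an object identifier, and the identifier then being used to request further information) can be sketched as follows. This is a hedged sketch and is not Wakim's implementation: the IndexEntry record, the lookup_by_pixel and lookup_by_id functions, the bounding-box representation, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record: an object's bounding box in one frame."""
    frame: int
    left: int
    top: int
    right: int
    bottom: int
    object_id: str


# Assumed sample data; identifiers and details are invented for illustration.
ENTRIES: List[IndexEntry] = [
    IndexEntry(frame=10, left=100, top=50, right=200, bottom=150, object_id="obj-1"),
    IndexEntry(frame=10, left=300, top=80, right=360, bottom=140, object_id="obj-2"),
]
DETAILS: Dict[str, Dict[str, str]] = {
    "obj-1": {"name": "example product", "price": "unknown"},
    "obj-2": {"name": "example person"},
}


def lookup_by_pixel(frame: int, x: int, y: int) -> Optional[str]:
    """First query: return the ID of an object whose box contains the pixel."""
    for entry in ENTRIES:
        if (entry.frame == frame
                and entry.left <= x <= entry.right
                and entry.top <= y <= entry.bottom):
            return entry.object_id
    return None


def lookup_by_id(object_id: str) -> Dict[str, str]:
    """Second query: use the returned ID to request further information."""
    return DETAILS.get(object_id, {})


# A selected pixel in frame 10 resolves to an ID, which is then looked up.
selected = lookup_by_pixel(frame=10, x=120, y=60)
info = lookup_by_id(selected) if selected is not None else {}
```

The sketch keeps the two queries separate so that the identifier returned by the pixel-location query is the only input to the second query.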

Regarding claim 6, Wakim-Begeja discloses the method of claim 1, wherein the video feed of the filmed occurrence is a tracking video feed (Wakim, col. 19, lines 20-25, the supplemental content engine 1382 may cross reference an identified object with one or more information sources, including a supplemental content database 1388,  which may include database entries tracking certain potential recognized objects or object classes (e.g., person, place, song, product, etc.) and their corresponding supplemental content or potential supplemental content types).

Regarding claim 18, Wakim discloses a method for accessing content related to location in a video, the method comprising:
receiving a reference to a pixel location in a frame of a video feed of a filmed occurrence (Wakim, claim 3, determining that the first video frame comprises a first object associated with a first coordinate within the first video frame, the first coordinate comprising a pixel location within the first video frame overlapping with the first object, col. 3, line 46 through col. 4, line 6, while viewing a particular frame of video content, user sends an indication to identify an object captured/depicted in the video frame); 
accessing a spatio-temporal index corresponding to the filmed occurrence, querying the spatio-temporal index using the pixel location to determine particular information stored in the spatio-temporal index for an indexed object or an indexed event that is associated with the pixel location; and receiving the particular information from the spatio-temporal index (Wakim, col. 14, line 50 through col. 15, line 5, the object may be associated with a unique identification (ID) number or other identifier that may be sent to the device 110 from the object recognition server 140b as part of the object data. The device may store that unique ID and use the unique ID to request further information about the object from the data source “a spatio-temporal index structure for providing additional information of object as recognized as corresponding to a pixel location on the video”. The device may then receive (1020) further information about the selected object and display (1022) that further information, for example as shown in FIG. 10F. As shown in FIG. 10F, after the further information is obtained, the television may show screen 1012, which may include a portion of the original image including the recognized object (shown on the left of screen 1012) as well as information about the object, including the ribbon, manufacturer, price, etc. on another portion of the screen. The information displayed as part of screen 1012 may have come from the recognition server 140b as part of the object data and/or may have come from another information source. For example, the name and image of the produce may have been included in the original object data, but the manufacturer and price may have come from a shopping data source).
Wakim discloses wherein the spatio-temporal index relates respective first pixel locations of the indexed object or an indexed event to a first video feed as described above, and further discloses wherein a video feed of the filmed occurrence can be configured as a tracking video feed (Wakim, col. 19, lines 20-25, the supplemental content engine 1382 may cross reference an identified object with one or more information sources, including a supplemental content database 1388,  which may include database entries tracking certain potential recognized objects or object classes (e.g., person, place, song, product, etc.) and their corresponding supplemental content or potential supplemental content types).
Wakim does not explicitly disclose wherein the spatio-temporal index relates respective second pixel locations of the indexed object or an indexed event a second video feed with respective pixel locations of the indexed object or an indexed event in the video feed and respective spatial locations of the indexed object of the indexed event as determined from a tracking video feed.
Begeja discloses wherein the spatio-temporal index relates respective second pixel locations of the indexed object or an indexed event a second video feed with respective pixel locations of the indexed object or an indexed event in the video feed and respective spatial locations of the indexed object of the indexed event as determined from a tracking video feed (Begeja, para’s 0008 and 0015, multiple source videos may contain images captured at a same location and/or a same object or set of objects at an art exhibit or a sporting event; as disclosed above, Wakim’s system can receive a reference to a second pixel location in a frame of a second video feed, see Wakim, claim 3, determining that a second video frame of a second video feed comprises the first object associated with a second coordinate within the second video frame, the second coordinate comprising a second pixel location within the second video frame overlapping with the first object; Begeja further discloses, see para’s 0053-0054, a spatial alignment where the processing system may detect key points (e.g., of at least one object, such as the first object) in the multiple source videos and a time alignment of the multiple source videos. For instance, time alignment may be performed when the first source video and the second source video are from overlapping times; the time of a frame of one of the source videos may be determined in accordance with one or both of a start time and an end time of the source video, and the frame rate of the source video. In another example, each frame may be tagged with timing information. In addition, both source videos may similarly have a start time, end time and/or duration, frame rate, and similar information stored as metadata along with the visual information. As such, frames of the source videos having corresponding times may be paired. Alternatively, or in addition, alignment may be achieved with reference to one or more objects in motion. For instance, a ball may be bouncing and then come to rest in both of the source videos. Thus, the processing system may determine that a frame in the first source video and a frame in the second source video where the ball comes to rest are at a same time).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Begeja’s features with Wakim’s invention for enhancing a user’s playback of video content by aligning multiple video sources and effectively switching between them.

Regarding claims 19-20, these claims comprise limitations substantially the same as claims 3-4; therefore, they are rejected for the same rationale.

7.	Claims 7-9, 13-15, and 17 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Wakim (US Patent 9,760,778) in view of Shan et al. (US Publication 2010/0312608, hereinafter Shan).
Regarding claim 7, Wakim discloses a method for accessing content related to location in a video, the method comprising:
receiving a reference to a pixel location in a frame of a video feed of a filmed occurrence (Wakim, claim 3, determining that the first video frame comprises a first object associated with a first coordinate within the first video frame, the first coordinate comprising a pixel location within the first video frame overlapping with the first object, col. 3, line 46 through col. 4, line 6, while viewing a particular frame of video content, user sends an indication to identify an object captured/depicted in the video frame); 
accessing a spatio-temporal index corresponding to the filmed occurrence, wherein the spatio-temporal index indexes information relating to events or objects of the filmed occurrence and corresponding pixel locations at which the events or the objects are detected in the video feed (Wakim, col. 14, line 50 through col. 15, line 5, the object may be associated with a unique identification (ID) number or other identifier that may be sent to the device 110 from the object recognition server 140b as part of the object data. The device may store that unique ID and use the unique ID to request further information about the object from the data source “a spatio-temporal index structure for providing additional information of object as recognized as corresponding to a pixel location on the video”. The device may then receive (1020) further information about the selected object and display (1022) that further information, for example as shown in FIG. 10F. As shown in FIG. 10F, after the further information is obtained, the television may show screen 1012, which may include a portion of the original image including the recognized object (shown on the left of screen 1012) as well as information about the object, including the ribbon, manufacturer, price, etc. on another portion of the screen. The information displayed as part of screen 1012 may have come from the recognition server 140b as part of the object data and/or may have come from another information source. For example, the name and image of the produce may have been included in the original object data, but the manufacturer and price may have come from a shopping data source); 
querying the spatio-temporal index using the pixel location to determine particular information of an indexed event or an indexed object (Wakim, col. 14, line 50 through col. 15, line 5, the device may use the unique ID to request further information about the object from the data source “index structure”;  the device may then receive (1020) further information about the selected object and display (1022) that further information, for example as shown in FIG. 10F. As shown in FIG. 10F, after the further information is obtained, the television may show screen 1012, which may include a portion of the original image including the recognized object (shown on the left of screen 1012) as well as information about the object, including the ribbon, manufacturer, price, etc. on another portion of the screen. The information displayed as part of screen 1012 may have come from the recognition server 140b as part of the object data and/or may have come from another information source. For example, the name and image of the produce may have been included in the original object data, but the manufacturer and price may have come from a shopping data source).
Wakim does not explicitly disclose but Shan discloses receiving the particular information, wherein the indexed object is an advertisement overlay object, and the particular information relates to placement of an advertisement as an overlay with respect to the video feed (Shan, para. 0053, the ad overlay component determines the advertisement, template, and placement region for the advertisement in the online video).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Shan’s features into Wakim’s invention for enhancing user’s playback of video content by allowing user to effectively control advertising placement in video source.

Regarding claim 8, Wakim-Shan discloses the method of claim 7, further comprising displaying the advertisement as an overlay of the video feed based on the particular information on a user interface (Shan, fig. 7, para’s 0069 and 0076, once the placement region is determined, a renderer is selected for displaying the advertisement, online video, or both).
The motivation to combine the references and obviousness arguments are the same as claim 7.

Regarding claim 9, Wakim-Shan discloses the method of claim 7, wherein the particular information is indexed in the spatio-temporal index for a plurality of pixel locations and a plurality of respective video frames (Wakim, col. 2, lines 34-40, a computing device may identify one or more objects represented in the image data. The system may then obtain supplemental content, or suggest potential functions for user execution, based on the object(s) represented in the information; col. 5, lines 46-49, In response to a second object being recognized, as illustrated in the example situation 240 of FIG. 2B, the graphical elements may create a bounding box 242 or other such indication about the second recognized object; claim 3, determining that the first video frame comprises a first object associated with a first coordinate within the first video frame, the first coordinate comprising a pixel location within the first video frame overlapping with the first object; similarly, determining that the second video frame comprises a second object associated with a second coordinate within the second video frame, the second coordinate comprising a second pixel location within the second video frame overlapping with the second object).

Regarding claim 13, Wakim-Shan discloses the method of claim 7, wherein the particular information is textual (Wakim, col. 1, lines 62-66, text in image).

Regarding claim 14, Wakim-Shan discloses the method of claim 7, wherein the particular information is graphical (Wakim, col. 4, lines 33-53, graphical element and animated material can be displayed over image data).

Regarding claim 15, Wakim-Shan discloses the method of claim 7, wherein the particular information indicates if the pixel location corresponds to a location where at least one of information, augmentations, graphics, animations, or advertising may be displayed over a content frame (Wakim, col. 4, lines 33-53, graphical element and animated material can be displayed over image data).

Regarding claim 17, Wakim-Shan discloses the method of claim 7, wherein the indexed object or the indexed event relates to at least a group of pixels in each frame (Wakim, claim 3, determining that the first video frame comprises a first object associated with a first coordinate within the first video frame, the first coordinate comprising a pixel location within the first video frame overlapping with the first object; a pixel location can be seen as the location of a group of pixels; col. 4, lines 17-59, bounding box of pixels).

8.	Claims 10-12 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Wakim-Shan, as applied to claim 7 above, in view of Begeja et al. (US Publication 2020/0169718, hereinafter Begeja).
Regarding claims 10-12, Wakim-Shan discloses the method of claim 7.
Wakim-Shan does not explicitly disclose but Begeja discloses wherein the particular information further indicates a playing surface depicted in the video feed (Begeja, para. 0015, field of play); wherein the video feed corresponds to a filmed occurrence and wherein the filmed occurrence is a sporting event taking place on the playing surface (Begeja, para. 0015, field of play); and wherein the particular information further indicates one or more participants in the indexed event that is relative to the pixel location (Begeja, para. 0016, player in a sporting event).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Begeja’s features into Wakim-Shan’s invention for enhancing a user’s playback by incorporating sporting events into video capture.

9.	Claim 16 is rejected under AIA 35 U.S.C. 103 as being unpatentable over Wakim-Shan, as applied to claim 7 above, in view of Begeja et al. (US Publication 2020/0169718, hereinafter Begeja) and a well-known technique in the art.
Regarding claim 16, Wakim-Shan discloses the method of claim 7.
Wakim-Shan discloses identifying an object in the video that is associated with a pixel location as described above but does not explicitly disclose providing a smart pipe related to the pixel location, wherein the smart pipe comprises multiple aligned content channels associated with the pixel.
Begeja discloses providing a smart pipe related to the pixel location, wherein the smart pipe comprises multiple aligned content channels associated with the pixel (Begeja, para’s 0008 and 0015, multiple source videos may contain images captured at a same location and/or a same object or set of objects at an art exhibit or a sporting event; para’s 0053-0054, discloses a spatial alignment where the processing system may detect key points (e.g., of at least one object, such as the first object) in the multiple source videos and a time alignment of the multiple source videos. For instance, time alignment may be performed when the first source video and the second source video are from overlapping times; the time of a frame of one of the source videos may be determined in accordance with one or both of a start time and an end time of the source video, and the frame rate of the source video. In another example, each frame may be tagged with timing information. In addition, both source videos may similarly have a start time, end time and/or duration, frame rate, and similar information stored as metadata along with the visual information. As such, frames of the source videos having corresponding times may be paired. Alternatively, or in addition, alignment may be achieved with reference to one or more objects in motion. For instance, a ball may be bouncing and then come to rest in both of the source videos. Thus, the processing system may determine that a frame in the first source video and a frame in the second source video where the ball comes to rest are at a same time. By providing multiple aligned video sources, user can elect to switch to any specific video source during playback. A smart pipe selector for determining which of the plurality of pipes to select from the plurality of pipes to enable user access to the digital content is well known in the art, see Greene, US Publication 2008/0077524, claim 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Begeja’s features and a well-known technique in the art into Wakim-Shan’s invention for enhancing a user’s playback of video content by effectively switching between multiple aligned video sources.

Consideration of Reference/Prior Art
10.    For applicant’s benefit portions of the cited reference(s) have been cited to aid in the review of the rejection(s). While every attempt has been made to be thorough and consistent within the rejection it is noted that the PRIOR ART MUST BE CONSIDERED IN ITS ENTIRETY, INCLUDING DISCLOSURES THAT TEACH AWAY FROM THE CLAIMS. See MPEP 2141.02 VI.

Conclusion
11.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOI H TRAN whose telephone number is (571)270-5645. The examiner can normally be reached 8:00AM-5:00PM PST FIRST FRIDAY OF BIWEEK OFF.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI TRAN can be reached on 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LOI H TRAN/Primary Examiner, Art Unit 2484