DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The numerous documents crossed out in the IDS dated 10/21/2021 have been considered in the IDS dated 09/21/2020.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 4-8, 11-16, and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Carter et al. (US 2019/0222776) (hereinafter referred to as Carter).

Regarding claim 1, Carter teaches a system for augmenting casted content with augmented reality content (Systems and methods are provided for identifying one or more portions of images or video frames that are appropriate for augmented overlay of advertisement or other visual content, and augmenting the image or video data to include such additional visual content. See abstract), the system comprising:
a computer system that comprises one or more processors programmed with computer program instructions that (See figure 3, element 302) (As depicted in FIG. 3, the computing environment 300 may include a computing system 302. The general architecture of the computing system 302 may include an arrangement of computer hardware and software components used to implement aspects of the present disclosure. The computing system 302 may include many more (or fewer) elements than those shown in FIG. 3. It is not necessary, however, that all of these generally conventional elements be shown in order to provide an enabling disclosure. See paragraph [0055]), when executed, cause the computer system to:
provide, to a neural network, characteristic information related to a content portion of casted content being casted at a current time (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043]) (One or more augmentation-safe or augmentation-appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022]) (For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (Characteristic is the certain type of a region) (Dynamic value of moment is the region);
obtain, from the neural network, a dynamic value of moment related to the content portion of the casted content (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043]) (One or more augmentation-safe or augmentation-appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022]) (For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (Characteristic is the certain type of a region) (Dynamic value of moment is the region),
the dynamic value of moment being an indication of current user interest in the content portion (As one example according to one embodiment, advertisement content may be displayed over regions or portions of individual video frames that are determined or predicted to be considered negative space (such as from the perspective of a rights holder in the underlying content). The negative space may be, in one example, portions of the audience or crowd in a basketball arena that are viewable in the background of a video shot (such as television broadcast footage of the game that focuses on the on-court action). The crowd portions may be deemed negative space, in some instances, at least in part because such portions are not likely to be the primary focus of a human viewer (e.g., they are not part of the in-game action on the basketball court). Furthermore, overlaying supplemental content within such portions is not likely to interfere with viewing of other in-venue signage or other content that a rights holder would like to remain visible in the shot. See paragraph [0023]),
the dynamic value of moment being updated responsive to the characteristic information being provided to the neural network for the current time (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043]) (One or more augmentation-safe or augmentation-appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022]) (For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (Characteristic is the certain type of a region) (Dynamic value of moment is the region);
generate, based on the dynamic value of moment, an augmentation package for the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]);
select, based on a first entity being associated with the augmentation package, augmented reality content associated with the first entity to be presented with the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]);
and augment the content portion with the augmented reality content associated with the first entity such that the augmented reality content is integrated as a background object of the content portion behind one or more foreground objects of the content portion during a presentation of the augmented content portion (At block 214 (which may be implemented before, after or in parallel with block 212), the computing system may pose the augmentation content within three-dimensional space, such as within the virtual scene or environment discussed above. The rotation, position, sizing and/or other data associated with the placement and pose of the augmentation content may change from frame to frame based on the analysis of the target area's position and pose and well as the estimated camera location that captured the video (as discussed above and will be further discussed below). See paragraph [0038]) (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 2, Carter teaches the system of claim 1, wherein the characteristic information comprises descriptive information (Characteristic is the certain type of a region, e.g., clouds, crowd, etc.), popularity information, rating information, or timing information.

Regarding claim 4, Carter teaches a method being implemented by a computer system (Systems and methods are provided for identifying one or more portions of images or video frames that are appropriate for augmented overlay of advertisement or other visual content, and augmenting the image or video data to include such additional visual content. See abstract) that comprises one or more processors executing computer program instructions that (See figure 3, element 302) (As depicted in FIG. 3, the computing environment 300 may include a computing system 302. The general architecture of the computing system 302 may include an arrangement of computer hardware and software components used to implement aspects of the present disclosure. The computing system 302 may include many more (or fewer) elements than those shown in FIG. 3. It is not necessary, however, that all of these generally conventional elements be shown in order to provide an enabling disclosure. See paragraph [0055]), when executed, perform the method, the method comprising:
obtaining, based on characteristic information related to a content portion of casted content, a dynamic value of moment related to the content portion of the casted content (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043]) (One or more augmentation-safe or augmentation-appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022]) (For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (Characteristic is the certain type of a region) (Dynamic value of moment is the region);
generating, based on the dynamic value of moment, an augmentation package for the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]);
selecting, based on a first entity being associated with the augmentation package, supplemental content associated with the first entity to be presented with the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]);
and causing a modified content portion to be presented such that the supplemental content associated with the first entity is presented with the content portion (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 5, Carter teaches the method of claim 4, wherein causing the modified content portion to be presented comprises augmenting the content portion to include the supplemental content associated with the first entity (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 6, Carter teaches the method of claim 4, further comprising: providing, to a first neural network, first characteristic information related to the content portion being casted at a first time (first type of negative space at first time) (Characteristic is the certain type of a region) (Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]); and providing, to the first neural network, second characteristic information related to the content portion being casted at a second time (second type of negative space at second time) (Characteristic is the certain type of a region) (Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]).

Regarding claim 7, Carter teaches the method of claim 6, wherein the dynamic value of moment is updated responsive to the first characteristic information being provided to the first neural network (first type of negative space at first time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) , and wherein the dynamic value of moment is subsequently updated responsive to the second characteristic information being provided to the first neural network (second type of negative space at second time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. 
Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. 
Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." 
If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]).

Regarding claim 8, Carter teaches the method of claim 4, further comprising periodically providing the characteristic information related to the content portion of the casted content (Each frame) (second type of negative space at second time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]), and 
wherein obtaining the dynamic value of moment comprises periodically obtaining the dynamic value of moment related to the content portion of the casted content (Each frame) (second type of negative space at second time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]).


Regarding claim 11, Carter teaches the method of claim 4, further comprising: receiving, from the first entity, an indication of interest in future content portions of the casted content having a particular value of moment (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]); identifying a future content portion of the casted content having the particular value of moment (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. 
Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (Characteristic is the certain type of a region) (Dynamic value of moment is the region); 
and causing the supplemental content associated with the first entity to be presented with the future content portion (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 12, Carter teaches one or more non-transitory computer-readable media comprising instructions that, when executed by one or more processors, cause operations (Also disclosed is a non-transitory computer readable medium storing computer executable instructions that, when executed by one or more computer systems, configure the one or more computer systems to perform operations. See paragraph [0074]) comprising: 
obtaining, based on characteristic information related to a content portion of casted content, a dynamic value of moment related to the content portion of the casted content (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. 
See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]) (Characteristic is the certain type of a region)(Dynamic value of moment is the region); 
generating, based on the dynamic value of moment, an augmentation package for the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]);
selecting, based on a first entity being associated with the augmentation package, supplemental content associated with the first entity to be presented with the content portion (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. 
See paragraph [0047]); and causing a modified content portion to be presented such that the supplemental content associated with the first entity is presented with the content portion (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 13, Carter teaches the media of claim 12, wherein causing the modified content portion to be presented comprises augmenting the content portion to include the supplemental content associated with the first entity (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).

Regarding claim 14, Carter teaches the media of claim 12, further comprising: providing, to a first neural network, first characteristic information related to the content portion being casted at a first time (first type of negative space at first time) (Characteristic is the certain type of a region) (Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. 
See paragraph [0047]); and providing, to the first neural network, second characteristic information related to the content portion being casted at a second time (second type of negative space at second time) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]).

Regarding claim 15, Carter teaches the media of claim 14, wherein the dynamic value of moment is updated responsive to the first characteristic information being provided to the first neural network (first type of negative space at first time) (Characteristic is the certain type of a region) (Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]), and wherein the dynamic value of moment is subsequently updated responsive to the second characteristic information being provided to the first neural network (second type of negative space at second time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. 
Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. 
Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." 
If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]).

Regarding claim 16, Carter teaches the media of claim 12, further comprising periodically providing the characteristic information related to the content portion of the casted content (Each frame) (second type of negative space at second time) (Characteristic is the certain type of a region) (Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder (e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]), and 
wherein obtaining the dynamic value of moment comprises periodically obtaining the dynamic value of moment related to the content portion of the casted content (Each frame) (second type of negative space at second time) (Characteristic is the certain type of a region)(Dynamic value of moment is the region) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]) (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. 
In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. 
See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. See paragraph [0024]).

Regarding claim 19, Carter teaches the media of claim 12, further comprising: receiving, from the first entity, an indication of interest in future content portions of the casted content having a particular value of moment (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047]); 
identifying a future content portion of the casted content having the particular value of moment (FIG. 2B is a flow diagram of an illustrative method 230 for identifying one or more target augmentation areas of a video frame or image, as well as determining each target area's surface and pose information. In some embodiments, the method 230 may be implemented by the computing system at block 206 of method 200 described above with reference to FIG. 2A. In other embodiments, block 206 described above may be performed in other manners, and is not limited to the specific method illustrated in FIG. 2B. The method 230 begins at block 232, where the computing system may use a convolutional neural network (CNN) or other machine learning model to identify one or more candidate regions, areas or objects within the image or the current frame of the video for potential augmented overlay. For example, in an embodiment in which the computing system is configured to superimpose an advertisement or other augmentation content over a crowd identified in an arena or stadium depicted in the video, a CNN or other model may have been trained to identify clusters of people. Accordingly, in some embodiments, identifying a given candidate region may include identifying a number of similar objects (such as individual people) that appear in a cluster or pattern in the frame, while in other embodiments a candidate region may be based on a single identified object (such as a portion of the venue, a table, sporting equipment such as a basketball stanchion, etc.). In some embodiments, the candidate region may be identified based on a segmentation process (such as using a CNN or other model) that identifies one or more textures of interest (such as a crowd of people, grass, sky, etc.) in an image or video frame. 
See paragraph [0043])(One or more augmentation-safe or augmentation- appropriate portions of an image or video frame may be determined based on an automated analysis of an image or individual frames of video data using computer vision techniques. Such techniques may include employing machine learning models configured to identify objects or regions of an image or video frame that meet criteria for visual augmentation that may be set in a variety of manners that will be described herein. See paragraph [0022])( For example, a computing system may analyze game footage broadcast on television or posted to social media to identify underutilized areas of the video frames (such as in-venue space, like crowds, that appears in the video and is not the focus of the action or foreground of the video). The computing system may then present a user interface for display to a rights holder that provides a labeled example (such as using a bounding box or other visual indicator) of such a region in a sample image or video portion. For example, the user interface may include a message such as "On your team's social media posts, our system has detected 30% negative space that may be currently underutilized and appropriate for augmentation." If a certain type of candidate region is approved by the user as ad-appropriate, a computing system may then apply advertisement augmentation or other visual augmentation in some such areas within subsequently processed video footage using automated computer vision techniques that identify additional instances of such objects or areas, as will be described below. 
See paragraph [0024]) (As discussed above, in some embodiments, example candidate regions identified by the computing system may have been presented in a user interface to a user associated with an advertiser, broadcaster, venue owner, team, and/or other rights holder for their approval or confirmation that a certain type of region should be considered negative space, augmentation-appropriate and/or otherwise considered for augmentation when the system identifies similar regions in image or video content. Accordingly, the machine learning models used at block 232 may be specific to a given rights holder ( e.g., a certain broadcaster, league or team), specific to a given advertiser (e.g., an advertiser may have indicated to the system that the advertiser's ads should only appear on clouds), specific to a given venue (e.g., trained using video recorded at a certain venue and used only for video recorded at that venue), specific to a given sport (across multiple venues and/or leagues), specific to a given content creator associated with the video (e.g., used for a specific content creator who uploads his user-created videos to a video sharing platform or social networking service), and/or tailored in some other manner. See paragraph [0047])(Characteristic is the certain type of a region)(Dynamic value of moment is the region); and 
causing the supplemental content associated with the first entity to be presented with the future content portion (Next, at block 216, the computing system may apply the mask to the rendered augmentation content (rendered in 3D space at an in-frame location corresponding to the target area), such that the augmentation content only appears at pixel locations corresponding to the candidate region (such as background or negative space) rather than at the location of foreground content, in-game action, or other critical areas of the video from the perspective of a rights holder or viewer, depending on the embodiment. In some embodiments, the content may be overlaid, composited, blended or superimposed with partial transparency relative to the original content of the video frame, such that the original video content is visible beneath the augmented overlay. In other embodiments, the overlay pixel values may completely replace the corresponding pixel values in the original video frame at the augmented pixel locations. See paragraph [0039]).


Allowable Subject Matter
Claims 3, 9, 10, 17, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The prior art of record, alone or in combination, is silent as to the limitations “wherein the computer system is further caused to: provide a notification comprising a mechanism that enables a user to jump to the augmented content portion being casted, the notification being provided responsive to the dynamic value of moment satisfying a threshold value of moment; and cause the augmented content portion to be presented at a user device responsive to activation of the mechanism via the notification.” of claim 3 when read in light of the rest of the limitations in claim 3 and the claims from which claim 3 depends, and thus claim 3 contains allowable subject matter.

The prior art of record, alone or in combination, is silent as to the limitations “providing a notification comprising a mechanism that enables a user to jump to the content portion being casted, the notification being provided responsive to the dynamic value of moment satisfying a threshold value of moment; and causing the content portion to be presented at a user device responsive to activation of the mechanism via the notification.” of claim 9 when read in light of the rest of the limitations in claim 9 and the claims from which claim 9 depends, and thus claim 9 contains allowable subject matter.
Claim 10 contains allowable subject matter because it depends on a claim containing allowable subject matter.

The prior art of record, alone or in combination, is silent as to the limitations “further comprising: providing a notification comprising a mechanism that enables a user to jump to the content portion being casted, the notification being provided responsive to the dynamic value of moment satisfying a threshold value of moment; and causing the content portion to be presented at a user device responsive to activation of the mechanism via the notification.” of claim 17 when read in light of the rest of the limitations in claim 17 and the claims from which claim 17 depends, and thus claim 17 contains allowable subject matter.
Claim 18 contains allowable subject matter because it depends on a claim containing allowable subject matter.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS R WILSON, whose telephone number is (571) 272-0936. The examiner can normally be reached M-F, 7:30 AM-5:00 PM.
Examiner interviews are available via telephone, in person, and via video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To 





/NICHOLAS R WILSON/
Primary Examiner, Art Unit 2611