DETAILED ACTION
This Office Action is in response to the Amendment filed on 09/19/2022. 
In the filed response, claims 1, 7, 8, and 10-19 have been amended, where claims 1, 10, and 19 are independent claims.
Accordingly, Claims 1-20 have been examined and are pending. This Action is made FINAL.


	Response to Arguments
1.	Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see Examiner’s responses below.
2.	As to the limitation “wherein the location information of one of the bounding boxes includes the location offset of the one of the bounding boxes between the first picture and the second picture that is coded in the video bitstream” of claim 7, Applicant argues (pg. 8 of remarks) that nothing in the prior art reference Rybkin discloses or suggests that the location information of the requested content includes a location offset of the content between locations in different frames of the video. Applicant further argues (pg. 9 of remarks) that the frame offset described in Rybkin merely identifies a frame in the video stream and does not indicate a location offset indicating the distance between locations of particular content in different frames.
3.	After careful consideration of Rybkin, the Examiner agrees that Rybkin does not explicitly disclose the aforementioned limitation. The location information (¶0029) may include the bounding box coordinates of an original image of a video stream that contains a particular content item; however, this appears to address offset information for only a single frame, whereas the claim requires “a location offset indicating a distance between (i) a location of a bounding box surrounding the object in the first picture and (ii) a location of a bounding box surrounding the object in a second picture that is coded in the coded video bitstream” as recited in amended claim 1 and as similarly recited in claims 10 and 19.
4.	In light of the amendments, an updated search yielded prior art Cooper et al. WO 2018/049321 A1, hereinafter referred to as Cooper (see PTO 892). Cooper allows a viewer of streaming video to retrieve and view a zoomed-in version of a spatial portion of a video (¶0004). For example, the object of interest may be a soccer ball that moves across the screen through neighboring slices over a temporal sequence, as depicted in Fig. 9. The soccer ball may be identified by a rendering reference point that defines a bounding box specifying the soccer ball’s location and areal extent (¶0079) as it moves between slices in time. These rendering reference points (i.e., labeling information) may be transmitted to the client either in-band as part of the video streams/segments (i.e., in the coded video bitstream) or as side information sent along with the video streams/segments. Once received, the client can use this information to extract the renderable regions as a zoomed region of interest on the client’s display (¶0079). Please note that in some embodiments the rendering reference point is transmitted as two coordinates such as (x, y), while in other embodiments the reference point can be transmitted as a differential from the previous frame (¶0081).
As such, the Examiner respectfully submits that Cooper’s teachings in combination with the current art of record disclose or suggest “a location offset indicating a distance between (i) a location of a bounding box surrounding the object in the first picture and (ii) a location of a bounding box surrounding the object in a second picture that is coded in the coded video bitstream”, since the rendering reference points, which are understood to define the bounding box of the soccer ball’s location and areal extent in time, can be transmitted in-band/out-of-band to a client device as coordinate or differential information, in order to enable video clients to zoom in to a particular region of interest without substantial loss of resolution (abstract).   See the office action below for details.
5.	Examiner acknowledges Applicant’s remarks and amendments in response to the claim objections raised in the last office action. As such, the objections are withdrawn.
6.	The Examiner is available to discuss the matters of this office action to help move the Instant Application forward. Please refer to the conclusion to this office action regarding scheduling interviews.  
7.	Accordingly, Claims 1-20 have been examined and are pending.
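
For illustration only, the distinction noted in response #4 between transmitting a rendering reference point as (x, y) coordinates and transmitting it as a differential from the previous frame can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Cooper or from the claims: the names Absolute, Differential, and reconstruct_positions are hypothetical, and integer pixel coordinates are assumed.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Absolute:
    """Hypothetical reference point sent as full (x, y) coordinates."""
    x: int
    y: int


@dataclass(frozen=True)
class Differential:
    """Hypothetical reference point sent as an offset from the previous frame."""
    dx: int
    dy: int


def reconstruct_positions(
    points: List[Union[Absolute, Differential]],
) -> List[Tuple[int, int]]:
    """Recover one (x, y) position per frame from a mixed sequence."""
    positions: List[Tuple[int, int]] = []
    for point in points:
        if isinstance(point, Absolute):
            positions.append((point.x, point.y))
        else:
            if not positions:
                # A differential is meaningless without a previous frame.
                raise ValueError("differential point has no previous frame")
            prev_x, prev_y = positions[-1]
            positions.append((prev_x + point.dx, prev_y + point.dy))
    return positions
```

In this sketch, a sequence of one Absolute point followed by Differential points yields the per-frame location of the bounding box, and each Differential value corresponds to the distance the box moved between two consecutive frames.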

	
Claim Rejections - 35 USC § 103
8.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-7, 9-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta et al. US 10,817,739 B2, in view of Cooper et al. WO 2018/049321 A1, and in further view of Abe et al. US 2021/0044809 A1, hereinafter referred to as Gupta, Cooper, and Abe, respectively.
Regarding claim 1, Gupta discloses “A method of video coding at a decoder, comprising: receiving metadata associated with a coded video bitstream, the metadata including labeling information of one or more objects detected in a first picture that is coded in the coded video bitstream [Refer to Fig. 18 (col. 33 lines 30-58), which describes a process for generating a selection area that includes one or more objects in an image. Lines 45-58 in particular show extracting metadata from the decoded file of image data (step 1802), which describes one or more objects, the corresponding bounding boxes, and at least one label. Also see col. 14 lines 29-40 for support], the labeling information comprising, for an object of the one or more objects, a location offset indicating a distance between (i) a location of a bounding box surrounding the object in the first picture and (ii) a location of a bounding box surrounding the object in a second picture that is coded in the coded video bitstream; [Gupta, however, does not disclose the foregoing features. See Cooper below for corresponding support] decoding the labeling information of the one or more objects in the first picture that is coded in the coded video bitstream [Since the metadata is extracted from the decoded file of image data (see above), this implies that said metadata containing the bounding boxes of objects and their labels must also have been decoded. For further support, see Cooper and Abe below]; and applying the labeling information to the one or more objects in the first picture.” [See, e.g., Figs. 6 and 7 (and corresponding text), which depict annotated images containing the labeling information associated with the objects]
Although Gupta is found to disclose the aforementioned features, Gupta is silent with respect to “the labeling information comprising, for an object of the one or more objects, a location offset indicating a distance between (i) a location of a bounding box surrounding the object in the first picture and (ii) a location of a bounding box surrounding the object in a second picture that is coded in the coded video bitstream;” as amended. On the other hand, Cooper, from the same or similar field of endeavor, is brought in to teach the above features. [Cooper shows (e.g. Fig. 9 and ¶0077-0081) that an object of interest (a soccer ball) can be spatially tracked in time as it moves between different parts of a video. This can be achieved by transmitting, in-band, rendering reference points that define bounding boxes enclosing the soccer ball. These points can be transmitted as (x, y) information or as a differential relative to a previous frame. Please refer to Examiner’s response #4 for further details.] Cooper is also found to teach or suggest “decoding the labeling information of the one or more objects in the first picture that is coded in the coded video bitstream” [See, e.g., ¶0005 with respect to receiving additional metadata at the client to decode and render one or more objects or areas of interest at a high resolution and/or a zoomed scale, where the spatial support for the objects/areas of interest may vary in time] It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the content aware selection system of Gupta (e.g. Fig. 1) to add the teachings of Cooper as above to provide a video client with the means to track and zoom in to a particular region of interest in a transmitted video stream without substantial loss of resolution (abstract).
Although Gupta and Cooper are found to disclose the foregoing limitation, Abe, from the same or similar field of endeavor, is brought in to provide additional support for “decoding the labeling information of the one or more objects in the first picture that is coded in the coded video bitstream”. [¶0363-0364 describe attribute information of an object in an image (e.g. person, car, ball, etc.) along with its position, size, color, etc., such that a decoder can identify the relevant information about said object. Reference Fig. 30 regarding the data storage structure for the metadata] It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the content aware selection system of Gupta (e.g. Fig. 1) to add the teachings of Abe as above to provide a decoder with metadata of objects in a compressed video stream such that said decoder can identify the position of a desired object and determine which tile or tiles include said object (¶0363-0364); hence, the quality of the decoded video can be improved.
Regarding claim 2, Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Gupta, however, does not teach the features of claim 2. Cooper, on the other hand, from the same or similar field of endeavor, is found to teach “wherein the metadata is included in a supplementary enhancement information (SEI) message in the coded video bitstream.” [See, e.g., ¶0079 of Cooper, where the rendering reference points may be communicated as SEI messages to the client device. Also see Abe ¶0363-0364 for support, where metadata may be stored as a supplemental enhancement information message in HEVC]
Regarding claim 3, Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Gupta, however, does not explicitly teach “wherein the metadata is included in a file that is separate from the coded video bitstream.” Cooper, on the other hand, from the same or similar field of endeavor, is found to teach the foregoing features. [See, e.g., ¶0079, where Cooper’s rendering reference points may be transmitted as side information sent along with the video streams or video segments (i.e., not as part of the video data). Alternatively, the rendering reference points may be specified in an out-of-band communication (e.g., metadata in a manifest such as a DASH MPD)]
The motivation for combining Gupta and Cooper has been discussed in connection with claim 1, above. 
Regarding claim 4,  Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim.  Gupta further discloses “wherein the labeling information indicates a total number of bounding boxes in the first picture and includes location information and size information of each bounding box [object identification data includes bounding boxes describing the location and dimensions (construed as size) of each object (col. 10 lines 14-22). Also see col. 15, lines 19-30 for support. Since bounding boxes are identified for each object, the total number will be known], each bounding box being associated with one of the one or more objects in the first picture.”  [Same as above. Also reference col. 33 lines 45-58 where objects are contained in bounding boxes. Figs. 6-7 illustrate the foregoing]
Regarding claim 5, Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Gupta further discloses “wherein the labeling information includes category information that indicates a category for each of the one or more objects.” [See, e.g., col. 19 lines 43-53, where labels can describe a class or category of objects]
Regarding claim 6, Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Gupta further discloses “wherein the labeling information includes identification information that identifies each of the one or more objects in a video sequence.” [See the labeling information in, e.g., Figs. 6-7, where said information identifies the objects shown in the images]
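
For illustration only, the labeling information addressed in claims 4-6 (a total number of bounding boxes, location and size information for each box, a category, and an identifier) can be pictured as a simple per-picture record. This is a minimal sketch under stated assumptions: the names LabeledBox and PictureLabels and the field layout are hypothetical and are not drawn from Gupta, Cooper, Abe, or the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class LabeledBox:
    """Hypothetical record for one bounding box and its labels."""
    object_id: int   # identification information (claim 6)
    category: str    # category information (claim 5)
    x: int           # location of the top-left corner, in pixels (assumed)
    y: int
    width: int       # size information, in pixels (assumed)
    height: int


@dataclass
class PictureLabels:
    """Hypothetical labeling information for a single picture."""
    boxes: List[LabeledBox] = field(default_factory=list)

    @property
    def num_boxes(self) -> int:
        # The total follows directly from the list of identified boxes.
        return len(self.boxes)

    def categories(self) -> Dict[int, str]:
        # Map each object identifier to its category label.
        return {box.object_id: box.category for box in self.boxes}
```

In this sketch the total number of bounding boxes is derived from the list of boxes rather than stored separately, which mirrors the observation made for claim 4 that the total is known once a bounding box has been identified for each object.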
Regarding claim 7, Gupta, Cooper, and Abe teach all the limitations of claim 4, and are analyzed as previously discussed with respect to that claim. Gupta, however, does not disclose the features of claim 7. Cooper, on the other hand, from the same or similar field of endeavor, is found to disclose or suggest “wherein the location information of one of the bounding boxes includes the location offset of the one of the bounding boxes between the first picture and the second picture that is coded in the video bitstream.” [Cooper shows (Fig. 9 and ¶0077-0081) that an object of interest (a soccer ball) can be spatially tracked in time as it moves between video segments. This can be achieved by transmitting, in-band (i.e., as part of the video streams or segments), rendering reference points that define bounding boxes enclosing the soccer ball. These points can be transmitted as (x, y) information or as a differential relative to a previous frame. Please refer to Examiner’s response #4 for further details.] The motivation for combining Gupta and Cooper has been discussed in connection with claim 1, above.
Regarding claim 9, Gupta, Cooper, and Abe teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Gupta further discloses “further comprising: sending a request to receive the metadata associated with the coded video bitstream.” [See, e.g., col. 31 lines 48-67, where a program can request processing of an image to facilitate identifying the objects in the image]
Regarding claim 10, claim 10 is rejected under the same art and evidentiary limitations as determined for the method of Claim 1. As to the processing circuitry, see, e.g., computing device 2010 of Gupta (Fig. 20).
Regarding claim 11, claim 11 is rejected under the same art and evidentiary limitations as determined for the method of Claim 2.
Regarding claim 12, claim 12 is rejected under the same art and evidentiary limitations as determined for the method of Claim 3.
Regarding claim 13, claim 13 is rejected under the same art and evidentiary limitations as determined for the method of Claim 4.
Regarding claim 14, claim 14 is rejected under the same art and evidentiary limitations as determined for the method of Claim 5. 
Regarding claim 15, claim 15 is rejected under the same art and evidentiary limitations as determined for the method of Claim 6.
Regarding claim 16, claim 16 is rejected under the same art and evidentiary limitations as determined for the method of Claim 7.
Regarding claim 18, claim 18 is rejected under the same art and evidentiary limitations as determined for the method of Claim 9. As to the processing circuitry, see, e.g., computing device 2010 of Gupta (Fig. 20).
Regarding claim 19, claim 19 is rejected under the same art and evidentiary limitations as determined for the method of Claim 1. As to the hardware/software, see, e.g., computing device 2010 of Gupta (Fig. 20) for implementing the functions described therein.
Regarding claim 20, claim 20 is rejected under the same art and evidentiary limitations as determined for the method of Claim 2. As to the hardware/software, see, e.g., computing device 2010 of Gupta (Fig. 20) for implementing the functions described therein.
Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta, in view of Cooper, in further view of Abe, and in further view of Chen et al. US 2020/0175279 A1, hereinafter referred to as Chen.
Regarding claim 8, Gupta, Cooper, and Abe teach all the limitations of claim 4, and are analyzed as previously discussed with respect to that claim. Gupta, Cooper, and Abe, however, do not explicitly disclose the features of claim 8. Chen, on the other hand, from the same or similar field of endeavor, is found to disclose “wherein the location information of one of the bounding boxes indicates a location outside a picture boundary of the first picture for the one of the bounding boxes based on an object associated with the one of the bounding boxes not existing in the first picture.” [Refer to ¶0054 and Figs. 5A and 5B, where, e.g., an object can exceed the boundary of the frame, i.e., does not exist in the frame] It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the techniques of Gupta, Cooper, and Abe to add the image recognition teachings of Chen, which are capable of recognizing identification information of an object in an image (¶0002), such that calculation loading is significantly reduced and the stability of the object recognition is increased (¶0005).
Regarding claim 17, claim 17 is rejected under the same art and evidentiary limitations as determined for the method of Claim 8. 
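
For illustration only, the feature of claim 8, in which the location information of a bounding box indicates a location outside the picture boundary, can be sketched as a simple geometric test. This is a minimal sketch under stated assumptions: the names BoundingBox and lies_outside_picture are hypothetical, a top-left origin with pixel units is assumed, and the sketch is not drawn from Chen or from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical box with a top-left corner (x, y) and a size in pixels."""
    x: int
    y: int
    width: int
    height: int


def lies_outside_picture(box: BoundingBox, pic_width: int, pic_height: int) -> bool:
    """Return True when the box has no overlap with the picture area.

    The picture is assumed to cover x in [0, pic_width) and
    y in [0, pic_height).
    """
    return (
        box.x >= pic_width               # entirely to the right
        or box.y >= pic_height           # entirely below
        or box.x + box.width <= 0        # entirely to the left
        or box.y + box.height <= 0       # entirely above
    )
```

Under these assumptions, a box whose location lies wholly outside the picture area could be read by a decoder as indicating that the associated object does not exist in that picture, while a box that only partially crosses the boundary would not be treated that way.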

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RICHARD A HANSELL JR. whose telephone number is (571)270-0615. The examiner can normally be reached Mon-Fri, 10 am-7 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jamie Atala, can be reached on 571-272-7384. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RICHARD A HANSELL  JR./Primary Examiner, Art Unit 2486