DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

35 USC § 112 (f)
The following is a quotation of 35 U.S.C. 112(f): 
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph: 
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims 28-36 in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
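As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 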
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Because the limitations of claims 28-36 that use the word “means” are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.



Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


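(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.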


Claims 1-2, 6, 10-11, 15, 19-20, 24, 28-29, 33 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mierle et al. (US Pub 2017/0243324 A1).
As to claim 1, Mierle discloses a method for generating metadata by a host device to aid warping of a rendered frame (Mierle, abstract), comprising: 
generating the rendered frame based on head tracking information of a user (¶0029, “A graphics engine may be used to render graphics of the VR content for display based on a current pose of the VR headset or user's head. Rendering may include a process of generating an image, such as, for example, rendering an image from a 2D (2 dimensional) or 3D (3 dimensional) model by a computer program or software that is run by a processor.”);
identifying a region of interest (ROI) of the rendered frame (¶0039, “the VR headset 100 may include a gaze tracking device 165 to detect and track an eye gaze of the user.” “the VR headset 100 may be configured so that the detected gaze is processed as a user input to be translated into a corresponding interaction in the immersive virtual experience.” ¶0029, “Time-warping may be used to warp (e.g., rotate or adjust) an image or frame to correct for head motion that occurred after (or while) the scene was rendered and thereby reduce perceived latency. For example, a homograph warp may use homography transformation of the image to rotate the image based on post-rendering pose information.”); 
generating metadata for a warping operation from the ROI (¶0061, “a first time-warping may be performed (e.g., in a first processing pipeline) for a scene based on updated head pose information (e.g., updated pose for VR headset); a second time-warping of an object, e.g., a graphical overlay for the sword (which is controlled by the VR controller) may be performed (e.g., in a second processing pipeline) based on updated pose information for the VR controller 424A (because, in this example, the sword, displayed in the virtual world, is controlled by movement of the VR controller 424A in the physical world); and a third time-warping of the dragon or a graphical overlay for the dragon (where motion or pose of the dragon is controlled by the VR application 412) may be performed (e.g., in a third processing pipeline) based on updated pose information (e.g., received from the VR application 412) for the dragon.”); and 
transmitting the rendered frame and the metadata for a warping operation of the rendered frame (¶0029, Fig. 1, Fig. 4, ¶0031, “a separate processing pipeline may be provided for each of the VR headset and the VR controller (or object), where the processing pipeline may include, for example, a graphics engine and an electronic display stabilization (EDS) engine to perform time-warping.” ¶0107, “a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.” A transmission happens between system components.).

As to claim 2, claim 1 is incorporated and Mierle discloses the ROI is determined from eye tracking information of the user (¶0039, “the VR headset 100 may include a gaze tracking device 165 to detect and track an eye gaze of the user.” “the VR headset 100 may be configured so that the detected gaze is processed as a user input to be translated into a corresponding interaction in the immersive virtual experience.”).

As to claim 6, claim 1 is incorporated and Mierle discloses the ROI is determined from content information of the rendered frame (¶0039, “the VR headset 100 may include a gaze tracking device 165 to detect and track an eye gaze of the user.” “the VR headset 100 may be configured so that the detected gaze is processed as a user input to be translated into a corresponding interaction in the immersive virtual experience.”).

As to claim 10, Mierle discloses an apparatus, comprising: a memory storing processor readable code; and a processor coupled to the memory and configured to execute the processor readable code to cause the apparatus to: generate a rendered frame based on head tracking information of a user; identify a region of interest (ROI) of the rendered frame; generate metadata for a warping operation from the ROI; and transmit the rendered frame and the metadata for a warping operation of the rendered frame (See claim 1 for detailed analysis.).

As to claim 11, claim 10 is incorporated and Mierle discloses the ROI is determined from eye tracking information of the user (See claim 2 for detailed analysis.).

As to claim 15, claim 10 is incorporated and Mierle discloses the ROI is determined from content information of the rendered frame (See claim 6 for detailed analysis.).

As to claim 19, Mierle discloses a non-transitory computer-readable medium storing computer executable code, the code when executed by a processor causes the processor to: generate a rendered frame based on head tracking information of a user; identify a region of interest (ROI) of the rendered frame; generate metadata for a warping operation from the ROI; and transmit the rendered frame and the metadata for a warping operation of the rendered frame (See claim 1 for detailed analysis.).

As to claim 20, claim 19 is incorporated and Mierle discloses the ROI is determined from eye tracking information of the user (See claim 2 for detailed analysis.).

As to claim 24, claim 19 is incorporated and Mierle discloses the ROI is determined from content information of the rendered frame (See claim 6 for detailed analysis.).

As to claim 28, Mierle discloses an apparatus of a host device to aid warping of a rendered frame, comprising: means for generating the rendered frame based on head tracking information of a user; means for identifying a region of interest (ROI) of the rendered frame; means for generating metadata for a warping operation from the ROI; and means for transmitting the rendered frame and the metadata for a warping operation of the rendered frame (See claim 1 for detailed analysis.).

As to claim 29, claim 28 is incorporated and Mierle discloses the ROI is determined from eye tracking information of the user (See claim 2 for detailed analysis.).

As to claim 33, claim 28 is incorporated and Mierle discloses the ROI is determined from content information of the rendered frame (See claim 6 for detailed analysis.).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 3-5, 7-8, 12-14, 16-17, 21-23, 25-26, 30-32, 34-35 are rejected under 35 U.S.C. 103 as being unpatentable over Mierle et al. (US Pub 2017/0243324 A1) in view of Thirumalai et al. (US Pub 2015/0003521 A1).

As to claim 3, claim 2 is incorporated and Mierle does not disclose generating the metadata comprises computing a single depth approximation of a plurality of pixel depths of pixels within the ROI.
Thirumalai teaches computing a single depth approximation of a plurality of pixel depths of pixels within the ROI (Thirumalai, ¶0103, “video encoder 20 may implement a “weighted average” approach to derive the depth value of the current block from the depth values associated with the three neighboring samples. More specifically, video encoder 20 may respectively assign a weight to each depth value of each neighboring sample, and multiply each depth value by the assigned weight to obtain three weighted product values. In turn, video encoder 20 may sum the three product values, and divide the sum by a predetermined constant (e.g., 16) to obtain a resulting value. Video encoder 20 may use the resulting value as the depth value for the current block.” ¶0337, “Video encoder 20 may derive the disparity vector from the single depth value, which video encoder 20 may calculate using a weighted average of three neighboring reconstructed depth samples with weights (5,5,6). More specifically, video encoder may calculate a single depth value as follows:”)
Mierle and Thirumalai are considered to be analogous art because both pertain to 3D video. It would have been obvious before the effective filing date of the claimed invention to have modified Mierle with the features of “computing a single depth approximation of a plurality of pixel depths of pixels within the ROI” as taught by Thirumalai. The suggestion/motivation would have been in order to improve accuracy of motion vector prediction for dependent depth views by leveraging a greater number of motion vector candidates from already-coded motion information for a depth base view (Thirumalai, ¶0046).
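For illustration only, the weighted-average computation described in the quoted passages of Thirumalai (¶0103, ¶0337) can be sketched as follows. The function name, the use of integer division, and the sample values are assumptions made for this sketch; they are not taken from Thirumalai, Mierle, or the present application, and the exact formula that follows "as follows:" in ¶0337 is not reproduced here.

```python
def weighted_depth(samples, weights=(5, 5, 6), divisor=16):
    """Combine neighboring depth samples into a single depth value.

    Each sample is multiplied by its weight, the products are summed,
    and the sum is divided by a predetermined constant (16 in the
    quoted example). Integer division is an assumption of this sketch.
    """
    if len(samples) != len(weights):
        raise ValueError("one weight is required per depth sample")
    total = sum(w * d for w, d in zip(weights, samples))
    return total // divisor


# Hypothetical depth samples for three neighboring pixels.
single_depth = weighted_depth((10, 20, 30))
```

Because the weights (5, 5, 6) sum to the divisor 16, three equal samples yield that same value as the single depth.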

As to claim 4, claim 3 is incorporated and the combination of Mierle and Thirumalai discloses computing the single depth approximation comprises computing a harmonic mean depth of the plurality of pixel depths of the pixels within the ROI (Thirumalai, ¶0101, “video encoder 20 may obtain the average by calculating the mean value of the three depth values” ¶0121, “to determine the depth value, the one or more processors are configured to calculate at least one of a mean value, a median value, or a mode value associated with the one or more neighboring pixels.”).
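For reference, the two quantities named in the preceding paragraph, the harmonic mean depth recited in the claim and the mean value quoted from Thirumalai (¶0101), can be computed as sketched below. The function names and sample depths are hypothetical and appear in neither reference; the sketch assumes strictly positive depths.

```python
def arithmetic_mean(depths):
    """Sum of the depths divided by their count."""
    return sum(depths) / len(depths)


def harmonic_mean(depths):
    """Count of the depths divided by the sum of their reciprocals."""
    if any(d <= 0 for d in depths):
        raise ValueError("harmonic mean requires strictly positive depths")
    return len(depths) / sum(1.0 / d for d in depths)


# Hypothetical pixel depths within an ROI.
roi_depths = [1.0, 2.0, 4.0]
mean_depth = arithmetic_mean(roi_depths)          # 7/3
harmonic_mean_depth = harmonic_mean(roi_depths)   # 12/7
```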

As to claim 5, claim 3 is incorporated and the combination of Mierle and Thirumalai discloses computing the single depth approximation comprises computing a weighted average of the plurality of pixel depths of the pixels within the ROI by applying weighting factors to the plurality of pixel depths, wherein the weighting factors are selected to favor contributions from a subset of the pixels that are closer to a center of the ROI (Thirumalai, ¶0083, “Each sample in the prediction block may be a weighted average of corresponding samples in the reference blocks. The weighting of the samples may be based on temporal distances of the reference pictures from the picture containing the PU.” ¶0103, “video encoder 20 may implement a “weighted average” approach to derive the depth value of the current block from the depth values associated with the three neighboring samples. More specifically, video encoder 20 may respectively assign a weight to each depth value of each neighboring sample, and multiply each depth value by the assigned weight to obtain three weighted product values. In turn, video encoder 20 may sum the three product values, and divide the sum by a predetermined constant (e.g., 16) to obtain a resulting value. Video encoder 20 may use the resulting value as the depth value for the current block.” ¶0104-105, ¶0164, “the motion of the right-bottom PU within the center PUs of the CU containing the current PU is used.” ¶0180, ¶0246, ¶0337.). 
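As a minimal sketch of the claimed weighting, a weighted average that favors pixels closer to the center of the ROI might look like the following. The weighting function 1 / (1 + distance), the dictionary layout, and all names are assumptions chosen for illustration; neither the cited references nor the present application is represented as using this particular weighting.

```python
import math


def center_weighted_depth(depths, center):
    """Weighted average of pixel depths favoring pixels near the ROI center.

    depths: mapping of (x, y) pixel coordinates to depth values.
    center: (x, y) coordinates of the ROI center.
    The weight 1 / (1 + Euclidean distance) is a hypothetical choice.
    """
    if not depths:
        raise ValueError("ROI contains no pixels")
    cx, cy = center
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), depth in depths.items():
        weight = 1.0 / (1.0 + math.hypot(x - cx, y - cy))
        weighted_sum += weight * depth
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical two-pixel ROI centered on the first pixel.
roi = {(0, 0): 1.0, (2, 0): 3.0}
depth = center_weighted_depth(roi, center=(0, 0))
```

In this example the unweighted average of the two depths would be 2.0; the pixel at the center carries three times the weight of the pixel two units away, which pulls the result toward the center pixel's depth.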

As to claim 7, claim 6 is incorporated and the combination of Mierle and Thirumalai discloses generating the metadata comprises computing a single depth approximation of a plurality of pixel depths of pixels within the ROI (Thirumalai, ¶0337, “Video encoder 20 may derive the disparity vector from the single depth value, which video encoder 20 may calculate using a weighted average of three neighboring reconstructed depth samples with weights (5,5,6). More specifically, video encoder may calculate a single depth value as follows:” ¶0396.).

As to claim 8, claim 1 is incorporated and Mierle discloses generating the metadata comprises: analyzing content information of the rendered frame within the ROI (¶0039, “the VR headset 100 may include a gaze tracking device 165 to detect and track an eye gaze of the user.” “the VR headset 100 may be configured so that the detected gaze is processed as a user input to be translated into a corresponding interaction in the immersive virtual experience.”).
Mierle does not disclose generating a motion vector grid size as the metadata based on the analyzing, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation.
Thirumalai teaches generating a motion vector grid size as the metadata based on the analyzing, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation (Thirumalai, Fig. 8, ¶0011, “determining a depth value associated with a block of video data included in a dependent depth view based on one or more neighboring pixels positioned adjacent to the block of video data in the dependent depth view, and generating a disparity vector associated with the block of video data based at least in part on the determined depth value associated with the block of video data.” ¶0046, “derive motion vector candidates (e.g., with which to populate a merge list) from the base depth view” ¶0079, “Video encoder 20 may generate, based at least in part on samples corresponding to the first and second reference locations, the predictive blocks for the PU. Moreover, when using bi-prediction to encode the PU, video encoder 20 may generate a first motion vector indicating a spatial displacement between a prediction block of the PU and the first reference location and a second motion vector indicating a spatial displacement between the prediction block of the PU and the second reference location.” Fig. 10. ¶0029. ¶0070, “A picture may include three sample arrays, denoted SL, SCb, and SCr. SL is a two-dimensional array (i.e., a block) of luma samples.” ¶0073, “one or more sample blocks and syntax structures used to code samples of the one or more blocks of samples.” ¶0098-0102.)
Mierle and Thirumalai are considered to be analogous art because both pertain to 3D video. It would have been obvious before the effective filing date of the claimed invention to have modified Mierle with the features of “generating a motion vector grid size as the metadata based on the analyzing, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation” as taught by Thirumalai. The suggestion/motivation would have been in order to improve accuracy of motion vector prediction for dependent depth views by leveraging a greater number of motion vector candidates from already-coded motion information for a depth base view (Thirumalai, ¶0046).

As to claim 12, claim 11 is incorporated and the combination of Mierle and Thirumalai discloses to generate the metadata, the processor when executing the processor readable code further causes the apparatus to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 3 for detailed analysis.).

As to claim 13, claim 12 is incorporated and the combination of Mierle and Thirumalai discloses to compute the single depth approximation of the plurality of pixel depths of pixels within the ROI, the processor when executing the processor readable code further causes the apparatus to compute a harmonic mean depth of the plurality of pixel depth of the pixels within the ROI (See claim 4 for detailed analysis.).

As to claim 14, claim 12 is incorporated and the combination of Mierle and Thirumalai discloses to compute the single depth approximation of the plurality of pixel depths of pixels within the ROI, the processor when executing the processor readable code further causes the apparatus to apply weighting factors to the plurality of pixel depths to compute a weighted average of the plurality of pixels depths of the pixels within the ROI, wherein the weighting factors are selected to favor contributions from a subset of the pixels that are closer to a center of the ROI (See claim 5 for detailed analysis.).

As to claim 16, claim 15 is incorporated and the combination of Mierle and Thirumalai discloses to generate the metadata, the processor when executing the processor readable code further causes the apparatus to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 7 for detailed analysis.).

As to claim 17, claim 10 is incorporated and the combination of Mierle and Thirumalai discloses to generate the metadata, the processor when executing the processor readable code further causes the apparatus to: analyze content information of the rendered frame within the ROI; and generate a motion vector grid size as the metadata based on the content information analyzed, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation (See claim 8 for detailed analysis.).


As to claim 21, claim 20 is incorporated and the combination of Mierle and Thirumalai discloses the code when executed by the processor causes the processor to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 3 for detailed analysis.).

As to claim 22, claim 21 is incorporated and the combination of Mierle and Thirumalai discloses the code when executed by the processor causes the processor to compute a harmonic mean depth of the plurality of pixel depth of the pixels within the ROI (See claim 4 for detailed analysis.).

As to claim 23, claim 21 is incorporated and the combination of Mierle and Thirumalai discloses the code when executed by the processor causes the processor to apply weighting factors to the plurality of pixel depths to compute a weighted average of the plurality of pixels depths of the pixels within the ROI, wherein the weighting factors are selected to favor contributions from a subset of the pixels that are closer to a center of the ROI (See claim 5 for detailed analysis.).

As to claim 25, claim 24 is incorporated and the combination of Mierle and Thirumalai discloses the code when executed by the processor causes the processor to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 7 for detailed analysis.).

As to claim 26, claim 19 is incorporated and the combination of Mierle and Thirumalai discloses the code when executed by the processor causes the processor to: analyze content information of the rendered frame within the ROI; and generate a motion vector grid size as the metadata based on the content information analyzed, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation (See claim 8 for detailed analysis.).

As to claim 30, claim 29 is incorporated and the combination of Mierle and Thirumalai discloses the means for generating the metadata is configured to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 3 for detailed analysis.).

As to claim 31, claim 30 is incorporated and the combination of Mierle and Thirumalai discloses the means for computing the single depth approximation is configured to compute a harmonic mean depth of the plurality of pixel depth of the pixels within the ROI (See claim 4 for detailed analysis.).

As to claim 32, claim 30 is incorporated and the combination of Mierle and Thirumalai discloses the means for computing the single depth approximation is configured to compute a weighted average of the plurality of pixels depths of the pixels within the ROI by applying weighting factors to the plurality of pixel depths, wherein the weighting factors are selected to favor contributions from a subset of the pixels that are closer to a center of the ROI (See claim 5 for detailed analysis.).

As to claim 34, claim 33 is incorporated and the combination of Mierle and Thirumalai discloses the means for generating the metadata is configured to compute a single depth approximation of a plurality of pixel depths of pixels within the ROI (See claim 7 for detailed analysis.).

As to claim 35, claim 28 is incorporated and the combination of Mierle and Thirumalai discloses the means for generating the metadata is configured to: analyze content information of the rendered frame within the ROI; and generate a motion vector grid size as the metadata based on the analyzing, wherein the motion vector grid size is used to sample motion vectors of the rendered frame during the warping operation (See claim 8 for detailed analysis.).


Claims 9, 18, 27, 36 are rejected under 35 U.S.C. 103 as being unpatentable over Mierle et al. (US Pub 2017/0243324 A1) in view of Thirumalai et al. (US Pub 2015/0003521 A1) and Cuervo et al. (US Pub 2019/0155372 A1).
As to claim 9, claim 8 is incorporated and the combination of Mierle and Thirumalai does not disclose receiving the head tracking information from a client device; and transmitting the rendered frame and the metadata for the warping operation of the rendered frame to the client device.
Cuervo teaches receiving the head tracking information from a client device (Cuervo, abstract, “the pose is provided to the server”); and
transmitting the rendered frame and the metadata for the warping operation of the rendered frame to the client device (Cuervo, ¶0006, “The HMD includes one or more optical receivers and demodulates the optical beam including the data transmissions from the server that include the rendered panoramic frames and associated poses, and stores the panoramic frames to populate the frame cache. As the system operates, the frame cache on the HMD is continuously updated and populated with panoramic frames received from the server over the optical beam for the last pose sent to the server. As the frame cache is being populated and updated, the HMD continues to update the pose, retrieve appropriate panoramic frames from the frame cache that are the best match for the pose, and display a view of the virtual environment to the user.”).
Cuervo, abstract).

As to claim 18, claim 17 is incorporated and the combination of Mierle, Thirumalai and Cuervo discloses the processor when executing the processor readable code further causes the apparatus to: receive the head tracking information from a client device; and transmit the rendered frame and the metadata for the warping operation of the rendered frame to the client device (See claim 9 for detailed analysis.).

As to claim 27, claim 26 is incorporated and the combination of Mierle, Thirumalai and Cuervo discloses the code when executed by the processor causes the processor to: receive the head tracking information from a client device; and transmit the rendered frame and the metadata for the warping operation of the rendered frame to the client device (See claim 9 for detailed analysis.).

As to claim 36, claim 35 is incorporated and the combination of Mierle, Thirumalai and Cuervo discloses means for receiving the head tracking information from a client device; and means for transmitting the rendered frame and the metadata for the warping operation of the rendered frame to the client device (See claim 9 for detailed analysis.).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YU CHEN whose telephone number is (571)270-7951.  The examiner can normally be reached on M-F 8-5 PST Mid-day flex.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached on 571-270-7951.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.





/YU CHEN/Primary Examiner, Art Unit 2613