Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Claims 1–17 have been submitted for examination.  
Claims 1–17 have been examined and rejected. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1–17 are rejected under 35 U.S.C. 103 as being unpatentable over Mao (US 6,728,965) in view of Ma et al. (US 8,925,021).
Regarding claim 1, Mao discloses:
A video quality control method comprising: 
receiving a video stream including frames that are classified into a plurality of frame types based on a reference relationship between the frames with regard to decoding; (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”)
selecting a frame type among the plurality of frame types; and (Mao, prioritizes I frames, which are independently renderable, col. 9, ln. 9–25, “If, for example, the subscriber wishes to change TV channels to the one corresponding with video channel X, the I frame of buffer 50 is accessed by microprocessor 55. At time t.sub.1, as shown in FIG. 5, the processor 55 has stored the information corresponding to intracoded frame I.sub.x. At a later time t.sub.2, the processor 55 stores the position of the next intracoded frame I.sub.x+1. As I.sub.x+1 moves through buffer 50, the processor 55 keeps continuous track of the I frame. Accordingly, the processor 55 can immediately synchronize with the video signal stored in the FIFO buffer 50. Since the processor 55 can immediately synchronize with the video signal, it can substantially simultaneously direct the desired video data from FIFO buffer 50 to the multiplexer 44 for eventual transmission downstream to the subscriber.”)
Mao does not explicitly teach “controlling a quality of the video stream by multiplexing the frames included in the video stream and by outputting one or more of the frames that have the selected frame type.”

In a similar field of endeavor Ma teaches:
controlling a quality of the video stream by multiplexing the frames included in the video stream and by outputting one or more of the frames that have the selected frame type. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)

The motivation is “To limit the bandwidth wasted downloading non-key frames when only key frames are used” as taught by Ma (col. 2, ln. 65–67).
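
For illustration only, and not as part of either cited reference, the playout arithmetic in the Ma passage quoted above can be sketched in Python. The function and parameter names, the rounding down to a whole frame, and the 30 frames-per-second rate in the usage lines are assumptions introduced here; Ma gives only the expressions D*F/R, L/F, and S/E.

```python
import math


def frames_to_play(duration_s, frame_rate, rate_multiplier, leading_non_key=0):
    """Frames played from one segment during trick play: (D - L/F) * F / R.

    duration_s      -- D, segment duration in seconds
    frame_rate      -- F, frame rate of the encoded video
    rate_multiplier -- R, trick play playout rate multiplier
    leading_non_key -- L, non-key frames before the first key frame
    """
    if frame_rate <= 0 or rate_multiplier <= 0:
        raise ValueError("frame_rate and rate_multiplier must be positive")
    # Per the quoted passage, D is reduced by L/F when the segment
    # does not begin with a key frame.
    effective_duration = duration_s - leading_non_key / frame_rate
    # Rounding down to a whole frame is an assumption of this sketch.
    return max(0, math.floor(effective_duration * frame_rate / rate_multiplier))


def estimated_duration(size_bits, bitrate_bps):
    """Estimate D as S/E for a byte-based segment."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate_bps must be positive")
    return size_bits / bitrate_bps


# Ma's example: a ten-second segment at 5x playout. At an assumed 30 fps,
# 60 frames correspond to the first two seconds of the segment.
print(frames_to_play(10, 30, 5))
print(frames_to_play(10, 30, 5, leading_non_key=15))
print(estimated_duration(8_000_000, 800_000))
```

With the assumed 30 frames per second, the first call yields 60 frames, which matches the "first two seconds of each segment" in the quoted example.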


Regarding claim 2, the combination of Mao and Ma teaches:
The video quality control method of claim 1, wherein the plurality of frame types comprises a first frame type indicating that a corresponding frame is independently decoded without referring to other frames, a second frame type indicating that the corresponding frame is decoded with reference to a preceding frame, and a third frame type indicating that the corresponding frame is decoded with reference to the preceding frame and a subsequent frame. (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”) (Ma, col. 2, ln. 15–30, “the client extracts independently renderable key frames (e.g., MPEG I-frames, JPEG images, etc.) from the segment. The inter-key-frame gaps are referred to herein as a group of pictures (GOP) size (the number of frames between key frames) or GOP duration (the amount of wall clock time between key frames, also calculated as the GOP size divided by the frame rate).”)


Regarding claim 3, the combination of Mao and Ma teaches:
The video quality control method of claim 2, wherein the controlling of the quality of the video stream comprises: providing the video stream with a first quality by outputting the frames of the first frame type, the second frame type, and the third frame type through a first channel; providing the video stream with a second quality by outputting the frames of the first frame type and the second frame type through a second channel; and providing the video stream with a third quality by outputting the frames of the first frame type through a third channel. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. 
In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)
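
As a reading aid only, the three quality levels recited in claim 3 can be sketched as a filter over frame types. The sketch below illustrates the claim language under stated assumptions and is not the method of Mao or Ma; the `Frame` class, the tier names, and the sample 15-frame group of pictures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int
    kind: str  # "I", "P", or "B"


# Hypothetical mapping of the three recited qualities to frame types.
TIERS = {
    "first": {"I", "P", "B"},   # all frames
    "second": {"I", "P"},       # bidirectional frames omitted
    "third": {"I"},             # independently decodable frames only
}


def select_frames(frames, tier):
    """Return, in order, the frames whose type is allowed for the tier."""
    allowed = TIERS[tier]
    return [f for f in frames if f.kind in allowed]


# Hypothetical 15-frame group of pictures: 1 I, 4 P, and 10 B frames.
gop = [Frame(i, k) for i, k in enumerate("IBBPBBPBBPBBPBB")]

for tier in ("first", "second", "third"):
    kinds = "".join(f.kind for f in select_frames(gop, tier))
    print(tier, len(kinds), kinds)
```

For this sample group of pictures the three tiers keep 15, 5, and 1 frames respectively.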

Regarding claim 4, the combination of Mao and Ma teaches:
The video quality control method of claim 1, wherein the receiving of the video stream comprises receiving the frames of the video stream that are encoded at a terminal of a sender and transmitted over a network, and the quality control method further comprises relaying the video stream of which the quality is controlled by transmitting the one or more of the frames having the selected frame type through a channel accessible by a terminal of a viewer. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. 
In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)

Regarding claim 5, the combination of Mao and Ma teaches:
The video quality control method of claim 4, further comprising transmitting the one or more of the frames having the selected frame type to a monitoring terminal for monitoring or inspecting the video stream. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)

Regarding claim 6, the combination of Mao and Ma teaches:
A non-transitory computer-readable record medium storing computer instructions that, when executed by a processor, cause the processor to perform the video quality control method of claim 1. (Mao, col. 3, ln. 33–43) (Ma, col. 5, ln. 40–55, “It will be appreciated that the term "server" used herein refers to a general-purpose or special-purpose computer, generally including memory, input/output circuitry, and instruction processing logic along with interconnections such as one or more high-speed data buses connecting those components together. Many aspects of the disclosed techniques can be embodied as software executing on one or more server computers. Similarly, a "client" herein is a computerized device (also including the above components) capable of receiving content from a network connection and decoding and rending the content on a display or similar output device. So-called smartphones are specifically included within the definition of client as used herein.”)

Regarding claim 7, Mao discloses:
A computer apparatus comprising: at least one processor configured to execute computer-readable instructions, wherein the at least one processor (Mao, col. 3, ln. 33–43) is configured to: 
receive a video stream including frames that are classified into a plurality of frame types based on a reference relationship between the frames with regard to decoding, (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”)
select a frame type among the plurality of frame types; and (Mao, prioritizes I frames, which are independently renderable, col. 9, ln. 9–25, “If, for example, the subscriber wishes to change TV channels to the one corresponding with video channel X, the I frame of buffer 50 is accessed by microprocessor 55. At time t.sub.1, as shown in FIG. 5, the processor 55 has stored the information corresponding to intracoded frame I.sub.x. At a later time t.sub.2, the processor 55 stores the position of the next intracoded frame I.sub.x+1. As I.sub.x+1 moves through buffer 50, the processor 55 keeps continuous track of the I frame. Accordingly, the processor 55 can immediately synchronize with the video signal stored in the FIFO buffer 50. Since the processor 55 can immediately synchronize with the video signal, it can substantially simultaneously direct the desired video data from FIFO buffer 50 to the multiplexer 44 for eventual transmission downstream to the subscriber.”)
Mao does not explicitly teach “control a quality of the video stream by multiplexing the frames included in the video stream and by outputting one or more of the frames that have the selected frame type.”
In a similar field of endeavor Ma teaches:
control a quality of the video stream by multiplexing the frames included in the video stream and by outputting one or more of the frames that have the selected frame type.  (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system for . . . as taught by Mao with the system for . . . with the change in control of quality based on frame type as taught by Ma.
The motivation is “To limit the bandwidth wasted downloading non-key frames when only key frames are used” as taught by Ma (col. 2, ln. 65–67).


Regarding claim 8, the combination of Mao and Ma teaches:
The computer apparatus of claim 7, wherein the plurality of frame types comprises a first frame type indicating that a corresponding frame is independently decoded without referring to other frames, a second frame type indicating that the corresponding frame is decoded with reference to a preceding frame, and a third frame type indicating that the corresponding frame is decoded with reference to the preceding frame and a subsequent frame. (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”) (Ma, col. 2, ln. 15–30, “the client extracts independently renderable key frames (e.g., MPEG I-frames, JPEG images, etc.) from the segment. The inter-key-frame gaps are referred to herein as a group of pictures (GOP) size (the number of frames between key frames) or GOP duration (the amount of wall clock time between key frames, also calculated as the GOP size divided by the frame rate).”)

Regarding claim 9, the combination of Mao and Ma teaches:
The computer apparatus of claim 8, wherein the at least one processor is further configured to provide the video stream with a first quality by outputting the frames of the first frame type, the second frame type, and the third frame type through a first channel, provide the video stream with a second quality by outputting the frames of the first frame type and the second frame type through a second channel, and provide the video stream with a third quality by outputting the frames of the first frame type through a third channel. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. 
In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Regarding claim 10, the combination of Mao and Ma teaches:
The computer apparatus of claim 8, wherein the at least one processor is further configured to receive the frames of the video stream that are encoded at a terminal of a sender and transmitted over a network, and relay the video stream of which the quality is controlled by transmitting the one or more (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Regarding claim 11, the combination of Mao and Ma teaches:
The computer apparatus of claim 10, wherein the at least one processor is further configured to transmit the one or more of the frames having the selected frame type to a monitoring terminal for monitoring or inspecting the video stream. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)

Regarding claim 12, Mao discloses:
A server for providing a video stream to a client device, the server comprising: a memory configured to store computer-readable instructions; and at least one processor configured to execute the computer-readable instructions to: (Mao, col. 3, ln. 33–43)
receive a video stream comprising a plurality of frames having a plurality of different frame types from another client device; (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”)
select a frame type among the plurality of different frame types based on a network state between the server and the client device; and (Mao, prioritizes I frames, which are independently renderable, col. 9, ln. 9–25, “If, for example, the subscriber wishes to change TV channels to the one corresponding with video channel X, the I frame of buffer 50 is accessed by microprocessor 55. At time t.sub.1, as shown in FIG. 5, the processor 55 has stored the information corresponding to intracoded frame I.sub.x. At a later time t.sub.2, the processor 55 stores the position of the next intracoded frame I.sub.x+1. As I.sub.x+1 moves through buffer 50, the processor 55 keeps continuous track of the I frame. Accordingly, the processor 55 can immediately synchronize with the video signal stored in the FIFO buffer 50. Since the processor 55 can immediately synchronize with the video signal, it can substantially simultaneously direct the desired video data from FIFO buffer 50 to the multiplexer 44 for eventual transmission downstream to the subscriber.”)
Mao does not explicitly teach “selectively transmit one or more of the plurality of frames according to the selected frame type.”
In a similar field of endeavor Ma teaches:
selectively transmit one or more of the plurality of frames according to the selected frame type. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5.times., downloading and playing just the first two seconds of each segment would achieve a 5.times. playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system for . . . as taught by Mao with the system for . . . with the change in control of quality based on frame type as taught by Ma.
The motivation is “To limit the bandwidth wasted downloading non-key frames when only key frames are used” as taught by Ma (col. 2, ln. 65–67).


Regarding claim 13, the combination of Mao and Ma teaches:
The server of claim 12, wherein the plurality of frames comprise intra frames (I-frames), predictive frames (P-frames), and bidirectional predictive frames (B-frames). (Mao, col. 7, ln. 25–40, “the compressed format used to encode the video data is the Moving Pictures Experts Group 2 format, known as MPEG-2. As shown in FIG. 5, the data is transmitted as one of three basic frames. The GOP start point is coded in the "I" or intracoded frames. In addition to the frames, there are predicted frames and bidirectional frames (P and B frames, respectively). The P and B frames normally contain the video and audio content. Each synchronization frame is separated from the next synchronization frame by a pre-determined number of other frames. This predetermined number can be set to accommodate a specific requirement but in one preferred embodiment is fifteen frames.”) (Ma, col. 2, ln. 15–30, “the client extracts independently renderable key frames (e.g., MPEG I-frames, JPEG images, etc.) from the segment. The inter-key-frame gaps are referred to herein as a group of pictures (GOP) size (the number of frames between key frames) or GOP duration (the amount of wall clock time between key frames, also calculated as the GOP size divided by the frame rate).”)


Regarding claim 14, the combination of Mao and Ma teaches:
The server of claim 13, wherein the at least one processor is further configured to execute the computer-readable instructions to: set a quality of the video stream to be transmitted to the client device, to a first video quality, a second video quality, or a third video quality, according to the network (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5×, downloading and playing just the first two seconds of each segment would achieve a 5× playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Regarding claim 15, the combination of Mao and Ma teaches:
The server of claim 12, wherein the at least one processor is further configured to execute the computer-readable instructions to: determine the network state based on information of a traffic delay provided from the client device. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5×, downloading and playing just the first two seconds of each segment would achieve a 5× playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Regarding claim 16, the combination of Mao and Ma teaches:
The server of claim 12, wherein the at least one processor comprises a network state monitor configured to monitor the video stream and determine the network state based on the monitored video stream. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5×, downloading and playing just the first two seconds of each segment would achieve a 5× playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Regarding claim 17, the combination of Mao and Ma teaches:
The server of claim 12, wherein the at least one processor is further configured to execute the computer-readable instructions to: communicate with a monitoring terminal configured to monitor the video stream and determine the network state based on the monitored video stream; and receive information of the network state from the monitoring terminal. (Ma, cols. 2–3, lns. 64–23, “fast forward uses bursts of consecutive frames (both key frames and non-key frames). To limit the bandwidth wasted downloading non-key frames when only key frames are used, the initial portions of segments may be used and the latter portion of segments skipped, e.g., given a segment duration of ten seconds, and a trick play playout rate of 5×, downloading and playing just the first two seconds of each segment would achieve a 5× playout rate without a commensurate increase in bandwidth usage. In one embodiment, all segments are of fixed duration. In another embodiment, all segments contain a fixed number of bytes. In one embodiment, all segments begin with a key frame. In another embodiment, segments may begin with non-key frames. In general, the first (D*F/R) frames of each segment are played out, where D is the time duration of each segment, F is the frame rate of the encoded video, and R is the trick play playout rate multiplier. If segments do not begin with a key frame, playback should begin with the first key frame in the segment. The duration D should be reduced by (L/F), where L is the number of leading non-key frames and F is the frame rate of the encoded video. If the segment is not duration-based, i.e., if the segment is byte-based, the duration D should be estimated as (S/E), where S is the size of the segment in bits and E is the encoded bitrate of the content. In one embodiment, segments with short durations D (e.g., less than 10 seconds) may be concatenated to improve rendering continuity.”)


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL B. PIERORAZIO, whose telephone number is (571) 270-3679.  The examiner can normally be reached Monday through Thursday, 8 a.m. to 5 p.m.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nasser Goodarzi, can be reached at (571) 270-4195.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/MICHAEL B. PIERORAZIO/Primary Examiner, Art Unit 2426