DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-44 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
While claim 1, line 7, recites “the video streams”, it is unclear if this is meant to reference the “video streams” of line 1, the “first video stream” of line 2, and/or “the second video stream” of line 3.

While claim 1, line 2, recites “a first video stream”, it is unclear if this is meant to reference one of the “video streams” of line 1 or a different video stream.

While claim 1, line 3, recites “a second video stream”, it is unclear if this is meant to reference one of the “video streams” of line 1 or a different video stream.

While claim 10, line 5, recites “the video stream”, it is unclear if this is meant to reference the “first video stream” or “the second video stream” of claim 1 or a different video stream.

Claim 12 recites the limitation "the combinations of cross-space extracted features" in line 8.  There is insufficient antecedent basis for this limitation in the claim.

Claim 13 recites the limitation "the combinations of per group-of-frame extracted features" in lines 2-3.  There is insufficient antecedent basis for this limitation in the claim.

While claim 22, line 4, recites “a video stream”, it is unclear if this is meant to reference the “a video stream” of line 3 or a different video stream.

Claim 33 recites the limitation "the combinations of cross-space extracted features" in line 9.  There is insufficient antecedent basis for this limitation in the claim.

Claim 36 recites the limitation "the first video stream" in line 1.  There is insufficient antecedent basis for this limitation in the claim.

Claim 36 recites the limitation "the second video stream" in line 2.  There is insufficient antecedent basis for this limitation in the claim.

Claim 37 recites the limitation "the first video stream" in line 3.  There is insufficient antecedent basis for this limitation in the claim.

Claim 37 recites the limitation "the second video stream" in lines 8-9.  There is insufficient antecedent basis for this limitation in the claim.

While claim 43, line 5, recites “a video stream”, it is unclear if this is meant to reference the “a video stream” of line 4 or a different video stream.

Claim 44 recites the limitation "the first video stream" in line 3.  There is insufficient antecedent basis for this limitation in the claim.

Claim 44 recites the limitation "the second video stream" in lines 8-9.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 7-9, 12-15, 22-25, 28-30, 33-36, and 43 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wolf et al. (Wolf) (US 5,446,492).
	As to claim 1, Wolf discloses a method for identifying real-time latency of video streams, comprising:
buffering frames of a first video stream into a first buffer (Fig. 2, 19-20; column 3, line 66-column 4, line 25);
buffering frames of a second video stream into a second buffer (Fig. 2, 27-28; column 4, line 36-62);
identifying a control group as a subset of frames of the second buffer (frame samples with motion used for time correlation; column 9, lines 19-25);
computing correlations of extracted features of the control group to extracted features of successive sliding windows of the first buffer (time alignment processing using successive windows of frames from each buffer; Fig. 4, column 6, line 26-60, column 7, line 51-column 9, line 18), the extracted features being based on spatial information and temporal information of the video streams (column 3, line 21-45, column 6, line 38-51, column 11, line 13-18); and
identifying a delay between the first video stream and the second video stream (column 6, line 26-60, column 7, line 51-column 8, line 30) according to a maximum correlation of the correlations (identifying the delay based on the correlation with the smallest deviation between the two streams; column 8, lines 19-32).
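For illustration only, the sliding-window comparison recited above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Wolf or of the applicant: each frame is assumed to be already reduced to one scalar feature, the correlation measure is assumed to be the Pearson linear correlation coefficient, and the names `pearson` and `estimate_delay` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat window carries no usable signal
    return cov / sqrt(var_x * var_y)


def estimate_delay(first_features, second_features, control_start, control_len):
    """Return (delay_in_frames, best_correlation).

    The control group is a slice of the second buffer. It is compared
    against every same-length sliding window of the first buffer, and
    the window with the maximum correlation determines the delay.
    """
    control = second_features[control_start:control_start + control_len]
    best_offset, best_corr = None, float("-inf")
    for offset in range(len(first_features) - control_len + 1):
        window = first_features[offset:offset + control_len]
        corr = pearson(control, window)
        if corr > best_corr:
            best_offset, best_corr = offset, corr
    # Positive delay: the second stream lags the first stream.
    return control_start - best_offset, best_corr
```

In this sketch a positive result means the content appears later in the second buffer than in the first; a deviation-based measure (smallest error rather than largest correlation) would replace `pearson` and flip the comparison.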

	As to claim 22, Wolf discloses a system for identifying real-time latency of video streams, comprising:
a computing device programmed to buffer frames of a video stream into a first buffer (Fig. 2, 19-20; column 3, line 66-column 4, line 25);
buffer frames of a video stream into a second buffer (Fig. 2, 27-28; column 4, line 36-62);
identify a control group as a subset of frames of the second buffer (frame samples with motion used for time correlation; column 9, lines 19-25);
compute correlations of extracted features of the control group to extracted features of successive windows of frames of the first buffer (time alignment processing using successive windows of frames from each buffer; Fig. 4, column 6, line 26-60, column 7, line 51-column 9, line 18), the extracted features being based on spatial information and temporal information of the video streams (column 3, line 21-45, column 6, line 38-51, column 11, line 13-18); and
identify a delay between the video stream collected at the first point and the video stream collected at the second point (column 6, line 26-60, column 7, line 51-column 8, line 30) according to a maximum correlation of the correlations (identifying the delay based on the correlation with the smallest deviation between the two streams; column 8, lines 19-32).

	As to claim 43, Wolf discloses a non-transitory computer-readable medium comprising instructions for identifying real-time latency of video streams, that when executed by a processor of a computing device, cause the computing device to:
buffer frames of a video stream into a first buffer (Fig. 2, 19-20; column 3, line 66-column 4, line 25);
buffer frames of a video stream into a second buffer (Fig. 2, 27-28; column 4, line 36-62);
identify a control group as a subset of frames of the second buffer (frame samples with motion used for time correlation; column 9, lines 19-25);
compute correlations of extracted features of the control group to extracted features of successive sliding windows of the first buffer (time alignment processing using successive windows of frames from each buffer; Fig. 4, column 6, line 26-60, column 7, line 51-column 9, line 18), the extracted features being based on spatial information and temporal information of the video stream (column 3, line 21-45, column 6, line 38-51, column 11, line 13-18); and
identify a delay between the video stream collected at the first point and the video stream collected at the second point (column 6, line 26-60, column 7, line 51-column 8, line 30) according to a maximum correlation of the correlations (identifying the delay based on the correlation with the smallest deviation between the two streams; column 8, lines 19-32).

	As to claims 2, 23, Wolf discloses extracting the spatial information using one or more of a Gaussian filter, a Laplacian filter, a Laplacian of Gaussian filter, a Sobel filter (21 and 29, Fig. 2; column 4, line 13-16, column 4, line 63-column 5, line 2), a Prewitt filter, or Scharr filter.

	As to claims 3, 24, Wolf discloses extracting the temporal information using one or more of an average/min/max difference of consecutive frames in terms of raw pixel values (pixel differences used to determine motion and which frame information to extract; column 5, line 3-25, column 7, line 51-66, column 9, line 3-25), Gaussian filtered pixel values, Laplacian filtered pixel values, Laplacian of Gaussian filtered pixel values, Sobel filtered pixel values, Prewitt filtered pixel values, or Scharr filtered pixel values.
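For illustration only, the two kinds of per-frame features discussed for claims 2, 23 and 3, 24 can be sketched as follows. This is a minimal sketch assuming a frame is a list of rows of grayscale pixel values; the function names `sobel_energy` and `temporal_difference` are hypothetical and are not taken from Wolf or from the application.

```python
def sobel_energy(frame):
    """Spatial feature: mean Sobel gradient magnitude over interior pixels."""
    height, width = len(frame), len(frame[0])
    total = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            total += (gx * gx + gy * gy) ** 0.5
    return total / ((height - 2) * (width - 2))


def temporal_difference(prev_frame, frame):
    """Temporal feature: mean absolute raw-pixel difference of consecutive frames."""
    diffs = [abs(a - b)
             for prev_row, row in zip(prev_frame, frame)
             for a, b in zip(prev_row, row)]
    return sum(diffs) / len(diffs)
```

Each function reduces a frame (or a pair of consecutive frames) to one scalar, so a buffer of frames becomes a sequence of numbers that can be compared across streams.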

	As to claims 4, 25, Wolf discloses computing the correlations according to one or more of the following metrics: mean squared error (MSE), root mean squared error (RMSE) (column 7, lines 30-45, column 11, line 5-12), mean absolute error (MAE), peak signal to noise ratio (PSNR), Pearson Linear correlation coefficient (PLCC), Spearman's rank correlation coefficient (SRCC), or Kendall's rank correlation coefficient (KRCC).
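For illustration only, the error-based metrics named in claims 4, 25 can be written out as follows. This is a minimal sketch over two equal-length feature sequences; the function names and the `peak` parameter default are assumptions made for the example.

```python
import math


def mse(a, b):
    """Mean squared error."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rmse(a, b):
    """Root mean squared error."""
    return math.sqrt(mse(a, b))


def mae(a, b):
    """Mean absolute error."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(a, b)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak * peak / error)
```

MSE, RMSE, and MAE are deviation measures, so the best alignment is the one that minimizes them, whereas PSNR and the correlation coefficients are maximized.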

	As to claims 7, 28, Wolf discloses decomposing the video streams into a plurality of different spatial regions (pixel data for subset ROIs; column 6, line 61-column 7, line 23); and using a different filter to extract features for each respective spatial region (column 6, line 61-column 7, line 23).

	As to claims 8, 29, Wolf discloses decomposing the video streams into a plurality of different groups of frames (breaking the stream into time windows of frames where motion occurs; Fig. 4, column 6, line 26-60, column 7, line 51-column 9, line 25); and using a different filter to extract features for each respective group of frames (Fig. 2, column 6, line 26-60, column 7, line 51-column 9, line 18).

	As to claims 9, 30, Wolf discloses one or more of:
(i) decomposing the video streams into a plurality of different content types, and using a different filter to extract features for each respective content type;
(ii) decomposing the video streams into a plurality of different distortion types, and using a different filter to extract features for each respective distortion type; or
(iii) decomposing the video streams into a plurality of different complexity levels, and using a different filter to extract features for each respective complexity level (decomposing into ROIs of frames including different amounts of motion; column 6, line 61-column 7, line 23).

	As to claims 12, 33, Wolf discloses one or more of:
using region segmentation methods to divide frames of the video streams into regions (dividing frames into subset ROIs; column 6, line 61-column 7, line 23);
using visual saliency evaluation methods to assign spatially-varying importance factors to different pixels in the frames of the video streams; and
using the segmentation results and/or the spatially-varying importance factors to guide the combinations of cross-space extracted features (column 6, line 61-column 7, line 23).

	As to claims 13, 34, Wolf discloses dividing the video streams into groups of frames of a fixed or variable size (dividing frames into motion groups; column 6, line 61-column 7, line 23, column 8, line 38-column 9, line 35), and using a group-of-frame importance assessment method to guide the combinations of per group-of-frame extracted features (only extracting features from “important” motion frame groups; column 6, line 61-column 7, line 23, column 8, line 38-column 9, line 35).

	As to claims 14, 35, Wolf discloses one or more of:
(i) classifying frames of the video streams or groups of frames of the video streams into different content types, and using the content types to guide combinations of content-type-dependent extracted features (dividing frames into subset ROIs; column 6, line 61-column 7, line 23);
(ii) classifying the video streams, as a whole or as groups of frames, into different distortion types, and using the different distortion types to guide combinations of distortion-type- dependent extracted features; and
(iii) classifying frames of the video streams or spatial regions of the frames into different complexity levels of different complexity measures, and using the complexity levels and the complexity measures to guide combinations of complexity-dependent extracted features (subset ROIs of frames including the maximal amount of motion; column 6, line 61-column 7, line 23).

	As to claims 15, 36, Wolf discloses wherein the first video stream is collected at a first point along a video delivery chain, and the second video stream is collected at a second point along the video delivery chain, the second point being downstream the video delivery chain from the first point (see Fig. 1-2; column 3, line 46-55).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 6, 10, 11, 26, 27, 31, 32 are rejected under 35 U.S.C. 103 as being unpatentable over Wolf in view of Watson (US 6,493,023 B1).
As to claims 5, 26, while Wolf discloses decomposing the video streams and using different filters to extract features, Wolf fails to specifically disclose decomposing the video streams into a plurality of different scales and resolutions and using a different filter to extract features for each respective scale and resolution.
In an analogous art, Watson discloses a system for comparing two video streams (Fig. 2) which will decompose the video streams into a plurality of different scales and resolutions (column 5, line 38-62, column 9, line 12-36, column 11, line 37-58) and use a different filter to extract features for each respective scale and resolution (column 11, line 37-58) so as to improve visual quality measurement by incorporating a human vision model within a simple processing architecture (column 4, lines 15-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wolf’s system to include decomposing the video streams into a plurality of different scales and resolutions and using a different filter to extract features for each respective scale and resolution, as taught by Watson, for the typical benefit of improving visual quality measurement by incorporating a human vision model within a simple processing architecture.

As to claims 6, 27, while Wolf discloses decomposing the video streams and using different filters to extract features, Wolf fails to specifically disclose decomposing the video streams into a plurality of different frequency bands and using a different filter to extract features for each respective frequency band.
In an analogous art, Watson discloses a system for comparing two video streams (Fig. 2) which will decompose the video streams into a plurality of different frequency bands (column 5, line 38-62, column 9, line 12-36, column 11, line 37-58) and use a different filter to extract features for each respective frequency band (column 11, line 37-58) so as to improve visual quality measurement by incorporating a human vision model within a simple processing architecture (column 4, lines 15-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wolf’s system to include decomposing the video streams into a plurality of different frequency bands and using a different filter to extract features for each respective frequency band, as taught by Watson, for the typical benefit of improving visual quality measurement by incorporating a human vision model within a simple processing architecture.

As to claims 10, 31, while Wolf discloses decomposing the video streams and extracting features, Wolf fails to specifically disclose using a multi-scale signal analysis method to decompose the video streams into multiple resolutions and using human visual contrast sensitive models to guide combinations of cross-scale extracted features of the video stream as decomposed.
In an analogous art, Watson discloses a system for comparing two video streams (Fig. 2) which will utilize a multi-scale signal analysis method to decompose the video streams into multiple resolutions (column 5, line 38-62, column 9, line 12-36, column 11, line 37-58) and use human visual contrast sensitive models to guide combinations of cross-scale extracted features of the video stream as decomposed (column 5, lines 49-61, column 13, lines 1-31) so as to improve visual quality measurement by incorporating a human vision model within a simple processing architecture (column 4, lines 15-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wolf’s system to include using a multi-scale signal analysis method to decompose the video streams into multiple resolutions and using human visual contrast sensitive models to guide combinations of cross-scale extracted features of the video stream as decomposed, as taught by Watson, for the typical benefit of improving visual quality measurement by incorporating a human vision model within a simple processing architecture.

As to claims 11, 32, while Wolf discloses decomposing the video streams and extracting features, Wolf fails to specifically disclose using 2D Fourier, 3D Fourier, or Discrete Cosine Transform (DCT) analysis methods to decompose the video streams into multiple frequency bands and using human visual contrast sensitive models to guide combinations of cross-frequency band extracted features of the video streams as decomposed.
In an analogous art, Watson discloses a system for comparing two video streams (Fig. 2) which will utilize 2D Fourier, 3D Fourier, or Discrete Cosine Transform (DCT) analysis methods to decompose the video streams into multiple frequency bands (column 5, line 38-62, column 9, line 12-36, column 11, line 37-58) and use human visual contrast sensitive models to guide combinations of cross-frequency band extracted features of the video streams as decomposed (column 5, lines 49-61, column 13, lines 1-31) so as to improve visual quality measurement by incorporating a human vision model within a simple processing architecture (column 4, lines 15-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wolf’s system to include using 2D Fourier, 3D Fourier, or Discrete Cosine Transform (DCT) analysis methods to decompose the video streams into multiple frequency bands and using human visual contrast sensitive models to guide combinations of cross-frequency band extracted features of the video streams as decomposed, as taught by Watson, for the typical benefit of improving visual quality measurement by incorporating a human vision model within a simple processing architecture.
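For illustration only, decomposition of a signal into frequency bands by DCT analysis can be sketched as follows. This is a minimal one-dimensional sketch using an unnormalized DCT-II on a single row of pixel values; it does not reproduce Watson's processing, and the names `dct_ii` and `band_energies` and the `band_size` parameter are hypothetical.

```python
import math


def dct_ii(samples):
    """Unnormalized 1-D DCT-II of a sequence of samples."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k)
            for i, x in enumerate(samples))
        for k in range(n)
    ]


def band_energies(samples, band_size):
    """Group DCT coefficients into contiguous bands of band_size
    coefficients, ordered low to high frequency, and return the
    energy (sum of squared coefficients) of each band."""
    coeffs = dct_ii(samples)
    return [
        sum(c * c for c in coeffs[start:start + band_size])
        for start in range(0, len(coeffs), band_size)
    ]
```

A constant row puts all of its energy in the lowest band, while a rapidly alternating row puts most of its energy in the highest band; per-band weights derived from a contrast sensitivity model could then scale each band before the bands are combined.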

Allowable Subject Matter
Claims 16-21, 37-42, and 44 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James R Sheleheda whose telephone number is (571)272-7357. The examiner can normally be reached M-F 8 am-5 pm CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jefferey Harold, can be reached at (571) 272-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/James R Sheleheda/Primary Examiner, Art Unit 2424