United States Patent and Trademark Office    
        
            
                                
            
        
    

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 16/397,511
Filing Date: 29 Apr 2019
Appellant(s): NVIDIA Corporation



__________________
Jason Lohr
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 4/14/2022.

(1) Grounds of Rejection to be Reviewed on Appeal
Every ground of rejection set forth in the Office action dated 8/26/2021 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”
	The following ground(s) of rejection are applicable to the appealed claims:

Claim Rejections – 35 USC § 102
3.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


4.	Claims 1-3, 5-11, 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Xiong (US PGPub 2010/0034420) [hereafter Xiong].

5.	As to claim 1, Xiong discloses a computer-implemented method (executed by video processing system shown in Figure 1 executing video analysis algorithm shown in Figure 3) comprising receiving a sequence of rendered images (as shown in Figure 2 supplied from video detector 12), and processing the sequence of rendered images, by a neural network model (learned neural network model of block-wise video metric extractor 22 and decisional logic 24), to produce at least one quality metric (video metrics including color, texture, flickering, obscuration, blurring, shape metrics which are then fused to produce a fire detection metric) for the sequence of rendered images, each quality metric indicating presence or absence of a visual artifact in the sequence of rendered images (Paragraphs 0010-0019, 0021-0022, 0026-0030, 0032-0033, a video recognition system receives a plurality of images rendered from a video detector that are stored within a frame buffer and provided to a block divider that portions the images into blocks in order for the image data within the blocks to be processed by a block-wise video metric extractor that analyzes the image data within the blocks using a learned model to formulate various video metrics indicating the presence or absence of various visual artifacts which are later fused and applied to decisional logic that determines whether or not the fused metrics indicate the presence of a fire within the captured image data).

6.	As to claim 2, Xiong discloses dividing a first rendered image of the sequence of rendered images into a number of regions including at least a first region and a second region (as shown in Figure 2) (Paragraphs 0012-0015, 0017). 

7.	As to claim 3, Xiong discloses each region in the number of regions corresponds with a single pixel (as shown in Figure 2), and wherein the at least one quality metric includes at least one bit for each region (Paragraphs 0018-0021). 

8.	As to claim 5, Xiong discloses dividing each rendered image in the sequence of rendered images into a number of regions including at least a first region and a second region, wherein the processing comprises processing the first region of each rendered image by the neural network model to produce a first quality metric of the at least one quality metric for each rendered image (Paragraphs 0012-0015, 0017, 0027-0030).

9.	As to claim 6, Xiong discloses processing the second region of each rendered image to produce a second quality metric of the at least one quality metric for each rendered image (Paragraphs 0027-0030).

10.	As to claim 7, Xiong discloses the second region is processed by a second neural network model in parallel with the processing of the first region by the neural network model (Paragraphs 0027-0029). 

11.	As to claim 8, Xiong discloses the first quality metric indicates a first type of visual artifact is present in the first region of a first rendered image and the second quality metric indicates a second type of visual artifact is present in the second region of the first rendered image (Paragraphs 0027-0029). 

12.	As to claim 9, Xiong discloses the neural network model detects when a first type of artifact (fire artifact) is present in the sequence of rendered images and ignores the presence of a second type of artifact (non-fire artifact) in the sequence of rendered images (Paragraphs 0026-0030). 

13.	As to claim 10, Xiong discloses the sequence of rendered images includes at least four rendered images (Paragraphs 0012, 0017, 0027). 

14.	As to claim 11, Xiong discloses the quality metric indicates a severity of the visual artifact (Paragraphs 0019-0020, 0024, 0028-0030). 

15.	As to claim 16, Xiong discloses the neural network model is trained to detect a first type of visual artifact using a second sequence of rendered images for a scene including a first image that does not include the first type of visual artifact and a second image that does include at least one occurrence of the first type of visual artifact (Paragraphs 0010, 0018, 0020, 0028). 

16.	As to claim 17, Xiong discloses computing a first ground truth quality metric for the first image and a second ground truth quality metric for the second image (Paragraphs 0010, 0018, 0020, 0028). 

17.	As to claim 18, Xiong discloses the at least one quality metric is computed using only the sequence of rendered images without using a reference image (Paragraphs 0027-0029). 

18.	As to claims 19-20, the Xiong reference discloses all claimed subject matter as discussed above with respect to the comments/citations of claim 1.



Claim Rejections – 35 USC § 103
19.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


20.	Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Xiong (US PGPub 2010/0034420) [hereafter Xiong] in view of the Applicant’s Admitted Prior Art [hereafter AAPA].

*NOTE*: The common knowledge or well-known in the art statement with regards to the subject matter of claims 4 and 12 is taken to be admitted prior art because the Applicant did not traverse the Examiner’s assertion of Official Notice with regards to the subject matter of claims 4 and 12 in the Office action mailed on 3/29/2021.  See MPEP 2144.03(C).

21.	As to claim 4, Xiong discloses all claimed subject matter with regards to claim 3, except processing the first rendered image comprises processing the regions in parallel to produce the at least one quality metric.  On the other hand, AAPA discloses that the technique of processing regions of images in parallel to produce a detected output by a neural network model is a well-known and established practice in the art of image artifact detection.
	Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include processing the regions in parallel to produce the at least one quality metric within the method of Xiong in order to yield predictable results of expediting the generation of the quality metric for the plurality of blocks within an image by processing the blocks in parallel within a neural network model.

22.	As to claim 12, Xiong discloses all claimed subject matter with regards to claim 1, except wherein the visual artifact includes an aliasing artifact.  On the other hand, AAPA discloses that the technique of processing regions of images in order to detect aliasing artifacts is a well-known and established practice in the art of image artifact detection.
	Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include detecting a visual artifact that includes an aliasing artifact within the method of Xiong in order to yield predictable results of increasing the accuracy of the fire detection by distinguishing blocks within the image frames that are distorted due to aliasing artifacts from blocks having higher quality image data.

23.	Claims 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Xiong (US PGPub 2010/0034420) [hereafter Xiong] in view of Andreou (US PGPub 2021/0055835) [hereafter Andreou].

24.	As to claim 13, it is noted that Xiong fails to particularly disclose the visual artifact includes a data compression artifact.
	On the other hand, Andreou discloses determining quality metrics for images that indicate the presence or absence of visual artifacts wherein the visual artifact includes a data compression artifact (Paragraphs 0089, 0091, 0094, 0096).
	It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include detecting a visual artifact that includes a data compression artifact, as taught by Andreou, with the method of Xiong because the cited prior art references are directed towards image processing devices and methods that detect visual artifacts within rendered images and because the claimed limitations are fully disclosed within the cited prior art references and would yield predictable results of increasing the accuracy of the fire detection by distinguishing blocks within the image frames that are distorted due to compression artifacts from blocks having higher quality image data.

25.	As to claim 14, it is noted that Xiong fails to particularly disclose the visual artifact includes a de-noising artifact.
	On the other hand, Andreou discloses determining quality metrics for images that indicate the presence or absence of visual artifacts wherein the visual artifact includes a de-noising artifact (Paragraphs 0089, 0091, 0094, 0096).
	It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include detecting a visual artifact that includes a de-noising artifact, as taught by Andreou, with the method of Xiong because the cited prior art references are directed towards image processing devices and methods that detect visual artifacts within rendered images and because the claimed limitations are fully disclosed within the cited prior art references and would yield predictable results of increasing the accuracy of the fire detection by distinguishing blocks within the image frames that are distorted due to de-noising artifacts from blocks having higher quality image data.

26.	As to claim 15, it is noted that Xiong fails to particularly disclose the visual artifact includes an overexposure artifact.
	On the other hand, Andreou discloses determining quality metrics for images that indicate the presence or absence of visual artifacts wherein the visual artifact includes an overexposure artifact (Paragraphs 0089, 0091, 0094, 0096).
	It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include detecting a visual artifact that includes an overexposure artifact, as taught by Andreou, with the method of Xiong because the cited prior art references are directed towards image processing devices and methods that detect visual artifacts within rendered images and because the claimed limitations are fully disclosed within the cited prior art references and would yield predictable results of increasing the accuracy of the fire detection by distinguishing blocks within the image frames that are distorted due to overexposure artifacts from blocks having higher quality image data.

	(2) Response to Arguments
	Firstly with regards to independent claim 1, as well as independent claims 19 and 20, Appellant argues that the applied Xiong reference does not disclose “processing a sequence of rendered images by a neural network” since there was not a showing of a neural network that processes a sequence of rendered images but instead that Xiong reports a neural network model and inputs to it metrics taken from blocks of a video image such that Xiong compares coefficients using a learned model as against a sequence of rendered images used in Appellant’s claim and supported throughout the specification.  It is noted that the current claim recites “a neural network”, but does not require or go into further detail explaining any specific architecture for the neural network, and also that Appellant’s specification (Appellant’s Paragraph 0027) states the following:

[0027] Although the neural network model 100 is described in the context of processing units, one or more of the units, layers 105, 110, 115, and 120 may be implemented as a program, custom circuitry, or by a combination of custom circuitry and a program. For example, the layers 105 may be implemented by a GPU (graphics processing unit), CPU (central processing unit), or any processor capable of implementing layers of a neural network. Furthermore, persons of ordinary skill in the art will understand that any system that performs the operations of the neural network model 100 is within the scope and spirit of embodiments of the present disclosure.
 	
	Due to the disclosure and the requirements of the claims, a reasonable interpretation for one of ordinary skill in the art for the “neural network” would include any system that includes connected components that perform the operations of the neural network model 100.  Turning now to Xiong, a video recognition system (element 14 as shown in Figure 1) includes hardware and software necessary to perform the disclosed operations, the software including video content analysis software that performs the operations shown in Figure 3 (Xiong, Paragraph 0011).  The video recognition system includes components that perform the disclosed duties of frame buffer 18, block divider 20, block-wise video metric extractor 22, and decisional logic 24.  Specifically, it is disclosed that the block-wise video metric extractor 22 applies the video analysis algorithm (shown in Figure 3) to each block to generate a number of video features or metrics which are then provided to decisional logic that performs decisional operations based on the metrics to output an indication for each block whose metrics are input to determine whether or not the metrics (which judge color, texture, flickering, blurring, etc.) are indicative of the presence of fire within a captured video (Xiong, Paragraphs 0013, 0016).
Therefore, because the connected hardware and software components performing the operations of the block-wise video metric extractor 22 and decisional logic 24 of the video recognition system 14 within Xiong process a sequence of rendered images to produce at least one quality metric for a sequence of rendered images, where each quality metric indicates a presence or absence of a visual artifact in the sequence of rendered images, which are the functions specified by the claims for the “neural network”, and the specification does not exclude the elements of Xiong from being interpreted or defined as the claimed neural network, it is suggested that the architecture of the connected components of the block-wise video metric extractor 22 and the decisional logic 24 fully anticipate the claimed limitations and operations of the claimed “neural network”.  
	Furthermore, it is also suggested that the decisional logic component 24 on its own merit anticipates the claimed requirements and operations for the “neural network”.  Specifically, Xiong discloses that the decisional logic 24 is embodied by a neural network (Xiong, Paragraph 0033), this neural network receiving the video metrics derived by the video metric extractor 22 performing processing operations (shown in Figure 3) for each block of image data produced by block divider 20 with respect to each frame of video data held in frame buffer 18 (Xiong, Paragraph 0013) in order to render a decision for each block of video data whether or not visual image characteristics present within the blocks of video data possess the color, texture, flickering, blurring, etc., characteristics associated with the presence of fire (Xiong, Paragraph 0016) in each respective video block.  Therefore, because the decisional logic 24 alone processes the derived visual metrics/characteristics, produced by processing operations of the block-wise video metric extractor 22, representative of processed image data for blocks of the sequence of rendered images produced by the video detector 12 to produce a decision for each block regarding whether or not the blocks include visual characteristics consistent with the visual presence of fire (corresponding to the claimed quality metric), it is suggested that the decisional logic of Xiong fully anticipates the requirements for and operations of the claimed “neural network.”
	Secondly, with regards to independent claim 1, as well as independent claims 19 and 20, Appellant argues that a sequence of “rendered images” is not received since a rendered image is understood to a person of skill in the art as an image rendered or completed processing so that it is at least photo-realistic or accurate while Xiong merely discloses using frames from a video and does not describe rendered images.  It is suggested that the claimed “rendered images” are not claimed or explicitly described as being images that are “completed processing for being photo-realistic” or that the rendered images undergo any type of image processing operations at all.  Instead, the rendered images are merely images that are received and processed by a neural network model according to the claims.  As explained above, the Xiong reference discloses a video detector 12 that captures a number of successive video images or frames (Xiong, Paragraph 0012) which are further processed by the block-wise video metric extractor 22 and the decisional logic 24 after being divided into a plurality of blocks by block divider 20 (Xiong, Paragraphs 0012-0013), thus fully meeting the requirements for the claimed “rendered images”.  Although the Appellant gives a specific definition for what constitutes “rendered images”, it is suggested that another reasonable interpretation for the plain and ordinary meaning of “rendered images” would include images that are merely provided in a specific format, and that a difference exists between “rendered images” and “images that have undergone rendering operations” with “rendered images” being a more broadly interpreted term and the current term that is being claimed.     
	Additionally, taking into account the Appellant’s own definition of a “rendered image” as one “being rendered or completed processing so that it is at least photo-realistic or accurate”, because the video detector 12 captures and produces data representing the imaging/monitoring environment where the video detector is placed which in-turn forms the successive video images which are used to generate the disclosed video metrics of the scene which judge the presence or absence of fire, it is suggested that without the video images being “completed processing” or “realistic” or “accurate”, that the system of Xiong would fail to perform the operations and achieve the results disclosed therein.  Therefore, the conversion of captured data by the video detector into the output of successive video images performed within Xiong also anticipates the requirements for the “sequence of rendered images.”
	Finally, with regards to dependent claim 3, Appellant argues that there is no basis to allege that a few pixels or a large number of pixels in the reference can be read contrary to plain teaching as being “a region in the number of [that] corresponds with a single pixel” since Xiong discloses Figures 2A and 2B use “size of the blocks [that] vary from only a few pixels (e.g., 4x4) to a large number of pixels.”  Within Xiong (Paragraphs 0018-0021), it is disclosed that for a color comparison metric to be derived for each video image block, each pixel within the block is compared to a learned color map with a threshold value to determine if a pixel is indicative of a fire pixel, and the number of pixels within a block identified as fire pixels or the percentage of pixels identified as fire pixels are then provided as the color comparison metric.  Therefore, it is suggested that deriving a quality metric representing a decision as to whether or not the pixel color for each pixel within a specific block is indicative of the presence of fire fully anticipates and constitutes the claimed limitations of dependent claim 3.  Furthermore, since a single pixel is included within the constraints/bounds of “a few pixels (e.g., 4x4) to a large number of pixels”, it is suggested that the remaining video metrics that account for visual characteristics of blocks of 4x4 or a larger number of pixels must fully take into account the single pixels that constitute the 4x4 or larger groups of pixels in order for the video metrics to be completed for each of the blocks, and therefore the rest of the disclosed video metrics described within Xiong anticipate the limitations and constraints of dependent claim 3.

	Conclusion
For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,
/MICHAEL S OSINSKI/ Primary Examiner, Art Unit 2664
Conferees:
/NAY A MAUNG/ Supervisory Patent Examiner, Art Unit 2664
/VU LE/ Supervisory Patent Examiner, Art Unit 2668

Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.