DETAILED ACTION

Response to Arguments
Applicant’s arguments with respect to claims 1-19 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/06/2022 has been entered.




Claim Objections
Claims 17-19 are objected to because of the following informalities:  Please include a comma after the claim number in the preamble.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 1 recites the limitation "a decoder receiving the encoded data from the bit rate compressor via the communication network and communicating with the display screen to display an image derived from the encoded data."  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4-5, 14-15 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez et al. (hereinafter referred to as Gonzalez) (US Patent No. 11,240,284) in view of White et al. (hereinafter referred to as White) (US 20200302241) and in further view of Chang et al. (hereinafter referred to as Chang) (US 20190370591).



Regarding claim 1, Gonzalez discloses a video compression system comprising:
a region of interest extractor receiving an input stream of a first set of video frames from the video source and identifying a region of interest by applying the input stream of the first set of video frames to at least one machine learning model trained to identify a set of different predetermined physical objects in the input stream of the first set of video frames to define a region of interest extracting a predetermined physical object according to the identification category, [See Gonzalez [Abstract] Analyzing the video stream to label one or more regions of a frame with a semantic category.  Also, see Col. 3 lines 28-44, Object detection using a set of machine learning models and the application selects a machine learning model.]
a bit rate compressor receiving an input stream of the first set of video frames and the region of interest from the region of interest extractor and outputting an output stream of video frames based on both the input stream of the first set of video frames and a region of interest defining a first portion of the first set of video frames of the input stream; [See Gonzalez [Abstract] Allocating encoding resources to regions of the frame based on the semantic category.  Also, see Fig. 1, Encoder (128).]
wherein the bit rate compressor encodes the first portion of the first set of video frames at a relatively higher bit rate than a second portion of the first set of video frames outside of the first portion; and  [See Gonzalez [Abstract] Allocating encoding resources to regions of the frame based on the semantic category.  Also, see Fig. 1, Encoder (128).]
a decoder receiving the encoded data from the bit rate compressor via the communication network and communicating with the display screen to display an image derived from the encoded data.  [See Gonzalez [Col. 6 lines 40-45] Transmit the encoded video stream to one or more client systems such that decoding is performed accordingly.]
Gonzalez does not explicitly disclose
a video source communicating through a communication network with a display viewable by a user;  
an input for receiving an identification category related to a physical object of interest and selected from a set of different identification categories;
the training of the machine learning model employing at least one training set linking video frames of a second set of video frames depicting the set of different predetermined physical objects each to a corresponding identification category;  
However, White does disclose
the training of the machine learning model employing at least one training set linking video frames of a second set of video frames depicting the set of different predetermined physical objects each to a corresponding identification category;  [See White [0086] Training data images are used to train an object recognition model.  After the training process, the image recognition system recognizes objects similar to the labeled object in the training data.  For example, in various examples, a new image containing a similar object is provided to the machine learning system, and the machine learning system recognizes 1114 that the similar object matches the object in the training data.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez to add the teachings of White, in order to improve upon video systems by properly training a machine learning model [See White [0024]].
Gonzalez (modified by White) does not explicitly disclose
a video source communicating through a communication network with a display viewable by a user;  
an input for receiving an identification category related to a physical object of interest and selected from a set of different identification categories;
However, Chang does disclose
a video source communicating through a communication network with a display viewable by a user;  [See Chang [Fig. 1]]
an input for receiving an identification category related to a physical object of interest and selected from a set of different identification categories; [See Chang [0058] Receive region information for setting a ROI through a user’s input.  Also, see 0050, The user inputs a category of an object as object detection conditions.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White) to add the teachings of Chang, in order to perform a simple substitution of how the ROI/object detection is performed.  It appears that object detection in Gonzalez is performed via an algorithm.  However, replacing an automated means of object detection/classification for setting an ROI with a manual means operated by a user would have been obvious.  MPEP 2144.04 (III) explains that automating a manual activity is not sufficient to distinguish over the prior art.  By the same reasoning, performing an automated activity manually is likewise not sufficient to distinguish over the prior art.

Regarding claim 4, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez discloses
wherein the higher bit rate is realized by at least one of a greater bit depth in pixels of the output stream of video frames and a greater bit transmission rate of pixels in the output stream of the video frame.  [See Gonzalez [Col. 9 lines 1-14] Allocating bitrate budget for regions corresponding to ROI and Non-ROI.]
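For illustration only, the allocation of a bitrate budget between ROI and non-ROI regions referred to above can be pictured with the following minimal sketch.  It is not taken from Gonzalez; the function name split_bit_budget, the per-pixel weighting scheme, and the default weight of 4.0 are all assumptions made for the example.

```python
def split_bit_budget(total_bits, roi_pixels, bg_pixels, roi_weight=4.0):
    """Split a per-frame bit budget between ROI and background.

    Hypothetical scheme: every ROI pixel receives `roi_weight` times the
    bits of a background pixel, and the two shares sum to `total_bits`.
    """
    if roi_pixels < 0 or bg_pixels < 0 or roi_pixels + bg_pixels == 0:
        raise ValueError("pixel counts must be non-negative and not both zero")
    # Express the frame in "background-pixel equivalents".
    weighted_pixels = roi_weight * roi_pixels + bg_pixels
    bg_bits_per_pixel = total_bits / weighted_pixels
    roi_bits = roi_weight * roi_pixels * bg_bits_per_pixel
    bg_bits = total_bits - roi_bits
    return roi_bits, bg_bits


if __name__ == "__main__":
    roi_bits, bg_bits = split_bit_budget(1000, roi_pixels=100, bg_pixels=600)
    print(roi_bits, bg_bits)
```

Under these assumed numbers, the 100 ROI pixels receive four times as many bits per pixel as the 600 background pixels, which corresponds to encoding the first portion at a relatively higher bit rate than the second portion.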

Regarding claim 5, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez discloses
wherein the region of interest extractor includes multiple machine learning models each trained to identify a different predetermined physical object in the stream of the first set of video frames defining a region of interest in the input stream of the first set of video frames and wherein the video compression system uses the identification category to select among the different multiple machine learning models.  [See Gonzalez [Col. 3 lines 15-45] Selecting a machine learning model for object detection such that the application focuses on semantic content that is relevant to the application.   Also, see abstract, allocating encoding resources to regions of the frame based on the semantic category.]

Regarding claim 14, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez discloses
wherein the video compression system further provides for multiple network connections and routing data among those connections.  [See Gonzalez [Col. 6 lines 40-45] Transmit the encoded video stream to multiple client systems such that decoding is performed accordingly.]

Regarding claim 15, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
further including a portable wireless device providing a video camera producing the input stream of video frames.  
However, Chang does disclose
further including a portable wireless device providing a video camera producing the input stream of video frames.  [See Chang [0047] Smart phone connected to a camera.]
Applying the same motivation as applied in claim 1.

Regarding claim 17, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez discloses
including multiple machine learning models each trained to identify a different predetermined physical object of the set of different identification categories in the input stream of the first set of video frames and wherein the region of interest extractor selects along the multiple machine learning models according to the identification category.  [See Gonzalez [Col. 3 lines 15-45] Selecting a machine learning model for object detection such that the application focuses on semantic content that is relevant to the application.   Also, see abstract, allocating encoding resources to regions of the frame based on the semantic category.]

Regarding claim 18, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
further including a user input operable by the user to provide the identification category.
However, Chang does disclose
further including a user input operable by the user to provide the identification category.  [See Chang [0058] Receive region information for setting a ROI through a user’s input.  Also, see 0050, The user inputs a category of an object as object detection conditions.]
Applying the same motivation as applied in claim 1.

Regarding claim 19, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez discloses
wherein the set of identification categories includes both a person’s face and a black/white board.  [See Gonzalez [Col. 5 lines 21-22] Semantic categories include a person’s face and a whiteboard.]

Claims 2-3 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez (US Patent No. 11,240,284) in view of White (US 20200302241) in view of Chang (US 20190370591) and in further view of Farivar et al. (hereinafter referred to as Farivar) (US Patent No. 10,332,261).

Regarding claim 2, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
wherein the training set links the second set of video frames and corresponding mask frames outlining the predetermined physical object in a portion of the second set of video frames related to the predetermined object.  
However, Farivar does disclose
wherein the training set links the second set of video frames and corresponding mask frames outlining the predetermined physical object in a portion of the second set of video frames related to the predetermined object.  [See Farivar [Col. 6 last para.] The training platform trains the model (i.e. machine learning) using the training set and the mask images.  In addition, see Col. 2 lines 54-55, training set includes images and mask images.  Also, see abstract, the mask image is associated with an object and Fig. 1A, mask image outlines object.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White and Chang) to add the teachings of Farivar, in order to improve upon the accuracy of training a machine learning model [See Farivar [Col. 3 lines 1-3]].

Regarding claim 3, Gonzalez (modified by White, Chang and Farivar) discloses the system of claim 2.  Furthermore, Gonzalez does not explicitly disclose
wherein the mask frames identify in the second set of video frames of the training set a region of interest using a predetermined physical object selected from the group consisting of at least one of a person, a person’s face, or a black/whiteboard in the video frames of the training set.  
However, Farivar does disclose
wherein the mask frames identify in the second set of video frames of the training set a region of interest using a predetermined physical object selected from the group consisting of at least one of a person, a person’s face, or a black/whiteboard in the video frames of the training set.  [See Farivar [Col. 6 last para.] The training platform trains the model (i.e. machine learning) using the training set and the mask images.  In addition, see Col. 2 lines 54-55, training set includes images and mask images.  Also, see abstract, the mask image is associated with an object and Fig. 1A, mask image outlines object.  Also, see Col. 4 lines 14-15, implementations described herein are applied to any target object (such as the person’s face and/or whiteboard in the primary reference Gonzalez).]
Applying the same motivation as applied in claim 2.

Regarding claim 16, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
wherein the training set links pairs of an images comprised of the predetermined physical object, and a mask providing an outline of the predetermined physical object.  
However, Farivar does disclose
wherein the training set links pairs of an images comprised of the predetermined physical object, and a mask providing an outline of the predetermined physical object.  [See Farivar [Col. 6 last para.] The training platform trains the model (i.e. machine learning) using the training set and the mask images.  In addition, see Col. 2 lines 54-55, training set includes images (i.e., plural) and mask images.  Also, see abstract, the mask image is associated with an object and Fig. 1A, mask image outlines object.]
Applying the same motivation as applied in claim 2.

Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez (US Patent No. 11,240,284) in view of White (US 20200302241) in view of Chang (US 20190370591) and in further view of Quast et al. (hereinafter referred to as Quast) (US 20110051808).

Regarding claim 6, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
wherein the bit rate compressor divides each video frame of the input stream into macro-blocks and provides a different amount of compression to corresponding macro-blocks of each video frame of the output stream according to whether the region of interest overlaps the macro-block.  
However, Quast does disclose
wherein the bit rate compressor divides each video frame of the input stream into macro-blocks and provides a different amount of compression to corresponding macro-blocks of each video frame of the output stream according to whether the region of interest overlaps the macro-block.  [See Quast [Fig. 9 and 0092] ROI is classified as an extended area when the ROI overlaps or is partially contained within macroblocks.  Therefore, these macroblocks will be compressed the same as the ROI macroblocks and different from non-ROI macroblocks.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White and Chang) to add the teachings of Quast, in order to improve upon ROI coding by extending the ROI with a window such that the entirety of the ROI is coded at a higher compression rate when the ROI partially overlaps with RONI macroblocks.
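For illustration only, the overlap-based treatment of macroblocks discussed for claim 6 can be pictured with the following minimal sketch.  It is not taken from Quast or Gonzalez; the 16-pixel block size, the rectangular representation of the ROI, the quantization parameter values, and the names overlaps and qp_map are all assumptions made for the example.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open pixel bounds

MB_SIZE = 16  # assumed macroblock edge length in pixels
QP_ROI = 22   # assumed quantization parameter for blocks touching the ROI (finer)
QP_BG = 38    # assumed quantization parameter for all other blocks (coarser)


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True when two half-open rectangles share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def qp_map(width: int, height: int, roi: Rect) -> List[List[int]]:
    """Assign a QP to each macroblock according to whether it overlaps the ROI."""
    rows = (height + MB_SIZE - 1) // MB_SIZE  # round up to cover partial blocks
    cols = (width + MB_SIZE - 1) // MB_SIZE
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = (c * MB_SIZE, r * MB_SIZE,
                     (c + 1) * MB_SIZE, (r + 1) * MB_SIZE)
            # A block that only partially contains the ROI is still treated as ROI.
            row.append(QP_ROI if overlaps(block, roi) else QP_BG)
        grid.append(row)
    return grid


if __name__ == "__main__":
    for row in qp_map(64, 32, roi=(20, 4, 40, 12)):
        print(row)
```

In this sketch a macroblock that the ROI only partially covers receives the same quantization as a block lying fully inside the ROI, so the amount of compression differs per macroblock according to whether the region of interest overlaps it.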

Regarding claim 7, Gonzalez (modified by White, Chang and Quast) discloses the system of claim 6.  Furthermore, Gonzalez discloses
further including a bit rate decompressor communicating with the bit rate compressor to receive the output stream to provide different amount of decompression to each macro-block of the output stream according to information transmitted with the macro-blocks of the output stream.  [See Gonzalez [Col. 6 lines 40-45] Along with information about how the superblocks of each frame were encoded so that the client systems decode the superblocks accordingly.  It is inherent that the client systems use a decoder or decompressor to decode the superblocks.]

Regarding claim 8, Gonzalez (modified by White, Chang and Quast) discloses the system of claim 6.  Furthermore, Gonzalez does not explicitly disclose
further including a bit rate decompressor communicating with the bit rate compressor to receive the output stream and to decompress the output stream according to one of: MPEG2, H.264, HEVC, VPNS, VP9, and AVP1.  
However, Quast does disclose
further including a bit rate decompressor communicating with the bit rate compressor to receive the output stream and to decompress the output stream according to one of: MPEG2, H.264, HEVC, VPNS, VP9, and AVP1.  [See Quast [0003] H.264 compression method.]
Applying the same motivation as applied in claim 6.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez (US Patent No. 11,240,284) in view of White (US 20200302241) in view of Chang (US 20190370591) and in further view of Georgis (US 20210174197).

Regarding claim 9, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
wherein the machine learning model of the region of interest extractor is a deep neural network being a convolution on a neural network having more than three layers.  
However, Georgis does disclose
wherein the machine learning model of the region of interest extractor is a deep neural network being a convolution on a neural network having more than three layers.  [See Georgis [0023 and Fig. 1] N number of layers of a first neural network and shows at least 4 layers.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White and Chang) to add the teachings of Georgis, in order to improve upon neural network processing by increasing the number of layers utilized in a neural network to obtain an optimum solution.

Claims 10-11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez (US Patent No. 11,240,284) in view of White (US 20200302241) in view of Chang (US 20190370591) and in further view of Zhai et al. (hereinafter referred to as Zhai) (US 20210099731).

Regarding claim 10, Gonzalez (modified by White and Chang) discloses the system of claim 1.  Furthermore, Gonzalez does not explicitly disclose
further including a super resolution preprocessor receiving the input stream of the first set of video frames and the output stream of video frames as a training set to develop a machine learning super resolution model relating the input video stream to the output video stream and
wherein the video compression system transmits weights associated with the machine learning super resolution model with the output stream of video frames for use in reconstructing a viewable video stream.
However, Zhai does disclose
further including a super resolution preprocessor receiving the input stream of the first set of video frames and the output stream of video frames as a training set to develop a machine learning super resolution model relating the input video stream to the output video stream and  [See Zhai [0026] Compare the input image to the decoded image, which occurs in the encoder (Fig. 5).]
wherein the video compression system transmits weights associated with the machine learning super resolution model with the output stream of video frames for use in reconstructing a viewable video stream. [See Zhai [0021 and Figs. 2-3] Decoder weights are received from an encoder and are used for image decoding.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White and Chang) to add the teachings of Zhai, in order to improve upon video compression by utilizing neural network weights together with video compression [See Zhai [0004 and 0011]].

Regarding claim 11, Gonzalez (modified by White, Chang and Zhai) discloses the system of claim 10.  Furthermore, Gonzalez does not explicitly disclose
further including a super resolution post processor receiving the transmitted weights from the super resolution preprocessor and 
communicating with a bit rate decompressor receiving the output stream of video frames from the bit rate compressor to decompress the output stream into a decompressed video stream;
wherein the super resolution post processor applies the decompressed video stream to the machine learning super resolution model using the transmitted weights to reconstruct the viewable video stream.  
However, Zhai does disclose
further including a super resolution post processor receiving the transmitted weights from the super resolution preprocessor and [See Zhai [0021 and Figs. 2-3] Decoder weights are received from an encoder and are used for image decoding.  The preprocessor and postprocessor are being interpreted as encoder/decoder.]
communicating with a bit rate decompressor receiving the output stream of video frames from the bit rate compressor to decompress the output stream into a decompressed video stream; [See Zhai [0021 and Figs. 2-3] Decoder weights are received from an encoder and are used for image decoding.  The bit rate compressor and the bit rate decompressor are being interpreted as encoder/decoder and the encoder/decoder communicate over a link (Figs. 2-3).]
wherein the super resolution post processor applies the decompressed video stream to the machine learning super resolution model using the transmitted weights to reconstruct the viewable video stream.  [See Zhai [Fig. 3] Decoding process uses the code/weights for image decoding.]
Applying the same motivation as applied in claim 10.

Regarding claim 13, Gonzalez (modified by White, Chang and Zhai) discloses the system of claim 10.  Furthermore, Gonzalez does not explicitly disclose
wherein the weights associated with the machine learning super resolution model are updated on a periodic basis during the video transmission.  
However, Zhai does disclose
wherein the weights associated with the machine learning super resolution model are updated on a periodic basis during the video transmission.  [See Zhai [0026] Adjustments are made periodically to the weights.]
Applying the same motivation as applied in claim 10.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Gonzalez (US Patent No. 11,240,284) in view of White (US 20200302241) in view of Chang (US 20190370591) in view of Zhai (US 20210099731) and in further view of Georgis (US 20210174197).

Regarding claim 12, Gonzalez (modified by White, Chang and Zhai) discloses the system of claim 10.  Furthermore, Gonzalez does not explicitly disclose
wherein the machine learning model of the first and super resolution post processors are a deep neural network being a convolutional neural network having more than three layers.  
However, Georgis does disclose
wherein the machine learning model of the first and super resolution post processors are a deep neural network being a convolutional neural network having more than three layers.  [See Georgis [0023 and Fig. 1] N number of layers of a first neural network and shows at least 4 layers.  Also, see 0013, CNN.]
It would have been obvious to the person of ordinary skill in the art at the time of the effective filing date to modify the system by Gonzalez (modified by White, Chang and Zhai) to add the teachings of Georgis, in order to improve upon neural network processing by increasing the number of layers utilized in a neural network to obtain an optimum solution.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Gruteser (US 20210110191) – para. 0070-0075 – Object detection is based on CNN and dynamic ROI encoding where the ROI is compressed at a higher quality compared to the background or non-ROI area.
Varadarajan (US 20190007690) – para. 0011 – Object detection using any deep neural network and the ROI is compressed at a higher quality compared to compressing all areas equally.
Zhang (US 20170150148) – para. 0033 – ROI identification is based on machine learning.  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES T BOYLAN whose telephone number is (571)272-8242.  The examiner can normally be reached on Monday-Friday 7am-3pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMIE ATALA can be reached on 571-272-7384.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/JAMES T BOYLAN/Primary Examiner, Art Unit 2486