Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office Action is in response to an AMENDMENT entered on August 9, 2022 for patent application 16/618,335 filed on November 29, 2019.
 

Claims 1-8 are pending.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (Pub. No.: US 2017/0347061) in view of Glennon et al. (Pub. No.: US 2016/0301944).
Regarding claim 1, Wang discloses a method for enhancing resolution at a server for providing video data for streaming, the method comprising: a processing operation for processing the video data (Fig. 22, paras. [0532]-[0554]); a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information (paras. [0017], [0080], [0141], [0201], [0574], Figs. 21, 23 and 24); and a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device (paras. [0068], [0529], [0546], [0570], Figs. 23 and 24, paras. [0116]-[0128]; “the second node sends a request… for transmission of the selected algorithm.” Initially a reference to an algorithm (i.e. neural network file) is sent, but the actual algorithm can be sent, also.). Although Wang discloses using convolutional neural networks (para. [0129], for example), it could be argued that Wang does not explicitly disclose wherein the generating operation comprises acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm.
However, in analogous art, Glennon discloses that when compressing video, one step can comprise calculating differences between the original video data stream and the filtered/compressed video data stream, wherein “the video reconstruction comparator 103 compares the reconstructed video stream to the original video datastream to calculate differences between two datastreams. For example, as any decompression and/or video interpolation after filtering introduces errors, these errors can be captured by the video reconstruction comparator 103. From there, the generator 104 generates the decompression tool, in the process element 208, by encoding those errors, any decimation filter factors, and/or any codec compression factors such that they may be included in the filtered/compressed datastream for subsequent use in decompression. In this regard, the combiner 105 combines the filtered/compressed video datastream with the decompression tool, in the process element 209, and transmits and/or stores the compressed video data, in the process element 210” (para. [0023]; see also Fig. 1, elements 103 and 104; Fig. 2, element 207). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang to allow for the generating operation to comprise acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm, wherein Wang discloses the concept of convolutional neural networks (para. [0129], for example). This would have produced predictable and desirable results, in that it would allow for additional relevant information to be used in order to potentially improve compression techniques and reduce bandwidth usage and/or improve image quality.
Regarding claim 2, the combination of Wang and Glennon discloses the method of claim 1, and further discloses wherein the generating operation comprises: a file generating operation for generating a basic neural network file based on a plurality of video data items included in a preset data set; an additional learning operation for, in response to a determination that any acquired new video data satisfies an additional learning condition, performing additional learning on the new video data, wherein the additional learning is performed through an artificial neural network algorithm to which the basic neural network file is applied; and a specialized neural network file generating operation for generating a downscaled file of the new video data as a result of the additional learning and a specialized neural network file corresponding to the new video data (Wang, paras. [0025], [0102]-[0103], [0132] and [0447]).
Regarding claim 3, the combination of Wang and Glennon discloses the method of claim 2, and further discloses wherein the additional learning operation comprises an operation for determining whether the additional learning condition is satisfied according to a structural similarity (SSIM) and a peak-signal-to-noise ratio (PSNR) that are obtained by performing resolution recovery on the downscaled file of the new video data based on the basic neural network (Wang, para. [0318]).
Regarding claim 4, the combination of Wang and Glennon discloses the method of claim 1, and further discloses wherein the processing operation comprises: a dividing operation for dividing the video data into a plurality of chunks by bundling a plurality of frames having a match rate of image objects being equal to or greater than a reference into one chunk; and a size changing operation for performing primary change to reduce a size of an image included in the video data by a preset value from an original size and selectively performing secondary change to enlarge the image having gone through the primary change to the original size (Wang, paras. [0378], [0386] and [0411]).
Regarding claim 5, the combination of Wang and Glennon discloses the method of claim 3, and further discloses wherein: the processing operation comprises a characteristic area extracting operation for extracting a characteristic area including a characteristic image on the basis of each frame or division unit of the video data and assigning a learning importance to the extracted characteristic area (Wang, para. [0371]); and the characteristic area comprises an image object of which an image object importance corresponding to a content field is equal to or greater than a preset value (Wang, paras. [0079], [0184] and [0491]).
Regarding claim 8, Wang discloses a method for enhancing resolution at a server for providing video data for streaming, the method comprising: a processing operation for processing the video data (Fig. 22, paras. [0532]-[0554]); a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information (paras. [0017], [0080], [0141], [0201], [0574], Figs. 21, 23 and 24); a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file (paras. [0068], [0529], [0546], [0570], Figs. 23 and 24, paras. [0116]-[0128]; “the second node sends a request… for transmission of the selected algorithm.” Initially a reference to an algorithm (i.e. neural network file) is sent, but the actual algorithm can be sent, also.), and an operation for matching the divided video data and the divided neural network file and performing artificial neural network algorithm computation on the divided video data using the matched neural network file to recover resolution of the divided video data (Fig. 24, element 2460, para. [0571]). Although Wang discloses using convolutional neural networks (para. [0129], for example), it could be argued that Wang does not explicitly disclose wherein the generating operation comprises acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm. 
However, in analogous art, Glennon discloses that when compressing video, one step can comprise calculating differences between the original video data stream and the filtered/compressed video data stream, wherein “the video reconstruction comparator 103 compares the reconstructed video stream to the original video datastream to calculate differences between two datastreams. For example, as any decompression and/or video interpolation after filtering introduces errors, these errors can be captured by the video reconstruction comparator 103. From there, the generator 104 generates the decompression tool, in the process element 208, by encoding those errors, any decimation filter factors, and/or any codec compression factors such that they may be included in the filtered/compressed datastream for subsequent use in decompression. In this regard, the combiner 105 combines the filtered/compressed video datastream with the decompression tool, in the process element 209, and transmits and/or stores the compressed video data, in the process element 210” (para. [0023]; see also Fig. 1, elements 103 and 104; Fig. 2, element 207). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang to allow for the generating operation to comprise acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm, wherein Wang discloses the concept of convolutional neural networks (para. [0129], for example). This would have produced predictable and desirable results, in that it would allow for additional relevant information to be used in order to potentially improve compression techniques and reduce bandwidth usage and/or improve image quality.


Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (Pub. No.: US 2017/0347061) in view of Glennon et al. (Pub. No.: US 2016/0301944), and further in view of Kamiya (Pat. No.: US 5,293,454).
Regarding claim 6, the combination of Wang and Glennon discloses the method of claim 1, wherein the generating operation comprises: a learning importance identifying a learning importance assigned to a characteristic area or a specific frame chunk of learning video data and extracting option information indicated by the learning importance (Wang, para. [0371]); and a neural network file calculating operation for inputting original data in an original size and processed data reduced to a preset rate in the learning video data (Wang, paras. [0378], [0386] and [0411]) into a convolution neural network (CNN) algorithm to be learned (Wang, paras. [0129]-[0130], [0317]), wherein a neural network file is generated including a parameter and an activation function of an artificial neural network (Wang, para. [0551]), the parameter and the activation function which cause a match rate between a computation result value obtained by inputting the processed data into an artificial neural network and the original data to be equal to or greater than a preset value (Wang, paras. [0378], [0386] and [0411]), but the combination does not disclose wherein the option information comprises a learning number and information regarding whether learning is performed through similar data.
However, in analogous art, Kamiya discloses that “[i]n evaluation by the neural network, the boundaries of the above categories, which were obtained by learning, are used for outputting the categories to which the inputted patterns belong. Therefore, it is essential that the learning patterns used for learning of the neural network should be the type (1) learning patterns. Namely, the type (2) learning patterns are disposed adjacent to the central portion of category A and thus, do not greatly contribute to learning of the neural network, which is performed for obtaining the boundaries of the categories, thereby resulting in extreme increase of learning time. Meanwhile, since the type (3) learning patterns are exceptional patterns or defective patterns, learning based on the type (3) learning patterns becomes learning regarding exceptional or improper examples, i.e. excessive learning. Hence, learning based on the type (3) learning patterns increases learning time extremely and degrades classification performance at the time of evaluation” (col. 3, ln. 23-41). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang and Glennon to allow for the option information to comprise a learning number and information regarding whether learning is performed through similar data. This would have produced predictable and desirable results, in that it would allow for more efficient learning and allocation of resources.
Regarding claim 7, the combination of Wang, Glennon and Kamiya discloses the method of claim 6, and further discloses wherein: the generating operation further comprises a similar data acquiring operation for, when performing learning on learning data with a learning importance set thereto, in response to identifying an instruction for performing learning using the similar data, acquiring the similar data similar to a target image to be learned; and the similar data is acquired based on similarity in resolution and color combinations (Wang, para. [0489]). This claim is rejected on the same grounds as claim 6.


Response to Arguments
Applicant's arguments filed August 9, 2022 have been fully considered but they are not persuasive.
Regarding Applicant’s arguments on pages 7-10:
Claims 1-5 and 8 are rejected under 35 U.S.C. 102(a)() as being anticipated by Wang et al. (Pub. No.: US 2017/0347061) in view of Glennon et al. (Pub. No.: US 2016/0301944). 
Claim 1 recites "A method for enhancing resolution at a server for providing video data for streaming, the method comprising: 
a processing operation for processing the video data; a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information; and 
a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device, wherein the generating operation comprises acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm." (emphasis added).
An aspect(s) of the present invention discloses the following:
"[0093] The neural network file calculator 132c may perform learning by inputting a material processed by the material processor 131 into a preset artificial neural network model. In this case, the neural network file calculator 132c may extract information on a grid generated in the course of changing original data into processed data (grid generation pattern information), by inputting the original data and the processed data (reduced to a preset rate) into a CNN algorithm. More specifically, the grid generation pattern information calculated by the neural network file calculator 132c may be calculated based on a difference between the original data and the processed data, and may include pattern information regarding a location of the grid, and color change of the grid, and the like." (see paragraph [0093] of the present invention).
The Office Action acknowledged that Wang et al. fails to disclose "wherein the generating operation comprises acquiring the grid generation pattern corresponding to at least one difference between the video data processed and original video data unprocessed, by inputting the processed video data and the original video data into a Convolutional Neural Network (CNN) algorithm."
Glennon et al. discusses "[abstract] Systems and methods presented herein provide for video compression and decompression. In one embodiment, a video compression system includes a decimation filter operable to receive a video datastream. The system also includes a video codec operable to compress the filtered video datastream and a comparator operable to compare the video datastream to the filtered-compressed video datastream, and to determine a difference video datastream based on the comparison. The system also includes a generator operable to generate a tool for decompressing the filtered-compressed video datastream based on the difference video datastream.
[0023] In the process element 207, the video reconstruction comparator 103 compares the reconstructed video stream to the original video datastream to calculate differences between two datastreams. For example, as any decompression and/or video interpolation after filtering introduces errors, these errors can be captured by the video reconstruction comparator 103. From there, the generator 104 generates the decompression tool, in the process element 208, by encoding those errors, any decimation filter factors, and/or any codec compression factors such that they may be included in the filtered/compressed datastream for subsequent use in decompression. In this regard, the combiner 105 combines the filtered/compressed video datastream with the decompression tool, in the process element 209, and transmits and/or stores the compressed video data, in the process element 210." (see abstract and paragraph [0023] of Glennon et al.).
As shown above, Glennon et al. merely discloses "to filter the video datastream to remove spatial data components and temporal data components of the video datastream." 
However, Glennon et al. fails to disclose [the language of claim 1] as recited in claim 1.
Wang discusses the followings: 
Thus, it is respectfully submitted that none of the cited references teach or suggest the features as recited in claim 1. 


Examiner’s response:
Examiner sees no specific arguments which may be responded to. Applicant simply alleges that the prior art of record does not disclose the claim language in question, and then cites language from Applicant’s specification and further language from the cited art. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.


Regarding Applicant’s arguments on pages 10-13:
In addition, claim 1 recites "a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information..." 
Wang reference fails to disclose "generating a neural network file" as recited in claim 1. 
FIG. 9 of the present invention clearly shows aspect(s) of the present invention as shown below.
An aspect of the present invention discusses "[0019] According to an embodiment of the present disclosure, a method for enhancing resolution at a server for providing video data for streaming includes a processing operation for generating the video data, a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information, and a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device." (see paragraph [0019] of the present invention).
Aspect(s) of the present invention discusses "[33] In the user device 200, each video file segment may be matched with a corresponding neural network file segment, thereby enhancing resolution. More specifically, the neural network file may include data on an artificial neural network algorithm for recovering resolution of the video file, and accordingly, the user device may perform an artificial neural network computing process using the respective video file segments and the neural network file segments so as to recover resolution. [40] The neural network file may contain information necessary to recover resolution of damaged image data to be similar to original data through an artificial neural network algorithm, and may include information on various parameters necessary to be selected when the artificial neural network algorithm is driven." (see paragraphs [0033] and [0040] of the present invention).
Wang fails to disclose "dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device" as recited in claim 1. 
Wang et al. discusses "[0017] To stream video to consumers using available streaming data bandwidth, media content providers can down-sample or transcode the video content for transmission over a network at one or a variety of bitrates so that the resolution of the video can be appropriate for the bitrate available over each connection or to each device and correspondingly the amount of data transferred over the network can be better matched to the available reliable data rates. For example, a significant proportion of current consumer Internet connections are not able to reliably support continuous streaming of video at an Ultra HD resolution, so video needs to be streamed at a lower quality or lower resolution to avoid buffering delays.
[0080] Optionally, the method is performed at a first network node within a network. Furthermore, optionally the hierarchical algorithm may be transmitted to a second network node in the network.
[0141] According to an aspect, there is provided a method for enhancing visual data when communicating visual data over a network from a first node to a second node, the method at the first node comprising the steps of: reducing the quality of one or more sections of higher-quality visual data to one or more sections of lower-quality visual data; developing at least one hierarchical algorithm operable to increase the quality of the one or more sections of lower quality visual data using the one or more sections of higher-quality visual data to enhance the developed at least one hierarchical algorithm, wherein the developed at least one hierarchical algorithm corresponds to the one or more sections of lower quality visual data; transmitting the one or more sections of lower-quality visual data to the second node; and communicating to the second node at least one of the developed at least one hierarchical algorithms that corresponds to the one or more sections of lower-quality visual data transmitted to the second node; wherein the second node is able to substantially reproduce the one or more sections of higher-quality visual data from the transmitted one or more sections of lower-quality visual data using the developed at least hierarchical algorithm that corresponds to the one or more sections of lower-quality visual data.
[0201] In some embodiments off-site, or 'cloud computing', systems allow for the performance of computerised tasks on a server not necessarily local to the site of the recording or reconstruction of a section of visual data. This allows for more powerful servers to be used according to the budget available for such services, and hence increased parallel processing of different example based models in some of these embodiments. The off-site system used in such embodiments could also provide a backup of any sections of visual data passing through the servers, thereby offering a solution in the case of loss of data at a site local to the recording or reconstruction of a section of visual data. If the computing system is scalable, as is preferable, then a growing amount of visual data processing work can be accommodated should the need arise in these embodiments.
[0574] Alternatively, in other embodiments, the image artefact removal process can be performed as part of the upscaling process itself. In such embodiments, several reconstruction models can be trained to reproduce the higher resolution image or video at the first node from a number of different downsampled images or videos, each with a different artefact severity. In such embodiments, these can either all be transmitted with the downsampled video, or the required model can be transmitted to the second network node from the first node once a request for the model containing the artefact severity of the received downsampled image or video has been sent from the second node to the first node. In either case, in some embodiments the model best matching the artefact severity of the received downsampled image or video is used to substantially recreate the original high resolution video." (see paragraphs [0017], [0080], [0141], [0201], and [0574] of Wang).
As shown above, it is unclear whether Wang discusses grid generation pattern information as recited in claim 1. 


Examiner’s response:
Examiner disagrees with Applicant’s assertion that “it is unclear whether Wang discusses grid generation pattern information as recited in claim 1.” Although Wang does not use the term ‘grid,’ figure 21 of Wang, which was cited in the rejection of claim 1, clearly shows a grid, and the disclosure of Wang regarding this figure, as well as the remainder of the cited sections, disclose what can reasonably be understood by one of ordinary skill in the art before the effective filing date of the claimed invention as grid generation pattern information.


Regarding Applicant’s arguments on pages 13-17:
Further, Wang fails to disclose "generating neural network file" as recited in claim 1. 
As noted above, Wang et al. fails to disclose "a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance resolution of the video data based on the grid generation pattern information..." as recited in claim 1. 
In addition, claim 1 recites "a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device." 
Wang et al. discusses "[0068] Visual data artefacts and/or noise can often be introduced into visual data during processing, particularly during processing to compress visual data or during transmission of the visual data across a network. Such introduced artefacts can include blurring, pixelation, blocking, ringing, aliasing, missing data, and other marks, blemishes, defects, and abnormalities in the visual data. These artefacts in visual data can degrade the user experience when viewing the visual data. Furthermore, these artefacts in visual data can also reduce the effectiveness of visual data processing techniques, such as image super resolution, as well as other visual tasks such as image classification and segmentation, that use processed images as an input. [0529] To improve on the above mentioned approaches for image artefact removal, it is proposed to use deep learning techniques and neural networks such as recurrent neural network and convolutional neural network models. [0546] In embodiments, the compression or quantisation process introduces image artefacts to the scene, producing a training scene that will be used together with the original uncompressed and unquantised version of the scene to train and/or optimise an image artefact removal model. In these embodiments, different levels of artefact severity can be introduced by varying the level of compression and/or quantisation performed at this stage. [0570] In some embodiments, the image artefact removal model identified for each scene is then applied to the downsampled scene at step 2450 in order to substantially recreate the original downsampled image transmitted by the transmitting node. Alternatively, in other embodiments the identified model can be used after the upscaling process of step 2460 to substantially remove the upscaled image artefacts from the recreated higher resolution video."
As noted above, Wang et al. discusses "[0116] By transmitting a lower-quality version of a visual data in some embodiments, such as a section of low-quality visual data or series of visual data sections, together with a library reference to an algorithm (i.e. any or all of an algorithm, reconstruction algorithm, model, reconstruction model, parameters or reconstruction parameters) to aid reconstruction of higher quality visual data, such as a high-resolution video frame or series of frames, in at least some embodiments less data can be transferred over a network to enable high-quality visual data to be viewed compared to transmitting the high quality visual data alone. [0117] Optionally, the steps of transmitting the one or more sections of lower-quality visual data to the second node and transmitting to the second node one or more references corresponding to the one or more selected algorithms that correspond to the one or more sections of lower-quality visual data transmitted to the second node occur together, or substantially simultaneously.
[0118] By transmitting both visual data and one or more references to algorithms in a library of algorithms, a reduced amount of data can be transmitted as only one or more references to algorithms are transmitted instead of the algorithms themselves. 
[0119] Optionally, the algorithm is a hierarchical algorithm. [0120] In some embodiments, the algorithms used are hierarchical algorithms. It should be noted that algorithms could also be referred to as models, representations, parameters or functions. In some of these embodiments, hierarchical algorithms can enable substantially accurate reconstruction of visual data, e.g. produce a higher quality high-resolution video from the low-resolution video that is transmitted, for example where quality can be measured by a low error rate in comparison to the original high-resolution video. [0121] Optionally, the algorithm is a non-linear algorithm. 
[0122] In some embodiments, the use of non-linear algorithms can be more flexible and expressive than dictionary learning based approaches, and use fewer coefficients for the reconstruction of higher-quality visual data. In some of these embodiments, this can allow the reconstruction of the sections of higher-quality to be substantially accurate. [0123] Optionally, the algorithm is selected from a library of algorithms stored at any of: the first node; the second node; a centralised database in the network; or a distributed database in the network. [0124] In some embodiments, a library of algorithms can allow for the selection of an substantially optimal, if not the most optimal, algorithm available in the library to reconstruct the lower-quality visual data into higher quality visual data. In some of these embodiments, the use of a library can also allow the selected algorithm to be referred to by a reference identifier. In certain embodiments, libraries can be provided at both nodes, and/or in centralised or distributed databases, and optionally can use common or synchronised reference identifiers for the same algorithms. 
[0125] Optionally, the received reference corresponds to an algorithm stored in a library at the second node. [0126] By providing common or synchronised libraries of algorithms at both the first and the second nodes in at least some embodiments, and by transmitting a reference or reference identifier when transmitting the corresponding lower-quality visual data to allow selection of matching algorithms from both libraries using the reference identifier to identify the selected algorithm, only the reference identifier and the lower-quality visual data needs to be transmitted  between the nodes thus data transmission is reduced as the algorithm itself doesn't need to be transmitted. [0127] Optionally, if the second node cannot identify the selected algorithm, the second node sends a request to any of: the first node; a centralised database; or a distributed database for transmission of the selected algorithm to the second node. [0128] In some embodiments, by configuring the second node to be able to request models from a node (for example a first node or alternatively another node or a node to which multiple libraries are synchronised, depending on embodiment) in situations where the libraries at the second node and other node are not synchronised, the higher-quality visual data can still be reconstructed even if the transmitted reference does not correspond to an algorithm stored at the second node. This can prevent errors in the reconstruction process in some embodiments."(see paragraphs[0068],[0529],[0546],[0570], [0116]-[0128] of Wang). 
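For illustration only, the reference-based mechanism of paragraphs [0123]-[0128] of Wang quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Wang's implementation: the names `resolve_model`, `local_library`, and `request_from_remote` are hypothetical, and a model is represented simply as a callable.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical representation: a reconstruction model maps low-quality
# samples to reconstructed samples.
Model = Callable[[List[float]], List[float]]


def resolve_model(
    reference: str,
    local_library: Dict[str, Model],
    request_from_remote: Callable[[str], Optional[Model]],
) -> Model:
    """Return the model identified by `reference`.

    The second node first looks in its own library ([0125]). If the
    reference is unknown there, it requests the model itself from another
    node or database ([0127]); the request function is an assumption.
    """
    model = local_library.get(reference)
    if model is None:
        model = request_from_remote(reference)
        if model is None:
            raise KeyError(f"no model available for reference {reference!r}")
        # Cache the received model so later sections can reuse it.
        local_library[reference] = model
    return model
```

In this sketch only the reference identifier travels with the lower-quality data in the common case, and the model itself is transmitted only when the lookup at the second node fails.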
As noted above, Wang discusses a centralized database or a distributed database for transmission, but fails to disclose "a transmitting operation for, in response to reception of a streaming request from a user device" as recited in claim 1.
Further, the Supreme Court in KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 415-421, 82 USPQ2d 1385, 1395-97 (2007) identified a number of rationales to support a conclusion of obviousness which are consistent with the proper "functional approach" to the determination of obviousness as laid down in Graham. The key to supporting any rejection under 35 U.S.C. 103 is the clear articulation of the reason(s) why the claimed invention would have been obvious. The Supreme Court in KSR noted that the analysis supporting a rejection under 35 U.S.C. 103 should be made explicit. In Ball Aerosol v. Ltd. Brands, 555 F.3d 984, 89 USPQ2d 1870 (Fed. Cir. 2009), the Federal Circuit offered additional instruction as to the need for an explicit analysis. The Federal Circuit explained that the Supreme Court's requirement for an explicit analysis does not require record evidence of an explicit teaching of a motivation to combine in the prior art.
The Office Action merely cites many paragraphs of Wang in rejecting claim 1. However, it is unclear which portions of Wang discusses exactly "a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device" as recited in claim 1.
Thus, Wang et al. fails to disclose at least "grid generation pattern formation and dividing neural network file" as recited in claim 1. 
Accordingly, it is respectfully submitted that the combination of Wang et al. and Glennon et al. fails to teach or suggest the features as recited in claim 1.


Examiner’s response:
Applicant states that “it is unclear which portions of Wang discusses exactly "a transmitting operation for, in response to reception of a streaming request from a user device, dividing requested video data and a neural network file required to recover resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device."”
Examiner cited, in part, paragraphs [0116]-[0128] in reference to the claim limitation in question. Paragraph [0116] states:
By transmitting a lower-quality version of a visual data in some embodiments, such as a section of low-quality visual data or series of visual data sections, together with a library reference to an algorithm (i.e. any or all of an algorithm, reconstruction algorithm, model, reconstruction model, parameters or reconstruction parameters) to aid reconstruction of higher quality visual data, such as a high-resolution video frame or series of frames, in at least some embodiments less data can be transferred over a network to enable high-quality visual data to be viewed compared to transmitting the high quality visual data alone.

This paragraph alone discloses essentially the limitation in question, and the other citations offer further details. Thus, Examiner maintains the rejection.


Regarding Applicant’s arguments on pages 17-19:
Claim 2 recites "The method of claim 1, wherein the generating operation comprises: 
a file generating operation for generating a basic neural network file based on a plurality of video data items included in a preset data set; 
an additional learning operation for, in response to a determination that any acquired new video data satisfies an additional learning condition, performing additional learning on the new video data, wherein the additional learning is performed through an artificial neural network algorithm to which the basic neural network file is applied; and 
a specialized neural network file generating operation for generating a downscaled file of the new video data as a result of the additional learning and a specialized neural network file corresponding to the new video data." Wang et al. discusses "[0025] The above formula further illustrates a direct relationship between the bitrate and the resolution of the video, i.e. variables w and h. In order to reduce the resolution of video, several techniques exist to downscale the resolution of video data to reduce the bitrate. [0102] Optionally, the hierarchical  algorithm is developed from a known hierarchical algorithm. [0103] In some embodiments, developing a new hierarchical algorithm from a known hierarchical algorithm that was trained on similar visual data to the visual data on which the new algorithm is to be trained can reduce the time and/or computational effort required to train the new hierarchical algorithm. [0132] In some embodiments, by dividing the visual data into smaller sections, where the sections can be sequences of frames or portions of one or more frames, and where the division can be based on a particular metric for similarity, more efficient models can be selected. For example, in some embodiments multiple sections can be grouped, all of which comprise part of a landscape shot, and one model can be used to reconstruct the scene, i.e. sequence of frames, as opposed to a using a different model for every separate frame in the scene. In some embodiments, if the next scene in the visual data is very different (for example a scene with significant movement after a scene showing a still landscape), then the scene can be detected as being very different and a new model can be selected accordingly for the scene. In some embodiments, specific models can be selected for each scene or section, allowing at least some optimisation or adapting of the reconstruction model(s) in comparison to the use of a generic model for the whole of the visual data. 
[0447] Optionally, within step 2750, in some embodiments the full-resolution frames can be grouped into scenes or sections of frames having common features, otherwise known as "scene selection". The video data is split or clustered into scenes to enable more specific optimisation. By scene, it is meant a consecutive group or sequence of frames comprising a section of video, which at the coarsest level can be the entire video or at the most granular level can be a single frame. In some embodiments, the scenes can be arranged according to the order in which they were filmed or into a sequence for appropriate decoding of any group of pictures (as scenes will typically comprise related visual content and appropriate ordering can allow compression to work efficiently, for example)."(see paragraphs[0025],[0102]-[0103], [0132], and [0447] of the present invention). 
As noted above, Wang et al. merely discusses "by dividing the visual data into smaller sections, where the sections can be sequences of frames or portions of one or more frames, and where the division can be based on a particular metric for similarity, more efficient models can be selected. For example, in some embodiments multiple sections can be grouped, all of which comprise part of a landscape shot, and one model can be used to reconstruct the scene, i.e. sequence of frames, as opposed to a using a different model for every separate frame in the scene. ... developing a new hierarchical algorithm from a known hierarchical algorithm that was trained on similar visual data to the visual data on which the new algorithm is to be trained can reduce the time and/or computational effort required to train the new hierarchical algorithm."
Thus, Wang et al. fails to disclose "a specialized neural network file generating operation for generating a downscaled file of the new video data as a result of the additional learning and a specialized neural network file corresponding the downscaled file of the new video data based on the basic neural network" as recited in claim 2. 
Accordingly, it is respectfully submitted that Wang et al. fails to disclose the features as recited in claim 2.


Examiner’s response:
Examiner sees no specific arguments which may be responded to. Applicant simply alleges that the prior art of record does not disclose the claim language in question, and then cites language from Applicant’s specification and further language from the cited art. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.


Regarding Applicant’s arguments on pages 19-23:
Claim 3 recites "The method of claim 2, wherein the additional learning operation comprises an operation for determining whether the additional learning condition is satisfied according to a structural similarity (SSIM) and a peak-signal-to-noise ratio (PSNR) that are obtained by performing resolution recovery on the downscaled file of the new video data based on the basic neural network." Claim 3 is also allowable due at least to its dependency from claim 1, as well as for the additional recitations therein.
In addition, it is respectfully submitted that Wang et al. fails to disclose the features due to the same or similar rationales as claim 1. 
Further, claims 4-5 are also allowable due at least to their dependencies from claim 1, as well as for the additional recitations therein.
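For illustration only, the two metrics named in claim 3 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or the cited art: SSIM is computed over a single global window rather than the usual sliding window, frames are flat sequences of pixel values, and the function names, the thresholds `min_psnr` and `min_ssim`, and the rule that additional learning is triggered when either metric falls below its threshold are all hypothetical.

```python
import math
from typing import Sequence


def psnr(original: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = sum((a - b) * (a - b) for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value * max_value / mse)


def global_ssim(x: Sequence[float], y: Sequence[float],
                max_value: float = 255.0) -> float:
    """Structural similarity over one global window (a simplification)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * max_value) ** 2  # conventional stabilising constants
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def needs_additional_learning(original: Sequence[float],
                              restored: Sequence[float],
                              min_psnr: float = 30.0,
                              min_ssim: float = 0.9) -> bool:
    """Hypothetical condition: retrain when recovery quality is too low."""
    return (psnr(original, restored) < min_psnr
            or global_ssim(original, restored) < min_ssim)
```

Here `restored` stands for the output of resolution recovery performed on the downscaled file with the basic neural network, and `original` for the corresponding full-resolution data.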
Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (Pub. No.: US 2017/0347061) in view of Glennon et al. (Pub. No.: US 2016/0301944), and further in view of Kamiya (Pat. No.: US 5,293,454).
Claim 6 recites "The method of claim 1, wherein the generating operation comprises: 
a learning importance identifying a learning importance assigned to a characteristic area or a specific frame chuck of learning video data and extracting option information indicated by the learning importance; and 
Wang et al. recites "[0371] The series of convolutions in a convolutional neural network model of some embodiments allow the neural network to be used to search for pixel correlations in a region far larger than the initial patch decomposition, but applying a weighting to weaken the importance of correlations for locations further away from the pixel of interest. In contrast, linear sparse coding approaches like dictionary learning are restricted to looking for correlations in an initial patch size, to avoid the computational complexity of searching the entire image. As a result, the method of these embodiments can more fully exploit the natural or non-local redundancies of the video frames and between a sequence of video frames." (see paragraph[0371] of Wang et al.). 
Further, claim 6 recites "a neural network file calculating operation for inputting original data in an original size and processed data reduced to a preset rate in the learning video data into a convolution neural network (CNN) algorithm to be learned, wherein a neural network file is generated including a parameter and an activation function of an artificial neural network, the parameter and the activation function which cause a match rate between a computation result value obtained by inputting the processed data into an artificial neural network and the original data to be equal to or greater than a preset value, wherein the option information comprises a learning number and information regarding whether learning is performed through similar data."
Wang et al. discloses "[0378] Optionally, at step 90 (or step 190), in some embodiments the full-resolution frames can be grouped into scenes or sections of frames having common features, otherwise known as "scene selection". The video data is split or clustered into scenes to enable more specific training and optimisation. By scene, it is meant a consecutive group or sequence of frames, which at the coarsest level can be the entire video, a single frame or at the most granular level can be or a section/segment of a frame.[0386] Regardless whether the video has been broken into frames or scenes (i.e. groups of multiple frames) in step 90 (or step 190) or remains as a sequence of frames from step 80 (or step 190), each frame is down-sampled into lower resolution frames at a suitably lower resolution. Optionally, in some embodiments this step can occur before the frames are grouped into scenes in step 80 (or step 190), so step 90 (or step 190) can be exchanged with step 90 (or step 190) in these alternative embodiments. The lower- resolution frame can, for example, be 33% to 50% of the data size relative to the data size of the original-resolution frame, but in should be appreciated that in embodiments the lower resolution can be any resolution that is lower than the original resolution of the video."(see paragraph [0378] of Wang et al.) The Office Action acknowledges that "Wang does not disclose wherein the option information comprises a learning number and information regarding whether learning is performed through similar data." 
Kamiya discloses "In evaluation by the neural network, the boundaries of the above categories, which were obtained by learning, are used for outputting the categories to which the inputted patterns belong. Therefore, it is essential that the learning patterns used for learning of the neural network should be the type (1) learning patterns. Namely, the type (2) learning patterns are disposed adjacent to the central portion of category A and thus, do not greatly contribute to learning of the neural network, which is performed for obtaining the boundaries of the categories, thereby resulting in extreme increase of learning time. Meanwhile, since the type (3) learning patterns are exceptional patterns or defective patterns, learning based on the type (3) learning patterns becomes learning regarding exceptional or improper examples, i.e. excessive learning. Hence, learning based on the type (3) learning patterns increases learning time extremely and degrades classification performance at the time of evaluation."(see col. 3, lines 23-41 of Kamiya). 
As shown above, Kamiya discusses about merely learning time, but fails to disclose "wherein the option information comprises a learning number" as recited in claim 6.
Accordingly, it is respectfully submitted that the combination of Wang et al., Glennon et al., and Kamiya fails to teach or suggest the features as recited in claim 6.
Claim 7 recites "The method of claim 6, wherein: the generating operation further comprises a similar data acquiring operation for, when performing learning on learning data with a learning importance set thereto, in response to identifying an instruction for performing learning using the similar data, acquiring the similar data similar to a target image to be learned; and the similar data is acquired based on similarity in resolution and color combinations." (emphasis added).
Wang et al. discusses "[0489] Alternatively, and in order for any delay to be reduced between the arrival of the lower resolution video 1510 and the generation of a higher resolution output video 370, in some embodiments basic scene selection in step 1610 can be accomplished by grouping frames into scenes chronologically, for example by applying time stamps and collecting together a predetermined range of timestamps. Initially, in these embodiments the first frame or section of frame can be analysed and a metric created to enable reference to the library of reconstruction models in step 1620. In such embodiments, if a subsequent frame or section of a subsequent frame is sufficiently similar according to a comparison metric, then the subsequent frame or frames can be included as part of a group of frames with the first frame. In these embodiments, this process can continue until a subsequent frame is not sufficiently similar to the previous frame or frames, at which point the frame or group of frames are collated into a scene in step 410. in these embodiment the process then starts to find the next scene, starting from the insufficiently similar scene. In such embodiments, each scene is processed in the order in which it was decoded from the received video 1510."(see paragraph[0489] of Wang et al.). 
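For illustration only, the chronological grouping described in the quoted paragraph [0489] of Wang can be sketched as follows. This is a minimal sketch under stated assumptions: Wang does not specify the comparison metric, so the mean absolute pixel difference against the previous frame and the threshold `max_difference` are hypothetical choices, as are the function names.

```python
from typing import List, Sequence

Frame = Sequence[float]  # hypothetical: a frame as a flat run of pixel values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Assumed comparison metric: mean absolute pixel difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def group_into_scenes(frames: Sequence[Frame],
                      max_difference: float = 10.0) -> List[List[Frame]]:
    """Group frames, in arrival order, into scenes of similar frames.

    A frame joins the current scene while it is sufficiently similar to
    the previous frame; otherwise it starts the next scene.
    """
    scenes: List[List[Frame]] = []
    for frame in frames:
        if scenes and mean_abs_diff(scenes[-1][-1], frame) <= max_difference:
            scenes[-1].append(frame)
        else:
            scenes.append([frame])
    return scenes
```

The sketch compares each frame only with its immediate predecessor and makes no use of resolution or color combinations.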
As shown above, paragraph[0489] of Wang merely discusses comparing similarity between subsequent frames, but fails to disclose “[the language of claim 7]” as recited in claim 7. 
As noted above, Wang et al. fails to disclose "the similar data is acquired based on similarity in resolution and color combinations" as recited in claim 7. 
Accordingly, it is respectfully submitted that the combination of Wang et al., Glennon et al., and Kamiya fails to teach or suggest the features as recited in claim 7.


Examiner’s response:
Examiner sees no specific arguments which may be responded to. Applicant simply alleges that the prior art of record does not disclose the claim language in question, and then cites language from Applicant’s specification and further language from the cited art. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Further, Examiner disagrees that “Kamiya discusses about merely learning time.” The cited section from column 3 of Kamiya discusses many different elements of learning patterns. Further, the language in question in claim 6 which Kamiya is brought in to help teach is so broad that the disclosure of Kamiya is detailed enough such that one of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to modify Wang based on said teaching in order to arrive at Applicant’s invention as currently claimed. Were Applicant to more narrowly claim their invention, the prior art of record could be overcome.


Conclusion
Claims 1-8 are rejected.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Joshua D Taylor whose telephone number is (571)270-3755. The examiner can normally be reached Monday - Friday 8 am - 6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nasser Goodarzi can be reached on 571-272-4195. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Joshua D Taylor/
Primary Examiner, Art Unit 2426
October 25, 2022