DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 3, 5, 14 and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 3, 5, 14 and 16 recite the limitation "the group" in lines 3, 2, 4 and 2, respectively. There is insufficient antecedent basis for this limitation in the claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

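Claims 1-4, 6, 9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), and further in view of Yu (U.S. PG-PUB NO. 2017/0127016 A1).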
-Regarding claim 1, Pan discloses a method for implementing processing video, comprising (Figure 2): receiving a text-based description of at least one active scene (Figure 1; Figure 6); representing, by a processor device, the text-based description as a word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); using a text encoder implemented by neural network (Figure 2, generator network (G), LSTM-based encoder) to output at least one frame level textual representation (Figure 2 (input of discriminator (D) – frame and motion discriminators path), page 1792, 2nd column, 1st paragraph, “synthetic frame”, $f_{syn}^{i}$) and at least one video level representation (Figure 2 (input of discriminator (D) – video discriminator path), page 1792, 2nd column, 1st paragraph, “synthetic video”, $v_{syn}$) of the word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); generating, by a shared generator, at least one frame by frame video based on the at least one frame level textual representation (Abstract; Figure 2, frame and motion discriminators path (grey path)), the at least one video level representation (Abstract; Figure 2, video discriminator path (blue path)) and noise vectors sampled from a Gaussian distribution (Figure 2, Noise (z); section 3.1, “normal distribution”); generating a frame level and a video level convolutional filter (Figure 2, last two blocks of video discriminator path, and frame & motion discriminators path) of a video discriminator to classify frames and video of the at least one frame by frame video as true or false (Abstract; Figure 2, discriminator network (D), page 1792, section 3.2.2, 1st paragraph, “real/fake”, “whether the input video is real”); and training a conditional video generator (Figure 2) that includes the text encoder (Figure 2, generator network (G)), the video discriminator (Figure 2, discriminator network (D)) and the shared generator in a generative adversarial network (Figure 2, TGANs-C) to convergence (page 1791, section 3, 1st paragraph, “The training … performed by optimizing the generator network and discriminator network …. in a two-player minimax game mechanism”; page 1794, algorithm 1).
Pan is silent as to a shared generator.
In the same field of endeavor, Tulyakov teaches generating, by a shared generator (Tulyakov: Figure 2, generator $G_I$), at least one frame by frame video (Tulyakov: Figure 2, set $\tilde{V}$) based on the at least one frame level textual representation (Tulyakov: Figure 2, $Z_C$), the at least one video level representation (Tulyakov: Figure 2, $Z_M$), and noise vectors sampled from a Gaussian distribution (Tulyakov: Figure 2, $\epsilon^{(k)}$, page 4, 1st column, 4th paragraph, line 2, “Gaussian distribution”); generating a frame level and a video level convolutional filter of a video discriminator to classify frames and video of the at least one frame by frame video as true or false (Tulyakov: Figure 2, $D_I$, $D_V$); and training a conditional video generator (Tulyakov: Figure 2) that includes the text encoder, the video discriminator (Tulyakov: Figure 2, $D_V$), and the shared generator (Tulyakov: Figure 2, generator $G_I$) in a generative adversarial network to convergence (Tulyakov: page 4, 2nd column, fourth paragraph, “$D_V$ … sufficient for training $G_I$ and $R_M$ … using $D_I$ … improves the convergence of the adversarial training”, equations (4)-(6)).
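Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan with the teaching of Tulyakov by using shared generator to generate video from text in order to handle videos of a varying length in latent space and produce high resolution video.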
Pan in view of Tulyakov is silent as to a processor device for implementing the method to process video.
However, Yu is analogous art pertinent to the problem to be solved in this application and further discloses a method for implementing processing video, comprising: representing, by a processor device, the text-based description as a word embedding matrix (Yu: Abstract; FIGS. 1-2, 9).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Yu in order to provide a better implementation of word embedding method based on deep learning.
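For orientation only, the text-side data flow recited in claim 1 (a text description, a word embedding matrix, then frame level and video level representations) can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption taken from neither the application nor the cited references: the toy vocabulary, the embedding width, the helper names (`embed_text`, `video_level_representation`, `frame_level_representations`), and the use of mean pooling over word windows in place of a trained neural text encoder.

```python
import random

# Hypothetical toy vocabulary and embedding width (assumptions for illustration).
EMBED_DIM = 4
VOCAB = ["a", "person", "is", "playing", "golf"]

rng = random.Random(0)
# One random embedding vector per word; a real system would learn these.
EMBEDDINGS = {w: [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for w in VOCAB}


def embed_text(description):
    """Represent a text description as a word embedding matrix (one row per word)."""
    return [EMBEDDINGS[w] for w in description.lower().split()]


def mean_rows(rows):
    """Average a list of equal-length vectors column by column."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def video_level_representation(matrix):
    """Stand-in for a video level text feature: pool over the whole sentence."""
    return mean_rows(matrix)


def frame_level_representations(matrix, num_frames):
    """Stand-in for frame level text features: pool over one contiguous word window per frame."""
    n = len(matrix)
    reps = []
    for k in range(num_frames):
        start = (k * n) // num_frames
        stop = max(start + 1, ((k + 1) * n) // num_frames)
        reps.append(mean_rows(matrix[start:stop]))
    return reps


matrix = embed_text("a person is playing golf")
video_rep = video_level_representation(matrix)
frame_reps = frame_level_representations(matrix, num_frames=3)
```

The sketch only shows the shapes involved: one matrix row per word, a single vector for the whole video, and one vector per frame. It takes no position on how Pan, Tulyakov, or Yu compute these quantities.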
-Regarding claim 2, the modification further discloses using the trained conditional video generator to generate at least one further video from additional text (Pan: FIG. 2; section 3.2.1, “embedded word sequence”; section 4.2; FIGS. 5-6).
-Regarding claim 3, the modification further discloses generating the at least one further video for an input to a process selected from the group consisting of multimedia applications (Pan: Abstract; section 5, 1st paragraph), generating synthetic datasets (Pan: Abstract, “synthetic datasets”, page 1792, 2nd column, 1st paragraph, “synthetic video”), model-based reinforcement learning systems (Pan: Abstract; Figure 2; algorithm 1), and domain adaptation (Pan: Abstract; Figure 2; algorithm 1).
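-Regarding claim 4, the modification further discloses training an artificial intelligence system based on the at least one further video (Pan: Abstract; Figure 2; algorithm 1; section 3.2.2, “real video”, “synthetic one”; equations (3)-(4)).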
-Regarding claim 6, the modification further discloses wherein using the trained conditional video generator to generate at least one further video (Pan: Abstract; Figure 2) further comprises: producing a variable length video.
Pan is silent as to producing a variable length video.
In the same field of endeavor, Tulyakov teaches wherein using the trained conditional video generator to generate at least one further video further comprises: producing a variable length video (Tulyakov: Abstract; Figure 2; page 4, 2nd column, 1st paragraph, “video length K can vary”; FIG. 6).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan with the teaching of Tulyakov by using shared generator to generate video from text in order to handle videos of a varying length in latent space and produce high resolution video.
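The cited passage (“video length K can vary”) can be illustrated by a minimal sketch in which one shared per-frame generator is applied K times, so that the output length is set by K alone. The names `shared_frame_generator` and `generate_video`, the vector sizes, and the toy arithmetic standing in for a trained network are all assumptions made for illustration; this is not the architecture of Tulyakov or of the application.

```python
import random


def shared_frame_generator(content, motion):
    """Toy stand-in for a shared frame generator: the same function is used for every frame."""
    return [c + m for c, m in zip(content, motion)]


def generate_video(content, num_frames, seed=0):
    """Produce num_frames frames from one content vector plus per-frame Gaussian noise."""
    rng = random.Random(seed)
    frames = []
    for _ in range(num_frames):
        # Fresh noise vector per frame, sampled from a standard Gaussian distribution.
        noise = [rng.gauss(0.0, 1.0) for _ in content]
        frames.append(shared_frame_generator(content, noise))
    return frames


content = [0.5, -0.2, 0.1]
short_video = generate_video(content, num_frames=4)
long_video = generate_video(content, num_frames=9)
```

Because the generator is shared across frames, changing `num_frames` changes only how many times it is called; with the same seed, the shorter video is a prefix of the longer one.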
-Regarding claim 9, the modification further discloses determining a photo-realistic video synthesis using a deeper generator-discriminator (Pan: section 3, 1st paragraph, “optimizing the generator network and discriminator network”, “synthetic video …with matched caption”, “enhance the image reality and semantic alignment”; Figures 2, 6).
-Regarding claim 11, Pan is silent as to wherein the shared frame generator network further comprises: a motion and content decomposed generative adversarial network (MoCoGAN) that generates a video by mapping a sequence of random vectors to a sequence of video frames.
In the same field of endeavor, Tulyakov teaches wherein the shared frame generator network (Tulyakov: Figure 2, generator $G_I$) further comprises: a motion and content decomposed generative adversarial network (MoCoGAN) that generates a video by mapping a sequence of random vectors to a sequence of video frames (Tulyakov: Abstract; Figure 2).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan with the teaching of Tulyakov by using shared generator to generate video from text in order to handle videos of a varying length in latent space and produce high resolution video.
Claims 12-15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), and further in view of Kliger (U.S. PG-PUB NO. 2018/0336439 A1).
-Regarding claim 12, Pan discloses: receive a text-based description of at least one active scene (Figure 1; Figure 6); represent the text-based description as a word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); and use a text encoder implemented by neural network (Figure 2, generator network (G), LSTM-based encoder) to output at least one frame level textual representation (Figure 2 (input of discriminator (D) – frame and motion discriminators path), page 1792, 2nd column, 1st paragraph, “synthetic frame”, $f_{syn}^{i}$) and at least one video level representation (Figure 2 (input of discriminator (D) – video discriminator path), page 1792, 2nd column, 1st paragraph, “synthetic video”, $v_{syn}$) of the word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); generate, by a shared generator, at least one frame by frame video based on the at least one frame level textual representation (Abstract; Figure 2, frame and motion discriminators path (grey path)), the at least one video level representation (Abstract; Figure 2, video discriminator path (blue path)) and noise vectors sampled from a Gaussian distribution (Figure 2, Noise (z); section 3.1, “normal distribution”); and train a conditional video generator (Figure 2) that includes the text encoder (Figure 2, generator network (G)), the video discriminator (Figure 2, discriminator network (D)) and the shared generator in a generative adversarial network (Figure 2, TGANs-C) to convergence (page 1791, section 3, 1st paragraph, “The training … performed by optimizing the generator network and discriminator network …. in a two-player minimax game mechanism”; page 1794, algorithm 1).
Pan is silent as to a shared generator.
In the same field of endeavor, Tulyakov teaches: generate, by a shared generator (Tulyakov: Figure 2, generator $G_I$), at least one frame by frame video (Tulyakov: Figure 2, set $\tilde{V}$) based on the at least one frame level textual representation (Tulyakov: Figure 2, $Z_C$), the at least one video level representation (Tulyakov: Figure 2, $Z_M$), and noise vectors sampled from a Gaussian distribution (Tulyakov: Figure 2, $\epsilon^{(k)}$, page 4, 1st column, 4th paragraph, line 2, “Gaussian distribution”); and train a conditional video generator (Tulyakov: Figure 2) that includes the text encoder, the video discriminator (Tulyakov: Figure 2, $D_V$), and the shared generator (Tulyakov: Figure 2, generator $G_I$) in a generative adversarial network to convergence (Tulyakov: page 4, 2nd column, fourth paragraph, “$D_V$ … sufficient for training $G_I$ and $R_M$ … using $D_I$ … improves the convergence of the adversarial training”, equations (4)-(6)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan with the teaching of Tulyakov by using shared generator to generate video from text in order to handle videos of a varying length in latent space and produce high resolution video.
Pan in view of Tulyakov is silent as to a computer system for processing video, comprising: a processor device operatively coupled to a memory device, the processor device being configured to generate a video from text description.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses a computer system (Kliger: FIG. 1, FIGS. 4-5) for processing video (Kliger: [0047], “GPU 408 may be configured to render … videos”), comprising: a processor device (Kliger: FIG. 4, CPU 402, GPU 408) operatively coupled to a memory device (Kliger: FIG. 4, device 404), the processor device being configured to generate a video from text description (Kliger: Abstract; [0027], “GANs may be used … text-to-image generation, video generation”; [0033], “input may be visual data, text”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
-Regarding claim 13, Pan in view of Tulyakov teaches: use the trained conditional video generator to generate at least one further video from additional text (Pan: FIG. 2; section 3.2.1, “embedded word sequence”; section 4.2; FIGS. 5-6).
Pan in view of Tulyakov is silent as to wherein the processor device is further configured to: use the trained conditional video generator to generate at least one further video from additional text.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses wherein the processor device (Kliger: FIG. 1, FIGS. 4-5) is further configured to: use the trained conditional video generator to generate at least one further video from additional text.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
-Regarding claim 14, Pan in view of Tulyakov teaches: generate the at least one further video for an input to a process selected from the group consisting of multimedia applications (Pan: Abstract; section 5, 1st paragraph), generating synthetic datasets (Pan: Abstract, “synthetic datasets”, page 1792, 2nd column, 1st paragraph, “synthetic video”), model-based reinforcement learning systems (Pan: Abstract; Figure 2; algorithm 1), and domain adaptation (Pan: Abstract; Figure 2; algorithm 1).
Pan in view of Tulyakov is silent as to wherein the processor device is further configured to perform the above steps.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses wherein the processor device (Kliger: FIG. 1, FIGS. 4-5) is further configured to: generate the at least one further video for an input to a process selected from the group consisting of multimedia applications, generating synthetic datasets, model-based reinforcement learning systems, and domain adaptation.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
-Regarding claim 15, Pan in view of Tulyakov teaches: train an artificial intelligence system based on the at least one further video (Pan: Abstract; Figure 2; algorithm 1; section 3.2.2, “real video”, “synthetic one”; equations (3)-(4)).
Pan in view of Tulyakov is silent as to wherein the processor device is further configured to perform the above steps.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses wherein the processor device (Kliger: FIG. 1, FIGS. 4-5) is further configured to: train an artificial intelligence system based on the at least one further video.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
-Regarding claim 17, Pan in view of Tulyakov teaches using the trained conditional video generator to generate at least one further video (Pan: Abstract; Figure 2).
Pan is silent as to producing a variable length video.
In the same field of endeavor, Tulyakov teaches wherein using the trained conditional video generator to generate at least one further video further comprises: producing a variable length video (Tulyakov: Abstract; Figure 2; page 4, 2nd column, 1st paragraph, “video length K can vary”; FIG. 6).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan with the teaching of Tulyakov by using shared generator to generate video from text in order to handle videos of a varying length in latent space and produce high resolution video.
Pan in view of Tulyakov is silent as to wherein the processor device is further configured to perform the above steps.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses wherein, when using the trained conditional video generator to generate at least one further video, the processor device (Kliger: FIG. 1, FIGS. 4-5) is further configured to: produce a variable length video.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), further in view of Yu (U.S. PG-PUB NO. 2017/0127016 A1) and in view of Kliger (U.S. PG-PUB NO. 2018/0336439 A1).
-Regarding claim 20, Pan discloses receiving a text-based description of at least one active scene (Figure 1; Figure 6); representing, by a processor device, the text-based description as a word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); using a text encoder implemented by neural network (Figure 2, generator network (G), LSTM-based encoder) to output at least one frame level textual representation (Figure 2 (input of discriminator (D) – frame and motion discriminators path), page 1792, 2nd column, 1st paragraph, “synthetic frame”, $f_{syn}^{i}$) and at least one video level representation (Figure 2 (input of discriminator (D) – video discriminator path), page 1792, 2nd column, 1st paragraph, “synthetic video”, $v_{syn}$) of the word embedding matrix (Figure 2, sentence (S); page 1792, section 3.2.1); generating, by a shared generator, at least one frame by frame video based on the frame level textual representation (Abstract; Figure 2, frame and motion discriminators path (grey path)), the video level representation (Abstract; Figure 2, video discriminator path (blue path)) and noise vectors sampled from a Gaussian distribution (Figure 2, Noise (z); section 3.1, “normal distribution”); generating a frame level and a video level convolutional filter (Figure 2, last two blocks of video discriminator path, and frame & motion discriminators path) of a video discriminator to classify frames and video of the at least one frame by frame video as true or false (Abstract; Figure 2, discriminator network (D), page 1792, section 3.2.2, 1st paragraph, “real/fake”, “whether the input video is real”); and training a conditional video generator (Figure 2) that includes the text encoder (Figure 2, generator network (G)), the video discriminator (Figure 2, discriminator network (D)) and the shared generator in a generative adversarial network (Figure 2, TGANs-C) to convergence (page 1791, section 3, 1st paragraph, “The training … performed by optimizing the generator network and discriminator network …. in a two-player minimax game mechanism”; page 1794, algorithm 1).
Pan does not teach a shared generator.
In the same field of endeavor, Tulyakov teaches generating, by a shared generator (Tulyakov: Figure 2, generator $G_I$), at least one frame by frame video (Tulyakov: Figure 2, set $\tilde{V}$) based on the frame level textual representation (Tulyakov: Figure 2, $Z_C$), the video level representation (Tulyakov: Figure 2, $Z_M$), and noise vectors sampled from a Gaussian distribution (Tulyakov: Figure 2, $\epsilon^{(k)}$; page 4, 1st column, 4th paragraph, line 2, “Gaussian distribution”); generating a frame level and a video level convolutional filter of a video discriminator to classify frames and video of the at least one frame by frame video as true or false (Tulyakov: Figure 2, $D_I$, $D_V$); and training a conditional video generator (Tulyakov: Figure 2) that includes the text encoder, the video discriminator (Tulyakov: Figure 2, $D_V$), and the shared generator (Tulyakov: Figure 2, generator $G_I$) in a generative adversarial network to convergence (Tulyakov: page 4, 2nd column, fourth paragraph, “$D_V$ … sufficient for training $G_I$ and $R_M$ … using $D_I$ … improves the convergence of the adversarial training”, equations (4)-(6)).

Pan in view of Tulyakov does not teach a processor device for implementing the method to process video.
However, Yu is analogous art pertinent to the problem to be solved in this application and further discloses a method for implementing video processing, comprising representing, by a processor device, the text-based description as a word embedding matrix (Yu: Abstract; FIGS. 1-2, 9).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Pan in view of Tulyakov with the teaching of Yu in order to provide a better implementation of a word embedding method based on deep learning.
Pan in view of Tulyakov, further in view of Yu, does not teach a computer program product for implementing a text filter generative adversarial network (TF-GAN), the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to perform the method to generate a video from text description.
However, Kliger is analogous art pertinent to the problem to be solved in this application and further discloses a computer program product (Kliger: FIG. 1, FIG. 4) for implementing a text filter generative adversarial network (TF-GAN), the computer program product comprising a non-transitory computer readable storage medium (Kliger: FIG. 5, readable media 500) having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to perform the method to generate a video from text description (Kliger: [0056]-[0057]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, further in view of Yu, with the teaching of Kliger in order to provide a better system configuration for the implementation of generating video from text.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), further in view of Yu (U.S. PG-PUB NO. 2017/0127016 A1), and in view of Chen (IEEE TR. on Cognitive and Development Systems, Vol 11, Issue 1, March 2019, page 13-25).
-Regarding claim 5, Pan in view of Tulyakov, and further in view of Yu, does not teach wherein the artificial intelligence system is selected from the group consisting of a control system for self-driving cars and a surveillance system.
In the same field of endeavor, Chen teaches wherein the artificial intelligence system is selected from the group consisting of a control system for self-driving cars and a surveillance system (Chen: Abstract; Figure 1).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Yu, with the teaching of Chen by using textual .
Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), further in view of Yu (U.S. PG-PUB NO. 2017/0127016 A1), and in view of Sherman (U.S. PG-PUB NO. 2012/0100825 A1).
-Regarding claim 7, Pan in view of Tulyakov, and further in view of Yu, does not teach wherein the at least one active scene includes a text-based dangerous traffic scene description.
In the same field of endeavor, Sherman teaches wherein the at least one active scene includes a text-based dangerous traffic scene description (Sherman: Abstract; Fig. 1; [0018], “text description provided by observer … video, audio”; [0021], “dangerous driving”, “suspicious package”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Yu, with the teaching of Sherman by using a textual description of an active scene to generate video from text, in order to help improve the performance of a prioritizing and routing emergent activity reporting system.
-Regarding claim 8, Pan in view of Tulyakov, and further in view of Yu, does not teach wherein the at least one active scene includes a suspicious scene description.
In the same field of endeavor, Sherman teaches wherein the at least one active scene includes a suspicious scene description (Sherman: Abstract; Fig. 1; [0018], “text description provided by observer … video, audio”; [0021], “dangerous driving”, “suspicious package”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Yu, with the teaching of Sherman by using a textual description of an active scene to generate video from text, in order to help improve the performance of a prioritizing and routing emergent activity reporting system.
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), further in view of Kliger (U.S. PG-PUB NO. 2018/0336439 A1), and in view of Chen (IEEE TR. on Cognitive and Development Systems, Vol 11, Issue 1, March 2019, page 13-25).
-Regarding claim 16, Pan in view of Tulyakov, and further in view of Kliger, does not teach wherein the artificial intelligence system is selected from the group consisting of a control system for self-driving cars and a surveillance system.
In the same field of endeavor, Chen teaches wherein the artificial intelligence system is selected from the group consisting of a control system for self-driving cars and a surveillance system (Chen: Abstract; Figure 1).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Kliger, with the teaching of Chen by using textual description of active scene to generate video from text in order to help and improve the .
Claims 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Pan (MM'17, October 23-27, 2017 Mountain View, CA, USA, Page 1789-1798) in view of Tulyakov (arXiv 1707.04993v2 14 Dec 2017), further in view of Kliger (U.S. PG-PUB NO. 2018/0336439 A1), and in view of Sherman (U.S. PG-PUB NO. 2012/0100825 A1).
-Regarding claim 18, Pan in view of Tulyakov, and further in view of Kliger, does not teach wherein the at least one active scene includes a text-based dangerous traffic scene description.
In the same field of endeavor, Sherman teaches wherein the at least one active scene includes a text-based dangerous traffic scene description (Sherman: Abstract; Fig. 1; [0018], “text description provided by observer … video, audio”; [0021], “dangerous driving”, “suspicious package”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Kliger, with the teaching of Sherman by using a textual description of an active scene to generate video from text, in order to help improve the performance of a prioritizing and routing emergent activity reporting system.
-Regarding claim 19, Pan in view of Tulyakov, and further in view of Kliger, does not teach wherein the at least one active scene includes a suspicious scene description.
In the same field of endeavor, Sherman teaches wherein the at least one active scene includes a suspicious scene description (Sherman: Abstract; Fig. 1; [0018], “text description provided by observer … video, audio”; [0021], “dangerous driving”, “suspicious package”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Pan in view of Tulyakov, and further in view of Kliger, with the teaching of Sherman by using a textual description of an active scene to generate video from text, in order to help improve the performance of a prioritizing and routing emergent activity reporting system.
Allowable Subject Matter
Claim 10 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAO LIU whose telephone number is (571)272-4539.  The examiner can normally be reached on Monday-Thursday and Alternate Fridays 8:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571) 272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.







/XIAO LIU/Examiner, Art Unit 2664

/PING Y HSIEH/Primary Examiner, Art Unit 2664