DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 04/07/2020.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6 and 10-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Independent claims 1, 11, and 12 recite:
accessing, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts;
segmenting each of the plurality of dynamic visual media scripts into scenes;
generating, for each of the plurality of dynamic visual media scripts, a character fingerprint identifying topics corresponding to each character within a corresponding dynamic visual media script, wherein the generating comprises (i) extracting both characters and topics from the dynamic visual media script and (ii) associating each of the topics with a corresponding character, wherein the character fingerprint identifies costumes of a given character and a topic corresponding to each costume; and
producing, for each scene within each dynamic visual media script, a scene vector identifying (iii) the topics included within a corresponding scene and (iv) a character fingerprint for each character occurring within the scene.
 
These limitations relate to organizing human activity and read on a human obtaining and observing dynamic visual media (e.g., multiple drawings, photographs, etc. on paper) along with a textual description of the same media; categorizing (i.e., segmenting) each of the drawings/photographs into different categories (i.e., topics); writing on paper descriptions of the characters present in each photograph (i.e., character fingerprint) and associating each category with the character(s) and topics (e.g., costume/clothing information); and writing a list containing the observed/categorized topics along with a character description for each photograph.
This judicial exception is not integrated into a practical application. For example, the claim language recites “an information handling device”, and [0040] of the as-filed specification states: “As shown in FIG. 4, computer system/server 12' in computing node 10' is shown in the form of a general-purpose computing device.” Therefore, only a general-purpose computer or computing device is described, and it is used mainly to apply the abstract idea. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer amounts to no more than a general-purpose computing device, as noted. The claims are not patent eligible.
With respect to claims 2 and 13, the claims recite:
wherein the segmenting comprises identifying, utilizing at least one scene segmentation technique to enclose semantic boundaries, a segment by identifying a portion of the dynamic visual media script having a costume and corresponding context that are consistent throughout the portion.

This relates to organizing human activity. It reads on a human categorizing (i.e., segmenting) each of the drawings/photographs into different categories (i.e., topics) following predefined rules/criteria to identify the characters' costumes and the context of the photographs. No additional limitations are present.

With respect to claims 3 and 14, the claims recite:
generating a topic-relationship graph across the dynamic visual media corpus, wherein the topic-relationship graph represents topics occurring within the plurality of dynamic visual media scripts as nodes and relationships between the topics as edges, wherein the edges are weighted with an occurrence frequency.

This relates to organizing human activity. It reads on a human drawing on paper a graph of categories (i.e., topics) and the relationships between them, including a value of occurrence (having previously counted how many times a certain topic/relationship occurs). No additional limitations are present.

With respect to claims 4 and 15, the claims recite:
wherein the producing a scene vector comprises utilizing the topic-relationship graph.

This relates to organizing human activity. It reads on a human writing a list containing the observed/categorized topics along with a character description for each photograph, using the graph of categories (i.e., topics) and their relationships drawn on paper. No additional limitations are present.

With respect to claims 5 and 16, the claims recite:
generating a scene-level character fingerprint by applying a time window corresponding to a scene to the dynamic visual media script, wherein the generating is carried out for the applied time window.

This relates to organizing human activity. It reads on a human writing on paper character descriptions (i.e., character fingerprints) for drawings/photographs from certain dates/times (i.e., labeled with certain dates/times). No additional limitations are present.

With respect to claims 6 and 17, the claims recite:
comprising receiving a dynamic visual script for recommendation of at least one costume.
This relates to organizing human activity. It reads on a human receiving a drawing/photograph on paper from another human and being asked to recommend an appropriate costume/outfit/clothing. No additional limitations are present.


With respect to claim 10, the claim recites:
generating a textual script for each scene corresponding to a dialogue included in a corresponding scene; and
wherein the generating and the producing are based upon the textual script.

This relates to organizing human activity. It reads on a human writing on paper a textual description of the drawings/photographs corresponding to a caption (i.e., dialogue) included in said drawing/photograph, and therefore basing both the description and the written list containing the observed/categorized topics along with the character description for each photograph on text (i.e., caption/textual description). No additional limitations are present.

Also, claims 11-19 are rejected under 35 U.S.C. 101 because they are drawn to a “signal” per se, as recited in the preamble, and as such are directed to non-statutory subject matter. In paragraphs [0047]-[0049] of the as-filed Specification, the term “computer program product” is described as follows: “[0047] The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.” However, the scope of the term “computer readable program code” is not defined, other than: “[0049] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.” Hence, one of ordinary skill in the art can interpret the terms “computer readable storage media” and “computer readable program code” to include both transitory and non-transitory signals. It does not appear that a claim reciting a signal encoded with functional descriptive material falls within any of the categories of patentable subject matter set forth in § 101. First, a claimed signal is clearly not a "process" under § 101 because it is not a series of steps. The other three § 101 classes of machine, compositions of matter and manufactures "relate to structural entities and can be grouped as ‘product’ claims in order to contrast them with process claims." 1 D. Chisum, Patents § 1.02 (1994).
The Applicant’s Specification presents a broad definition of what the “computer program product” and “computer readable program code” cover, and the as-filed Specification is silent as to the definition of “computer readable program code”. Hence, the claims appear to be drawn to transitory signals, which are not eligible subject matter. To overcome the present rejection, the Applicant is advised to amend the claims using the following terminology: “non-transitory computer program product,” “non-transitory computer readable program code,” and “non-transitory computer readable storage medium”. Such example terminology is also found in the Official Gazette, 1851 OG 212.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 6-9, 11-13, and 17-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhu, Shizhan, et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017. [Zhu et al.].

As to independent claim 1, Zhu et al. teaches:
A method for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media (see Figure 1: “Given an original wearer’s input photo (left) and different textual descriptions (second column), our model generates new outfits [i.e., costume] onto the photograph [i.e., dynamic visual media] (right three columns) while preserving the pose and body shape of the wearer [i.e., character].” and ¶ 10 of 1. Introduction (left column, page 1681): “To train our model, we extend the DeepFashion dataset [8] by annotating a subset of 79K upper-body images with sentence descriptions and human body annotations.”), the method comprising:
accessing, at an information handling device, a dynamic visual media corpus, wherein the dynamic visual media corpus comprises a plurality of dynamic visual media scripts (see ¶ 10 of 1. Introduction (left column, page 1681): “To train our model, we extend the DeepFashion dataset [8] by annotating a subset of 79K upper-body images with sentence descriptions [i.e., scripts] and human body annotations.” and Figure 2: “generator [i.e., information handling device]”);
segmenting each of the plurality of dynamic visual media scripts into scenes (see ¶ 3 of 3.5. Implementation Details and Dataset: “To train our framework we extended the publicly available DeepFashion dataset [8] with richer annotations (captions and segmentation maps). […] Training our algorithm requires segmentation maps and captions for each image. We manually annotated one sentence per photo [i.e., script (sentence) into scenes (photo)], describing only the visual facts (e.g., the color, texture of the clothes or the length of the sleeves), avoiding any subjective assessments. For segmentation, we first applied a semantic segmentation method (VGG model fine-tuned on the ATR dataset [7]) to all the images, and then manually checked correctness. We manually relabeled the incorrectly segmented samples with GrabCut [13].”);
generating, for each of the plurality of dynamic visual media scripts, a character fingerprint identifying topics corresponding to each character within a corresponding dynamic visual media script, wherein the generating comprises (i) extracting both characters and topics from the dynamic visual media script and (ii) associating each of the topics with a corresponding character, wherein the character fingerprint identifies costumes of a given character and a topic corresponding to each costume (see ¶ 4 of 3.1 Overview of FashionGAN: “To capture further information about the wearer [i.e., character], we extract a vector of binary attributes [i.e., vector identifying topics and character fingerprint], a, from the person’s face, body and other physical characteristics. Examples of attributes [i.e., topics] include gender, long/short hair, wearing/not wearing sunglasses and wearing/not wearing hat. The attribute vector may additionally capture the mean RGB values of skin color, as well as the aspect ratio of the person, representing coarse body size [i.e. association of topics with corresponding character (or wearer) (i.e., character fingerprints)]. These are the properties that our final generated image should ideally preserve.”); and
producing, for each scene within each dynamic visual media script, a scene vector identifying (iii) the topics included within a corresponding scene and (iv) a character fingerprint for each character occurring within the scene (see  ¶ 1 and 4 of 3.1 Overview of FashionGAN citation as in limitation above. “Our method requires training data in order to learn the mapping from one photo to the other given the description. […] we extract a vector of binary attributes [i.e., vector identifying topics and character fingerprint], a, from the person’s face, body and other physical characteristics. Examples of attributes [i.e., topics] include gender, long/short hair, wearing/not wearing sunglasses and wearing/not wearing hat. The attribute vector may additionally capture the mean RGB values of skin color, as well as the aspect ratio of the person, representing coarse body size [i.e. association of topics with corresponding character (or wearer) (i.e., character fingerprints)].” 
Here, a vector of binary attributes [i.e., vector identifying topics and character fingerprint] is created/extracted/produced for each photo [i.e., scene]; it comprises attributes [i.e., topics] and additional character/wearer/person characteristics such as mean RGB values of skin color and the aspect ratio of the person, representing coarse body size [i.e., association of topics with corresponding character (or wearer) (i.e., character fingerprints)].).
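
For illustration only, the kind of attribute vector described in the quoted passage can be sketched as follows. This is a minimal sketch and not code from Zhu et al.: the function name `attribute_vector`, the parameter names, the ordering of the entries, and the scaling of RGB values to the range [0, 1] are all assumptions made for this example.

```python
# Hypothetical sketch of a wearer attribute vector; names and layout are assumed.
def attribute_vector(gender_flag, long_hair, sunglasses, hat, skin_rgb, width, height):
    """Concatenate binary attributes, mean skin color, and body aspect ratio."""
    # Binary attributes (1 = present, 0 = absent).
    binary = [int(gender_flag), int(long_hair), int(sunglasses), int(hat)]
    # Mean RGB of skin color, scaled to [0, 1] (the scaling is an assumption).
    mean_rgb = [channel / 255.0 for channel in skin_rgb]
    # Aspect ratio of the person as a coarse proxy for body size.
    aspect_ratio = width / height
    return binary + mean_rgb + [aspect_ratio]

vec = attribute_vector(True, False, True, False, (255, 0, 51), 1, 2)
```

Under this reading, each photograph yields one such vector, which is the structure the rejection maps to the claimed character fingerprint and scene vector.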

As to independent claim 11, Zhu et al. teaches:
An apparatus for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media (see Figure 1 citation as in claim 1 and Figure 2: Proposed Framework, “generators”), the apparatus comprising:
at least one processor (see ¶ 10 of 1. Introduction: “To train our model, we extend the DeepFashion dataset [8] by annotating a subset of 79K upper-body images with sentence descriptions and human body annotations1. Extensive quantitative and qualitative comparisons are performed against existing GAN baselines and 2D nonparametric approaches. […] 1The data and code can be found at  http://mmlab.ie.cuhk.edu.hk/projects/FashionGAN/.” 
It is interpreted that the trained model, GAN baselines, etc., are implemented/run on a computing device that contains a processing device and a non-transitory computer readable storage medium.); and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor (see ¶ 10 of 1. Introduction citation and comment as in limitation above.), the computer readable program code comprising:
computer readable program code configured to:
[perform the limitations as in claim 1].

As to independent claim 12, Zhu et al. teaches:
A computer program product for training a machine-learning model used to provide recommendations for costumes for characters within a dynamic visual media (see ¶ 10 of 1. Introduction: “To train our model, we extend the DeepFashion dataset [8] by annotating a subset of 79K upper-body images with sentence descriptions and human body annotations1. Extensive quantitative and qualitative comparisons are performed against existing GAN baselines and 2D nonparametric approaches. […] 1The data and code can be found at  http://mmlab.ie.cuhk.edu.hk/projects/FashionGAN/.”), the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith (see ¶ 10 of 1. Introduction citation and comment as in limitation above.), the computer readable program code executable by a processor comprising:
computer readable program code configured to: 
[perform the limitations as in claim 1].

Regarding claims 2 and 13, Zhu et al. teaches the limitations as in claims 1 and 12, above.
Zhu et al. further teaches:
wherein the segmenting comprises identifying, utilizing at least one scene segmentation technique to enclose semantic boundaries, a segment by identifying a portion of the dynamic visual media script having a costume and corresponding context that are consistent throughout the portion (see ¶ 1 of 3.2. Segmentation Map Generation (Gshape): “Our first generator Gshape aims to generate the semantic segmentation map S˜ by conditioning on the spatial constraint ↓m(S0) [i.e., semantic boundaries], the design coding d ∈ R D, and the Gaussian noise zS ∈ R 100. We now provide more details about this model. To be specific, assume that the original image is of height m and width n, i.e., I0 ∈ R m×n×3 . We represent the segmentation map S0 of the original image using a pixel-wise one-hot encoding, i.e., S0 ∈ {0, 1} m×n×L, where L is the total number of labels. In our implementation, we use L = 7 corresponding to background, hair, face, upper-clothes, pants/shorts, legs, and arms [i.e., segment identifying portion of dynamic visual media script having a costume and corresponding context].”).
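
For illustration only, the pixel-wise one-hot encoding described in the quoted passage can be sketched as follows. The label order follows the seven classes listed in the quotation; the function name `one_hot_segmentation` and the toy 2×2 label map are hypothetical and are not taken from Zhu et al.

```python
# Hypothetical illustration of a pixel-wise one-hot segmentation map.
LABELS = ["background", "hair", "face", "upper-clothes",
          "pants/shorts", "legs", "arms"]  # L = 7 labels

def one_hot_segmentation(label_map):
    """Convert an m x n map of integer labels into an m x n x L one-hot map."""
    num_labels = len(LABELS)
    encoded = []
    for row in label_map:
        encoded_row = []
        for label in row:
            if not 0 <= label < num_labels:
                raise ValueError(f"label {label} is out of range")
            pixel = [0] * num_labels
            pixel[label] = 1  # exactly one active label per pixel
            encoded_row.append(pixel)
        encoded.append(encoded_row)
    return encoded

toy_map = [[0, 1],   # background, hair
           [3, 6]]   # upper-clothes, arms
s0 = one_hot_segmentation(toy_map)
```

In this sketch, the entry at index 3 marks the pixels labeled upper-clothes, i.e., the portion of the image the rejection reads as containing the costume.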

Regarding claims 6 and 17, Zhu et al. teaches the limitations as in claims 1 and 12, above.
Zhu et al. further teaches:
comprising receiving a dynamic visual script for recommendation of at least one costume (see Figure 1: “Given an original wearer’s input photo (left) and different textual descriptions (second column) [i.e., dynamic visual script], our model generates new outfits onto the photograph (right three columns) [i.e., recommendation of at least one costume] while preserving the pose and body shape of the wearer.”).

Regarding claims 7 and 18, Zhu et al. teaches the limitations as in claims 6 and 12, above.
Zhu et al. further teaches:
comprising providing at least one recommendation for a costume within the received dynamic visual script, by utilizing the trained machine-learning model (see Figure 1: “Given an original wearer’s input photo (left) and different textual descriptions (second column) [i.e., dynamic visual script], our model generates new outfits onto the photograph (right three columns) while preserving the pose and body shape of the wearer [i.e., recommendation of at least one costume]” and ¶ 2 of 3.1. Overview of FashionGAN: “Our method requires training data in order to learn [i.e., machine-learning model] the mapping from one photo to the other given the description.”).


Regarding claim 8, Zhu et al. teaches the limitations as in claim 1, above.
Zhu et al. further teaches:
wherein the providing at least one recommendation comprises:
segmenting the received dynamic visual script (see ¶ 3 of 3.5. Implementation Details and Dataset citation as in claim 1, above.);
generating a character fingerprint for each character within the received dynamic visual script (see ¶ 4 of 3.1 Overview of FashionGAN citation as in claim 1, above.); and
producing a scene vector for each scene within the received dynamic visual script (see  ¶ 4 of 3.1 Overview of FashionGAN citation as in claim 1, above.).

Regarding claims 9 and 19, Zhu et al. teaches the limitations as in claims 8 and 18, above.
Zhu et al. further teaches:
wherein the providing at least one recommendation comprises applying a generative adversarial network to each scene vector of the received dynamic visual script against the scene vectors of the machine-learning model, thereby generating one or more costumes for the dynamic visual media scene (see Figure 2: “Proposed framework. Given an input photograph of a person and a sentence description of a new desired outfit, our model first generates a segmentation map S˜ using the generator from the first GAN [i.e., GAN applied to scene vector of received dynamic visual script]. We then render the new image with another GAN, with the guidance from the segmentation map generated in the previous step [i.e., GAN scene vector of machine-learning model]. At test time, we obtain the final rendered image with a forward pass through the two GAN networks ”).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017. as applied to claims 1 and 12 above, and further in view of Maetz et al. (US 20160117311 A1).

As to independent claim 20, Zhu et al. teaches the limitations as in claim 1, above.
Zhu et al. further teaches:
A method, comprising:
generating a character fingerprint for each character within the received (see ¶ 4 of 3.1 Overview of FashionGAN citation as in claim 1, above.);
producing a scene vector for each scene within the received (see  ¶ 1 and 4 of 3.1 Overview of FashionGAN citation as in claim 1, above.
Here, the created/extracted/ produced vector of binary attributes [i.e., vector identifying topics and character fingerprint]; wherein attributes [i.e., topics]; and additional character/wearer/person characteristics such as, mean RGB values of skin color, as well as the aspect ratio of the person, representing coarse body size [i.e. association of topics with corresponding character (or wearer) (i.e., character fingerprints)] in each photo [i.e., scene].); and
providing at least one recommendation for a costume within the received within the corpus, is (i) segmented into scenes, (ii) has a character fingerprint generated for each character, and (iii) has scene vectors produced for each scene within the (see Figure 1: “Given an original wearer’s input photo (left) and different textual descriptions (second column) [i.e., dynamic visual script], our model generates new outfits onto the photograph (right three columns) while preserving the pose and body shape of the wearer [i.e., recommendation of at least one costume]”  ¶ 2 of 3.1. Overview of FashionGAN: “Our method requires training data in order to learn [i.e., machine-learning model] the mapping from one photo to the other given the description.”, and ¶ 3 of 3.5. Implementation Details and Dataset, and ¶ 4 of 3.1 Overview of FashionGAN citations as in claim 1, above.);
the providing at least one recommendation comprising applying a conditional generative adversarial network to each scene vector of the received (see Figure 2 citation as in claim 9, above and ¶ 1 of 3.4 Training: “… following the typical conditional GAN training procedure.”
Here, the script was interpreted as a dynamic visual script as in claim 1 [i.e., model generating new outfits onto photographs], and the conditional generative adversarial network as the first and second GANs (generative adversarial networks).).
	
As mentioned above, Zhu et al. does not explicitly teach, but Maetz et al. does teach:
receiving a movie script (see ¶ [0030]: “The proposed method and apparatus analyze the different elements that make up (comprise) the story, for example by analyzing the movie script scene by scene.” 
Here, the movie script [i.e., descriptive text/sentences] is segmented into scenes [i.e., photographs].);
segmenting the received movie script into scenes (see ¶ [0030]: “That is, a movie, TV program, script, etc. is segmented into scenes for analysis.”);
Zhu et al. and Maetz et al. are both considered to be analogous to the claimed invention because they are in the same field of endeavor of data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhu et al. to incorporate the teachings of Maetz et al. of receiving a movie script and segmenting the received movie script into scenes, which provides the benefit of allowing for better sharing among creators and communication of the result ([0006] of Maetz et al.).

Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017. as applied to claims 1 and 12 above, and further in view of Brereton et al. (US 20150067569 A1).

Regarding claims 3 and 14, Zhu et al. teaches the limitations as in claims 1 and 12, above.
However, Zhu et al. does not explicitly teach, but Brereton et al. does teach:
comprising generating a topic-relationship graph across the dynamic visual media corpus, wherein the topic-relationship graph represents topics occurring within the plurality of dynamic visual media scripts as nodes and relationships between the topics as edges, wherein the edges are weighted with an occurrence frequency (see ¶ [0050]: “For example, referring to FIG. 5, a block diagram of an example topic map graph [i.e., topic-relationship graph] 500 is shown. When processing the topic map 210, a graph corresponding to the topic map is generated. The graph is defined as a couple (V,E) where V is the set of vertices or nodes (i.e., abstraction of a topic or an occurrence within the topic map) and E is the set of edges (i.e., the association between topics or the relationship between a topic and one or more occurrences) [i.e., edges with occurrences].”).
Zhu et al. and Brereton et al. are both considered to be analogous to the claimed invention because they are in the same field of endeavor of data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhu et al. to incorporate the teachings of Brereton et al. of generating a topic-relationship graph across the dynamic visual media corpus, wherein the topic-relationship graph represents topics occurring within the plurality of dynamic visual media scripts as nodes and relationships between the topics as edges, wherein the edges are weighted with an occurrence frequency, which provides the benefit of allowing a user to obtain information regarding surrounding nodes as well as all other connections of which the node is a part ([0049] of Brereton et al.).
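
For illustration only, a graph of the claimed form (topics as nodes, relationships as edges weighted with an occurrence frequency) can be sketched as follows. This is a minimal sketch under stated assumptions: the relationship between two topics is assumed here to be co-occurrence within the same script, and the function name `build_topic_graph` and the sample topics are hypothetical. It is not code from Brereton et al. or from the application.

```python
from collections import Counter
from itertools import combinations

def build_topic_graph(scripts_topics):
    """Return (V, E): a set of topic nodes and a Counter of weighted edges.

    An edge weight counts the scripts in which both topics occur
    (co-occurrence is an assumed notion of "relationship").
    """
    nodes = set()
    edge_weights = Counter()
    for topics in scripts_topics:
        unique = sorted(set(topics))  # sort so each edge has one canonical key
        nodes.update(unique)
        for topic_a, topic_b in combinations(unique, 2):
            edge_weights[(topic_a, topic_b)] += 1
    return nodes, edge_weights

corpus = [
    ["wedding", "beach", "formal"],
    ["beach", "casual"],
    ["wedding", "formal"],
]
nodes, edges = build_topic_graph(corpus)
```

With this toy corpus, the edge between "formal" and "wedding" carries weight 2 because the two topics occur together in two scripts.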

Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017 in combination with Brereton et al. (US 20150067569 A1) as applied to claims 3 and 14 above, and further in view of Emam et al. (US 20070156748 A1).

Regarding claims 4 and 15, Zhu et al. teaches the limitations as in claims 3 and 14, above.
However, Zhu et al. does not explicitly teach, but Emam et al. does teach:
wherein the producing a scene vector comprises utilizing the topic-relationship graph (see ¶ [0066-67]: “extracting a feature vector from the unstructured data for each identified named entities and relations; [0067] representing said entities and relations in a topic graph wherein nodes represent the entities and edges represent the relations between said entities;”).
Zhu et al. in combination with Brereton et al., and Emam et al., are considered to be analogous to the claimed invention because they are in the same field of endeavor of data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhu et al. in combination with Brereton et al. to incorporate the teachings of Emam et al., wherein the producing a scene vector comprises utilizing the topic-relationship graph, which provides the benefit of allowing the user to configure an automatic digital content generator to generate electronic contents according to the form and language of its choice ([0075] of Emam et al.).

Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017. as applied to claims 1 and 12 above, and further in view of Lau (US 20120303643 A1).

Regarding claims 5 and 16, Zhu et al. teaches the limitations as in claims 1 and 12, above.
However, Zhu et al. does not explicitly teach, but Lau does teach:
comprising generating a scene-level character fingerprint by applying a time window corresponding to a scene to the dynamic visual media script, wherein the generating is carried out for the applied time window (see ¶ [0020 and 0083]: “[0020] In a variation of process 100, instead of text aligning (104) multiple metadata sources directly, process 100 can text align to one or more time-alignments and use the time-alignments to align the metadata sources. [0083] …The scene description, the division into scenes, and the characters are derived from the script. Descriptors and caption are taken from the closed-caption file (along with timestamps [i.e., time window] modified as described below).”).
Zhu et al. and Lau are both considered to be analogous to the claimed invention because they are in the same field of endeavor of data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhu et al. to incorporate the teachings of Lau of generating a scene-level character fingerprint by applying a time window corresponding to a scene to the dynamic visual media script, wherein the generating is carried out for the applied time window, which provides the benefit of correlating variations of the same underlying piece of content and then using the correlation to merge the multiple sets of metadata into one multi-track set ([0022] of Lau).
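
For illustration only, applying a time window corresponding to a scene to timestamped script entries can be sketched as follows. This is a minimal sketch under stated assumptions: the entry layout (timestamp in seconds, character, topic), the half-open window, the function name `scene_level_fingerprint`, and the sample data are all hypothetical and are not taken from Lau or from the application.

```python
from collections import defaultdict

def scene_level_fingerprint(entries, window_start, window_end):
    """Map each character to the topics seen inside [window_start, window_end)."""
    fingerprint = defaultdict(set)
    for timestamp, character, topic in entries:
        # Keep only entries whose timestamp falls inside the scene's window.
        if window_start <= timestamp < window_end:
            fingerprint[character].add(topic)
    return dict(fingerprint)

entries = [
    (5.0, "Ana", "formal"),
    (12.0, "Ben", "casual"),
    (40.0, "Ana", "beach"),
    (65.0, "Ben", "formal"),  # falls outside the first scene's window
]
scene = scene_level_fingerprint(entries, 0.0, 60.0)
```

In this sketch, the entry at 65.0 seconds is excluded, so the fingerprint reflects only what occurs within the scene's time window.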

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. "Be your own prada: Fashion synthesis with structural coherence." Proceedings of the IEEE international conference on computer vision. 2017. as applied to claim 1 above, and further in view of Trim et al. (US 20200097502 A1).

Regarding claim 10, Zhu et al. teaches the limitations as in claim 1, above.
Zhu et al. further teaches: 
wherein the generating and the producing are based upon the textual script (see ¶ 4 of 3.1 Overview of FashionGAN, cited as in the limitation above: vector of binary attributes [i.e., vector identifying topics and character fingerprint]; [i.e., textual script: language description]).
However, Zhu et al. does not explicitly teach, but Trim et al. does teach:
comprising generating a textual script for each scene corresponding to a dialogue included in a corresponding scene (see ¶ [0068]: “The NLP component 524 may perform a linguistic analysis that analyzes verbal communications in the video segment 510 (e.g., what is being said in a given scene of a video). In one aspect, one or more artificial intelligence (“AI”) or NLP instances may be provided such as, for example, Speech-To-Text operation to create a transcript of spoken content, and Natural Language Understanding to classify the topics, tone, and emotional state of the spoken content.”).
Zhu et al. and Trim et al. are both considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhu et al. to incorporate the teachings of Trim et al. of generating a textual script for each scene corresponding to a dialogue included in a corresponding scene, which provides the benefit of analyzing verbal communications in the video segment ([0068] of Trim et al.).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Keisha Y Castillo-Torres whose telephone number is (571)272-3975. The examiner can normally be reached Monday - Friday, 9:00 am - 4:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571)272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Keisha Y. Castillo-Torres
Examiner
Art Unit 2659



/Keisha Y. Castillo-Torres/Examiner, Art Unit 2659

/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659