Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding Claim 1,
Step 1 Analysis: Claim 1 is directed to a machine, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 1 recites, in part, “extract objects from at least one image and/or video; analyze information related to the objects extracted from the at least one image and/or video; select a style of a story to be generated based on the extracted objects and the analyzed information; generate the story by using the extracted objects and the analyzed information; and change an object included in the generated story into the selected style.” These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. In particular:
extract objects from at least one image and/or video can be considered to be an observation in the human mind. 
analyze information related to the objects extracted from the at least one image and/or video can be considered to be an observation in the human mind. 
select a style of a story to be generated based on the extracted objects and the analyzed information can be considered to be an evaluation done in the human mind.  
generate the story by using the extracted objects and the analyzed information can be considered to be an evaluation done in the human mind.  
and change an object included in the generated story into the selected style can be considered to be an evaluation done in the human mind. 
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “an electronic device” and “a processor”. These elements are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of generating the story by using the extracted objects and the analyzed information) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Please see MPEP § 2106.04(a)(2)(III)(C). The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing an electronic device and a processor to perform the steps of the claimed process amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Please see MPEP §2106.05(b). The claim is not patent eligible.
Regarding dependent claims 2-8, they do not overcome the deficiencies of the rejected independent claim 1 and they are also rejected. 
Regarding claim 9,
Step 1 Analysis: Claim 9 is directed to a machine, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 9 recites, in part, “extract objects from at least one image and/or video; analyze information related to the objects extracted from the at least one image and/or video; select a style of a story to be generated based on the extracted objects and the analyzed information; change the extracted objects and the analyzed information into the selected style; and generate the story by using the changed objects and information.” These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. In particular:
extract objects from at least one image and/or video can be considered to be an observation in the human mind. 
analyze information related to the objects extracted from the at least one image and/or video can be considered to be an observation in the human mind. 
select a style of a story to be generated based on the extracted objects and the analyzed information can be considered to be an evaluation done in the human mind.  
and change the extracted objects and the analyzed information into the selected style can be considered to be an evaluation done in the human mind. 
generate the story by using the changed objects and the information can be considered to be an evaluation done in the human mind.  
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “an electronic device” and “a processor”. These elements are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of generating the story by using the extracted objects and the analyzed information) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Please see MPEP § 2106.04(a)(2)(III)(C). The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing an electronic device and a processor to perform the steps of the claimed process amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Please see MPEP §2106.05(b). The claim is not patent eligible.
Regarding dependent claim 10, it does not overcome the deficiencies of rejected independent claim 9 and is also rejected.
Regarding Claim 11,
Step 1 Analysis: Claim 11 is directed to a method/process, which falls within one of the four statutory categories.
Step 2A Prong 1 Analysis: Claim 11 recites, in part, “extracting objects from at least one image and/or video; analyzing information related to the objects extracted from the at least one image and/or video; selecting a style of a story to be generated based on the extracted objects and the analyzed information; generating the story by using the extracted objects and the analyzed information; and changing an object included in the generated story into the selected style.” These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. In particular:
extracting objects from at least one image and/or video can be considered to be an observation in the human mind. 
analyzing information related to the objects extracted from the at least one image and/or video can be considered to be an observation in the human mind. 
selecting a style of a story to be generated based on the extracted objects and the analyzed information can be considered to be an evaluation done in the human mind.  
generating the story by using the extracted objects and the analyzed information can be considered to be an evaluation done in the human mind.  
and changing an object included in the generated story into the selected style can be considered to be an evaluation done in the human mind.
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element “an electronic device”. This element is recited at a high level of generality (i.e., as a generic electronic device performing a generic computer function of generating the story by using the extracted objects and the analyzed information) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Please see MPEP § 2106.04(a)(2)(III)(C). The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of utilizing an electronic device to perform the steps of the claimed process amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Please see MPEP §2106.05(b). The claim is not patent eligible.
Regarding dependent claims 12-18, they do not overcome the deficiencies of rejected independent claim 11 and are also rejected.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3, 4, 6, 8-11, 13, 14, 16, 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kevin Allekotte et al. (“US 2017/0115853 A1” hereinafter as “Allekotte”).
	Regarding claim 1, Allekotte discloses an electronic device comprising a processor ([0039] discloses a system that includes one or more devices, such as a computer, laptop, or tablet, with one or more processors), wherein the processor is configured to ([0040] discloses that the processor can execute the instructions to perform the operations): extract objects from at least one image and/or video ([0023] discloses an image recognition step used to identify one or more items or objects depicted in an image); analyze information related to the objects extracted from the at least one image and/or video ([0023] further discloses information obtained from metadata related to the image; moreover, [0015] discloses that metadata can include information associated with the image, such as location data and a description of the content or context of the image, which can be understood as the information related to the objects as claimed; [0023] further discloses that an inferred image tag is determined at least in part from the metadata, which indicates analyzing of information as claimed; therefore, the metadata information can be understood as the analyzed information); select a style of a story to be generated based on the extracted objects and the analyzed information ([0023] discloses an image tag obtained from the metadata, and the metadata obtained includes the data from the image recognition step, which includes the extracted objects as discussed previously; moreover, [0016] discloses that one or more image tags are determined, which include broad descriptors such as “food” and “drink”; furthermore, [0021] discloses that the user can select one of the image tags and an image caption can be generated based on the selected image tag; FIG. 1 shows the image, the different image tags, and the generated image caption; therefore, an image tag can be understood as the style as claimed and the caption can be understood as the story as claimed; in an embodiment, [0022], the caption can be generated automatically by using inferred tags/candidate tags with confidence values, so selection of a tag can be done automatically without human intervention at this step); generate the story by using the extracted objects and the analyzed information ([0023] discloses that the image caption is generated based on the information from the metadata and the detected objects); and change an object included in the generated story into the selected style (the disclosed invention teaches that changing an object means changing a word indicating an object in the story sentence, according to page 14, third-to-last paragraph of the application’s specification; this is taught in Allekotte’s [0027], which discloses that when the user selects a new image tag, a new image caption is generated to add or replace the objects related to the image tags in the new caption, which follows a corresponding caption template [0019, 0024] where the blank spaces of the template can be changed, such as “eating ____ at ____” where each of the blank spaces “____” can be changed; a multiple-word description of an image can be understood as a story under BRI).
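For illustration only, the template-based caption generation relied on above can be sketched as follows. This is a minimal sketch and is not code from Allekotte or from the application: the function name `fill_caption_template`, the `{object}` and `{place}` slot names, and the tag value “ramen” are hypothetical, chosen only to mirror the “eating ____ at ____” template cited from Allekotte.

```python
def fill_caption_template(template: str, tags: dict[str, str]) -> str:
    """Fill each named blank in a caption template with the selected tag word."""
    return template.format(**tags)


# Hypothetical template with two blanks, mirroring "eating ____ at ____".
TEMPLATE = "eating {object} at {place}"

# Caption generated from an initial tag selection.
first = fill_caption_template(TEMPLATE, {"object": "sushi", "place": "the sushi bar"})

# Selecting a different tag replaces the object word in the regenerated caption.
# "ramen" is an invented value used only for this illustration.
second = fill_caption_template(TEMPLATE, {"object": "ramen", "place": "the sushi bar"})
```

Under these assumptions, changing the selected tag changes only the word that fills the corresponding blank, which is the sense in which the rejection reads “change an object included in the generated story” on regenerating a caption from a template.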
	Regarding claim 3, Allekotte discloses the electronic device of claim 1, wherein the processor is further configured to: add a new object in addition to the extracted objects (FIGS. 2 and 3 show the same image being processed, where additional tags are generated and more than one tag can be selected, and the generated caption changes accordingly, as shown by items 104 and 110 of FIG. 3); and generate a story by considering the new object (FIG. 3 shows that the new object “sushi” is added into the caption along with the first object “sushi bar”, so the caption is changed from “relaxing at the sushi bar” in FIG. 2 to “eating sushi at the sushi bar” when additional tags are selected; therefore, this covers the instance in which the first caption “relaxing at the sushi bar” is generated and additional tags are then selected; selecting additional tags indicates pulling additional items stored in the metadata, as discussed above for claim 1, and the additional items/objects are added into the new caption “eating sushi at the sushi bar”, where “sushi” is the added item/object).
	Regarding claim 4, Allekotte discloses the electronic device of claim 1, wherein the processor is further configured to generate a story by considering information on applications related to the objects ([0030] discloses that metadata can include information such as ownership data, copyright information, and location data [GPS data, etc.]; this information is understood as being obtained from other applications, such as social media applications, a GPS application, or the device’s default information library, as claimed).
	Regarding claim 6, Allekotte discloses the electronic device of claim 1, wherein the processor is further configured to: analyze at least one of a scene or a context of the at least one image and/or video ([0019], lines 6-10, discloses that the system generates different caption templates prior to generation of the caption, the templates being associated with different activities or scenes based on the image recognition data obtained from the image); and generate a story by considering information related to the analyzed scene ([0019], lines 15-17, discloses that the caption templates are used to generate the caption by specifying the image tag to be inserted into the template’s blank spaces).
	Regarding claim 8, Allekotte discloses the electronic device of claim 1, wherein the processor is further configured to generate a story by using at least some of plural sentences stored in a database ([0035] discloses that the caption can be generated by selecting an image caption template from the one or more determined caption templates, which indicates that different templates are stored and thus that a plurality of sentences is stored as claimed, which covers the instance in which a database stores the caption templates; moreover, [0037] discloses that information such as the image, the metadata, the image recognition data, the selected tag(s), and the generated caption is stored in a database).
	Regarding claim 9, Allekotte discloses an electronic device comprising a processor ([0039] discloses a system that includes one or more devices, such as a computer, laptop, or tablet, with one or more processors), wherein the processor is configured to ([0040] discloses that the processor can execute the instructions to perform the operations): extract objects from at least one image and/or video ([0023] discloses an image recognition step used to identify one or more items or objects depicted in an image); analyze information related to the objects extracted from the at least one image and/or video ([0023] further discloses information obtained from metadata related to the image; moreover, [0015] discloses that metadata can include information associated with the image, such as location data and a description of the content or context of the image, which can be understood as the information related to the objects as claimed; [0023] further discloses that an inferred image tag is determined at least in part from the metadata, which indicates analyzing of information as claimed; therefore, the metadata information can be understood as the analyzed information); select a style of a story to be generated based on the extracted objects and the analyzed information ([0023] discloses an image tag obtained from the metadata; moreover, [0016] discloses that one or more image tags are determined, which include broad descriptors such as “food” and “drink”; furthermore, [0021] discloses that the user can select one of the image tags and an image caption can be generated based on the selected image tag; FIG. 1 shows the image, the different image tags, and the generated image caption; therefore, an image tag can be understood as the style as claimed and the caption can be understood as the story as claimed; in an embodiment, [0022], the caption can be generated automatically by using inferred tags/candidate tags with confidence values, so selection of a tag can be done automatically without human intervention at this step); change the extracted objects and the analyzed information into the selected style (the disclosed invention teaches that changing an object means changing a word indicating an object in the story sentence, according to page 14, third-to-last paragraph of the application’s specification; this is taught in Allekotte’s [0027], which discloses that when the user selects a new image tag, a new image caption is generated to add or replace the objects related to the image tags in the new caption, which follows a corresponding caption template [0019, 0024] where the blank spaces of the template can be changed, such as “eating ____ at ____” where “____” can be changed; moreover, element 106 of FIG. 2 and FIG. 3 shows the activity “relaxing” in FIG. 2 being changed to “eating” in FIG. 3 based on the selected tags 104 and 110, which indicates a change in the analyzed information according to the activity being detected ([0019], lines 6-10, discloses that caption templates can be associated with different activities or scenes based on the image recognition data associated with the image)); and generate the story by using the changed objects and information ([0023] discloses that the image caption is generated based on the information from the metadata and the detected objects; moreover, FIG. 3 shows the new caption being generated based on additional tag 110; a multiple-word description of an image can be understood as a story under BRI).
	Regarding claim 10, Allekotte discloses the electronic device of claim 9, wherein the processor is further configured to generate a story by using at least some of plural sentences stored in a database ([0035] discloses that the caption can be generated by selecting an image caption template from the one or more determined caption templates, which indicates that different templates are stored and thus that a plurality of sentences is stored as claimed, which covers the instance in which a database stores the caption templates; moreover, [0037] discloses that information such as the image, the metadata, the image recognition data, the selected tag(s), and the generated caption is stored in a database).
	Regarding claim 11, Allekotte discloses a method of story generation for an electronic device, the method comprising: extracting objects from at least one image and/or video ([0023] discloses an image recognition step used to identify one or more items or objects depicted in an image); analyzing information related to the objects extracted from the at least one image and/or video ([0023] further discloses information obtained from metadata related to the image; moreover, [0015] discloses that metadata can include information associated with the image, such as location data and a description of the content or context of the image, which can be understood as the information related to the objects as claimed; [0023] further discloses that an inferred image tag is determined at least in part from the metadata, which indicates analyzing of information as claimed; therefore, the metadata information can be understood as the analyzed information); selecting a style of a story to be generated based on the extracted objects and the analyzed information ([0023] discloses an image tag obtained from the metadata; moreover, [0016] discloses that one or more image tags are determined, which include broad descriptors such as “food” and “drink”; furthermore, [0021] discloses that the user can select one of the image tags and an image caption can be generated based on the selected image tag; FIG. 1 shows the image, the different image tags, and the generated image caption; therefore, an image tag can be understood as the style as claimed and the caption can be understood as the story as claimed; in an embodiment, [0022], the caption can be generated automatically by using inferred tags/candidate tags with confidence values, so selection of a tag can be done automatically without human intervention at this step); generating the story by using the extracted objects and the analyzed information ([0023] discloses that the image caption is generated based on the information from the metadata and the detected objects); and changing an object included in the generated story into the selected style (the disclosed invention teaches that changing an object means changing a word indicating an object in the story sentence, according to page 14, third-to-last paragraph of the application’s specification; this is taught in Allekotte’s [0027], which discloses that when the user selects a new image tag, a new image caption is generated to add or replace the objects related to the image tags in the new caption, which follows a corresponding caption template [0019, 0024] where the blank spaces of the template can be changed, such as “eating ____ at ____” where each of the blank spaces “____” can be changed; a multiple-word description of an image can be understood as a story under BRI).
	Regarding claim 13, Allekotte discloses the method of claim 11, further comprising adding a new object in addition to the extracted objects (FIGS. 2 and 3 show the same image being processed, where additional tags are generated and more than one tag can be selected, and the generated caption changes accordingly, as shown by items 104 and 110 of FIG. 3); and generating a story by considering the new object (FIG. 3 shows that the new object “sushi” is added into the caption along with the first object “sushi bar”, so the caption is changed from “relaxing at the sushi bar” in FIG. 2 to “eating sushi at the sushi bar” when additional tags are selected; therefore, this covers the instance in which the first caption “relaxing at the sushi bar” is generated and additional tags are then selected; selecting additional tags indicates pulling additional items stored in the metadata, as discussed above for claim 1, and the additional items/objects are added into the new caption “eating sushi at the sushi bar”, where “sushi” is the added item/object).
	Regarding claim 14, Allekotte discloses the method of claim 11, further comprising generating a story by considering information on applications related to the objects ([0030] discloses that metadata can include information such as ownership data, copyright information, and location data [GPS data, etc.]; this information is understood as being obtained from other applications, such as social media applications, a GPS application, or the device’s default information library, as claimed).
	Regarding claim 16, Allekotte discloses the method of claim 11, further comprising analyzing at least one of a scene or a context of the at least one image and/or video ([0019], lines 6-10, discloses that the system generates different caption templates prior to generation of the caption, the templates being associated with different activities or scenes based on the image recognition data obtained from the image); and generating a story by considering information related to the analyzed scene ([0019], lines 15-17, discloses that the caption template is used to generate the caption by specifying the image tag to be inserted into the template’s blank spaces).
	Regarding claim 18, Allekotte discloses the method of claim 11, further comprising generating a story by using at least some of plural sentences stored in a database ([0035] discloses that the caption can be generated by selecting an image caption template from the one or more determined caption templates, which indicates that different templates are stored and thus that a plurality of sentences is stored as claimed, which covers the instance in which a database stores the caption templates; moreover, [0037] discloses that information such as the image, the metadata, the image recognition data, the selected tag(s), and the generated caption is stored in a database).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2, 5, 12, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kevin Allekotte et al. (“US 2017/0115853 A1” hereinafter as “Allekotte”) in view of Harish Kasina (“US 2018/0357211 A1” hereinafter as “Kasina”).
Regarding claim 2, Allekotte discloses the electronic device of claim 1, wherein the processor is further configured to: assign weights to the analyzed information ([0022] discloses the determined image tags may include inferred/candidate tags which have associated confidence values; moreover, [0023] discloses the image tags are determined from the metadata, therefore the inferred/candidate tags can be understood as analyzed information from the metadata and the confidence values can be understood as the weights as claimed); and generate a story by considering the assigned weights ([0022] discloses generating image caption based on the inferred/candidate tags).
However, Allekotte does not explicitly disclose wherein the processor is further configured to: assign weights to the extracted objects before generation of a story considering the assigned weights.
In the same field of image caption/description generation (title, Kasina), Kasina discloses assigning weights to the extracted objects before generation of a story considering the assigned weights ([0092] discloses that a weight is assigned to a word by assigning a weight to the word’s entropy feature; the weight-assigned entropy feature is used to determine the probability of the word being a good matching word for generating the caption, based on determining whether the word has a high probability of occurrence and has not yet been used in the caption; this is done by comparing the candidate word with words in an attribute set first obtained from the image; moreover, [0090] discloses that the candidates are obtained from an image through an image-to-word classification; [0052] discloses that the words in the attribute set are words indicating objects, such as landmarks according to [0052], line 8, which means the candidate words are also objects since they are compared with words of the attribute set; therefore, this covers the instance in which each word indicates an object extracted from an image and each word has a weight assigned to it as claimed).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s device to perform object extraction from an image and analyze information associated with the objects, and assign weights to the extracted objects and the analyzed information to generate an image caption as taught by Kasina to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to add additional words into the caption that have not yet been mentioned in the caption based on assigned weights ([0091], lines 3-7, Kasina).
Regarding claim 5, Allekotte discloses the electronic device of claim 1; however, Allekotte does not explicitly disclose wherein the processor is further configured to analyze at least one of a positional relationship, an emotional relationship, or a subjective relationship between the extracted objects.
In the same field of image caption/description generation (title, Kasina), Kasina discloses analyzing at least one of a positional relationship, an emotional relationship, or a subjective relationship between the extracted objects ([0052], the album processing component further determines attributes associated with each input image; the attributes can be emotions of the people in the image and relationships of those people in the image; furthermore, [0053] discloses that the knowledge lookup component uses the attribute information to obtain preliminary narrative information, which indicates a step of analyzing the relationships between people as claimed).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s device to perform object extraction from an image and analyze the emotional or subjective relationship between the extracted objects as taught by Kasina to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to obtain preliminary narrative information ([0053], lines 103, Kasina) and, based on the preliminary narrative information, to create a cohesive album narrative ([0055], lines 1-2, Kasina).
Regarding claim 12, Allekotte discloses the method of claim 11 further comprising assigning weights to the analyzed information ([0022] discloses the determined image tags may include inferred/candidate tags which have associated confidence values; moreover, [0023] discloses the image tags are determined from the metadata, therefore the inferred/candidate tags can be understood as analyzed information from the metadata and the confidence values can be understood as the weights as claimed); and generating a story by considering the assigned weights ([0022] discloses generating an image caption based on the inferred/candidate tags).
However, Allekotte does not explicitly disclose wherein the processor is further configured to: assign weights to the extracted objects before generation of a story considering the assigned weights.
In the same field of image caption/description generation (title, Kasina), Kasina discloses assigning weights to the extracted objects before generation of a story considering the assigned weights ([0092] discloses that a weight is assigned to a word by assigning a weight to the word’s entropy feature; the weight-assigned entropy feature is used to determine the probability of the word being a good matching word for generating the caption, based on determining whether the word has a high probability of occurrence and has not yet been used in the caption, which is done by comparing the candidate word with words in an attribute set first obtained from the image; moreover, [0090] discloses that the candidates are obtained from an image through an image-to-word classification; [0052] discloses that the words in the attribute set are words indicative of objects, such as landmarks according to [0052, line 8], which means the candidate words are also objects since they are compared with words of the attribute set; therefore, this covers the instances in which each word indicates an object extracted from an image and each word has a weight assigned to it as claimed).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s method to perform object extraction from an image and analyze information associated with the objects, and assign weights to the extracted objects and the analyzed information to generate an image caption as taught by Kasina to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to add additional words into the caption that have not yet been mentioned in the caption based on assigned weights ([0091], lines 3-7, Kasina).
Regarding claim 15, Allekotte discloses the method of claim 1; however, Allekotte does not explicitly disclose the method further comprising analyzing at least one of a positional relationship, an emotional relationship, or a subjective relationship between the extracted objects.
In the same field of image caption/description generation (title, Kasina), Kasina discloses analyzing at least one of a positional relationship, an emotional relationship, or a subjective relationship between the extracted objects ([0052], the album processing component further determines attributes associated with each input image; the attributes can be emotions of the people in the image and relationships of those people in the image; furthermore, [0053] discloses that the knowledge lookup component uses the attribute information to obtain preliminary narrative information, which indicates a step of analyzing the relationships between people as claimed).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s method to perform object extraction from an image and analyze the emotional or subjective relationship between the extracted objects as taught by Kasina to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to obtain preliminary narrative information ([0053], lines 103, Kasina) and, based on the preliminary narrative information, to create a cohesive album narrative ([0055], lines 1-2, Kasina).
Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Kevin Allekotte et al. (“US 2017/0115853 A1” hereinafter as “Allekotte”) in view of Kim Chang Hyun et al. (foreign patent document “KR 10-2018-0059030” hereinafter as “Hyun”).
Regarding claim 7, Allekotte discloses the electronic device of claim 1 of extracting objects and generating a story based on the extracted objects in a target as discussed previously in claim 1, where a target can be understood as the image or a video including sound since the objects are said to be extracted from it based on the claim language; however, Allekotte does not explicitly disclose that, in case that a video is included in the target, the processor is further configured to: analyze a sound included in the video; and generate the story by considering information related to the analyzed sound.
In the same field of caption generation (title, Hyun), Hyun discloses in case that a video is included in the target ([0009] discloses the invention is for a multimedia image including a video and a voice) the processor is further configured to: analyze a sound included in the video ([0009] discloses that the invention analyzes the image, recognizes objects from the image, and, from the extracted objects, focuses on a sentence/caption object of the image, and the invention also extracts a caption sentence based on speech through speech recognition; moreover, [0033] discloses that a caption/voice synchronization unit is used to generate caption data, such as by converting speech into words, which indicates an analyzing of sound as claimed); and generate a story by considering information related to the analyzed sound ([0033] discloses that the caption/voice synchronization unit discussed previously is used to generate caption data using the information obtained from the voice recognition unit; moreover, [0009] as discussed previously discloses that the invention generates a caption sentence based on the voice recognition; the invention synchronizes the audio with the subtitle by matching the subtitle or caption of the image with the caption sentence generated through voice recognition).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s device to perform object extraction and analyze sound information in a target from which the objects are extracted and generate the caption sentence based on the analyzed sound information as taught by Hyun to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to intelligently determine an optimal subtitle output environment, constitute a subtitle, and synchronize the audio and the image video ([0006] and [0009], Hyun).
Regarding claim 17, Allekotte discloses the method of claim 11 of extracting objects and generating a story based on the extracted objects in a target as discussed previously in claim 1, where a target can be understood as the image or a video including sound since the objects are said to be extracted from it based on the claim language; however, Allekotte does not explicitly disclose that, in case that a video is included in the target, the processor is further configured to: analyze a sound included in the video; and generate the story by considering information related to the analyzed sound.
In the same field of caption generation (title, Hyun), Hyun discloses in case that a video is included in the target ([0009] discloses the invention is for a multimedia image including a video and a voice) the processor is further configured to: analyze a sound included in the video ([0009] discloses that the invention analyzes the image, recognizes objects from the image, and, from the extracted objects, focuses on a sentence/caption object of the image, and the invention also extracts a caption sentence based on speech through speech recognition; moreover, [0033] discloses that a caption/voice synchronization unit is used to generate caption data, such as by converting speech into words, which indicates an analyzing of sound as claimed); and generate a story by considering information related to the analyzed sound ([0033] discloses that the caption/voice synchronization unit discussed previously is used to generate caption data using the information obtained from the voice recognition unit; moreover, [0009] as discussed previously discloses that the invention generates a caption sentence based on the voice recognition; the invention synchronizes the audio with the subtitle by matching the subtitle or caption of the image with the caption sentence generated through voice recognition).
Thus, it would have been obvious for a person of ordinary skill in the art before the effective filing date to modify Allekotte’s method to perform object extraction and analyze sound information in a target from which the objects are extracted and generate the caption sentence based on the analyzed sound information as taught by Hyun to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to intelligently determine an optimal subtitle output environment, constitute a subtitle, and synchronize the audio and the image video ([0006] and [0009], Hyun).
Pertinent Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US2019200050A1: for social media and texting purposes, extracts objects and information, changes the context based on the user's language on social media, and generates a caption or story.
KR 20120111855 A: generates a story based on user information (log information) collected from at least one electronic device.
US11386292: uses an attention map to change focus or attention on particular parts/latent space of an image and generates captions based on the attended parts.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUONG HAU CAI whose telephone number is (571) 272-9424. The examiner can normally be reached M-F 8:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire X. Wang, can be reached at (571) 270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/PHUONG HAU CAI/
Examiner, Art Unit 2663   

/CLAIRE X WANG/
Supervisory Patent Examiner, Art Unit 2663