Notice of Pre-AIA or AIA Status
DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to communications filed May 31, 2022.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on May 31, 2022 has been entered.

Response to Arguments
Applicant's arguments filed May 31, 2022 regarding the rejection of claims 1-20 under 35 U.S.C. 103 have been fully considered but are moot in view of the new grounds of rejection.

Status of Claims
Claims 1-20 are pending, of which claims 1, 8, and 15 are in independent form. Claims 1-20 are rejected under 35 U.S.C. 103.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kale et al. (US 2021/0034657) (hereinafter Kale) in view of Chavez et al. (US 10,685,057) (hereinafter Chavez), and in further view of Nguyen (US 10,437,833) (hereinafter Nguyen).
Regarding claim 1, Kale teaches a computer-implemented method for determining and arranging images to include in a visual representation, the method comprising: receiving a textual statement (Kale; For example, as shown in FIG. 4, the digital content contextual tagging system 106 can receive a search query in an interface 402 (e.g., "person with animal on white shirt") (receiving a textual statement). Additionally, as illustrated in FIG. 4, the digital content contextual tagging system 106 can provide digital images as search results in response to the search query in the interface 402. Indeed, as shown in FIG. 4, the digital content contextual tagging system 106 provides digital images that include tags related to one or more terms in the search query. [0058];); identifying a plurality of terms in the textual statement that are to be visualized in the visual representation, (Kale; Furthermore, as used herein, the word "tag" can refer to a description (or information) including one or more terms and/or values. In particular, the word "tag" can refer to a description, that represents an object, scene, attribute, and/or another aspect (e.g., verbs, nouns, adjectives, etc.) portrayed in a digital content item (e.g., a digital image), with terms and/or values (e.g., a keyword). Indeed, the word "tag" can refer to conceptual labels (i.e., textual keywords) (identifying a plurality of terms in the textual statement) used to describe image attributes. As an example, a tag can include text within metadata for a digital media content item. Additionally, a tag can include text from a vocabulary (or dictionary). Moreover, as used herein, the word "tag characteristic" can refer to information indicating one or more attributes of a tag. In particular, the word "tag characteristic" can refer to information such as a tag size, tag complexity, and/or tag language. Furthermore, as used herein, the word "tag size" can refer to the length of a tag. In particular, the word "tag size" can refer to the length of a tag in regard to the number of characters of a tag and/or the number of terms in a tag. [0031]; As previously mentioned, the digital content contextual tagging system 106 utilizes a search query to determine and/or associate multi-term contextual tags to one or more digital images (e.g., the search query in the interface 402 in FIG. 4). Indeed, the digital content contextual tagging system 106 can receive a search query that includes any string of text. Furthermore, the digital content contextual tagging system 106 can utilize a search engine to provide one or more search results based on the string of text of the search query. For instance, the digital content contextual tagging system 106 can provide one or more images that include independent tags and/or multi-term contextual tags that match the search query. [0063]; Furthermore, the digital content contextual tagging system 106 can compare the tags of the digital image with the significant keywords of a search query. For instance, the digital content contextual tagging system 106 can determine, utilizing a natural language processing technique, significant keywords within a search query. Then, the digital content contextual tagging system 106 determines whether a selected digital image includes all of the significant keywords within the search query. Indeed, if the selected digital image does include all of the significant keywords determined in the search query, the digital content contextual tagging system 106 can associate a multi-term contextual tag (from the search query) to the selected digital image. [0070]);
determining a global coherence and a local coherence for each of the sequences based on the tags of the images (Kale; In one or more embodiments, the digital content contextual tagging system determines and associates multi-term contextual tags (and scores) with images using a correspondence between user search queries, tags of the images, and user selections of the images (determining a global coherence) in response to the user search queries. Furthermore, in some embodiments, the digital content contextual tagging system identifies additional images based on similarities with the images associated with the multi-term contextual tags (e.g., using a k-nearest neighbor algorithm to cluster the digital images (local coherence for each of the sequences based on the tags of the images)). Then, in one or more embodiments, the digital content contextual tagging system propagates the multi-term contextual tags to the additional images based on a combination of tag scores and image similarity scores. Moreover, in some embodiments, the digital content contextual tagging system provides images that are associated with multi-term contextual tags in response to receiving search queries that include such multi-term contextual tags. [0022];); 
selecting one of the sequences based on the global coherence and the local coherence (Kale; In one or more embodiments, the digital content contextual tagging system 106 utilizes a variety of methods to compare the one or more image descriptors to identify the similar image descriptors. For example, the digital content contextual tagging system 106 can utilize methods (or algorithms) such as, but not limited to, k-nearest neighbor algorithm, cosine similarity calculations, other clustering techniques, and/or embedding spaces to compare the one or more image descriptors (based on the global coherence and the local coherence) to identify the similar image descriptors (e.g., to identify a cluster of similar images). For instance, the digital content contextual tagging system 106 can utilize a k-nearest neighbor algorithm to determine distance values (e.g., a Euclidean distance) between image descriptors within a higher dimensional space (e.g., a Euclidean space). Then, the digital content contextual tagging system 106 can utilize a "k" number of image descriptors (e.g., a number selected and/or configured by a neural network (selecting one of the sequences), user of the digital content contextual tagging system 106, and/or the digital content contextual tagging system 106) based on the determined distance values. [0084];). 
Kale does not explicitly teach generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence; the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Chavez teaches generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence (see Figs. 4-5, col. 2 ln 1-11, col. 16 ln 8-26, discloses generating a first and second collection of images associated with predetermined keywords found in received search query and each image is associated with an image identifier, the first and second collections of images are prioritized in a listing of prioritized images that are prioritized by a degree of relevancy to a search query).
Kale and Chavez are analogous art because each is from the same field of endeavor, namely database systems.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the system of Kale to arrange images according to an order, as disclosed by Chavez. The motivation to combine these references is disclosed by Chavez as “reducing the volume of the image search space, and hence decreasing the latency in identifying images with the relevant style class or classes” (Col. 16, lines 30-32). Moreover, arranging images according to an order is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have had good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.
Kale/Chavez do not explicitly teach the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Nguyen teaches the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement (see col. 3 ln 25-34, col. 17 ln 4-15, discloses ordering of terms in an ontology, mapping a sentence to a corresponding query, such as query “pasta sauce goes on pasta” to ontology ordered statement “sauce goes on pasta”, [food:noodle]); the local coherence being determined for each of the sequences in the statement order (see col. 17 ln 4-15, discloses determining local coherence in mapping child or sibling “sauce is put on pho” to parent “sauce goes on pasta” for query “pasta sauce goes on pasta”); and generating the visual representation where the images of the selected sequence are included in the statement sequence (see Fig. 3, col. 5 ln 65-67, col. 14 ln 45-51, discloses generating a pop-up of visualizations of selected sequences included in statement sequence).
Kale, Chavez, and Nguyen are analogous art because each is from the same field of endeavor, namely database systems.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the system of Kale/Chavez to include a statement sequence, as disclosed by Nguyen. The motivation to combine these references is disclosed by Nguyen as “permits better summarization by extracting the relevant portions of text and presenting the relevant section to the user in one or more presentations” (Col. 2, lines 54-56). Moreover, including a statement sequence is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have had good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.
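For illustration only, the k-nearest neighbor comparison of image descriptors quoted from Kale [0084] above may be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of any cited reference or of the application: the descriptor values, the image names, the function names euclidean and k_nearest, and the choice of k = 2 are all hypothetical.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two descriptors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(query, descriptors, k):
    # Rank every stored descriptor by its distance to the query descriptor
    # and keep the names of the k closest ones.
    ranked = sorted(descriptors.items(),
                    key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical two-dimensional descriptors; real descriptors would have
# many more dimensions.
descriptors = {
    "img_a": (0.0, 0.0),
    "img_b": (1.0, 1.0),
    "img_c": (5.0, 5.0),
}

nearest = k_nearest((0.9, 0.9), descriptors, k=2)
print(nearest)  # ['img_b', 'img_a']
```

In this sketch the two descriptors closest to the query form the "cluster of similar images"; the sketch does not model tag scores, coherence values, or sequence selection.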




Regarding claim 8, Kale teaches a product for determining and arranging images to include in a visual representation, the product comprising: receiving a textual statement (Kale; For example, as shown in FIG. 4, the digital content contextual tagging system 106 can receive a search query in an interface 402 (e.g., "person with animal on white shirt") (receiving a textual statement). Additionally, as illustrated in FIG. 4, the digital content contextual tagging system 106 can provide digital images as search results in response to the search query in the interface 402. Indeed, as shown in FIG. 4, the digital content contextual tagging system 106 provides digital images that include tags related to one or more terms in the search query. [0058];); identifying a plurality of terms in the textual statement that are to be visualized in the visual representation, (Kale; Furthermore, as used herein, the word "tag" can refer to a description (or information) including one or more terms and/or values. In particular, the word "tag" can refer to a description, that represents an object, scene, attribute, and/or another aspect (e.g., verbs, nouns, adjectives, etc.) portrayed in a digital content item (e.g., a digital image), with terms and/or values (e.g., a keyword). Indeed, the word "tag" can refer to conceptual labels (i.e., textual keywords) (identifying a plurality of terms in the textual statement) used to describe image attributes. As an example, a tag can include text within metadata for a digital media content item. Additionally, a tag can include text from a vocabulary (or dictionary). Moreover, as used herein, the word "tag characteristic" can refer to information indicating one or more attributes of a tag. In particular, the word "tag characteristic" can refer to information such as a tag size, tag complexity, and/or tag language. Furthermore, as used herein, the word "tag size" can refer to the length of a tag. In particular, the word "tag size" can refer to the length of a tag in regard to the number of characters of a tag and/or the number of terms in a tag. [0031]; As previously mentioned, the digital content contextual tagging system 106 utilizes a search query to determine and/or associate multi-term contextual tags to one or more digital images (e.g., the search query in the interface 402 in FIG. 4). Indeed, the digital content contextual tagging system 106 can receive a search query that includes any string of text. Furthermore, the digital content contextual tagging system 106 can utilize a search engine to provide one or more search results based on the string of text of the search query. For instance, the digital content contextual tagging system 106 can provide one or more images that include independent tags and/or multi-term contextual tags that match the search query. [0063]; Furthermore, the digital content contextual tagging system 106 can compare the tags of the digital image with the significant keywords of a search query. For instance, the digital content contextual tagging system 106 can determine, utilizing a natural language processing technique, significant keywords within a search query. Then, the digital content contextual tagging system 106 determines whether a selected digital image includes all of the significant keywords within the search query. Indeed, if the selected digital image does include all of the significant keywords determined in the search query, the digital content contextual tagging system 106 can associate a multi-term contextual tag (from the search query) to the selected digital image. [0070]);
determining a global coherence and a local coherence for each of the sequences based on the tags of the images (Kale; In one or more embodiments, the digital content contextual tagging system determines and associates multi-term contextual tags (and scores) with images using a correspondence between user search queries, tags of the images, and user selections of the images (determining a global coherence) in response to the user search queries. Furthermore, in some embodiments, the digital content contextual tagging system identifies additional images based on similarities with the images associated with the multi-term contextual tags (e.g., using a k-nearest neighbor algorithm to cluster the digital images (local coherence for each of the sequences based on the tags of the images)). Then, in one or more embodiments, the digital content contextual tagging system propagates the multi-term contextual tags to the additional images based on a combination of tag scores and image similarity scores. Moreover, in some embodiments, the digital content contextual tagging system provides images that are associated with multi-term contextual tags in response to receiving search queries that include such multi-term contextual tags. [0022];); 
selecting one of the sequences based on the global coherence and the local coherence (Kale; In one or more embodiments, the digital content contextual tagging system 106 utilizes a variety of methods to compare the one or more image descriptors to identify the similar image descriptors. For example, the digital content contextual tagging system 106 can utilize methods (or algorithms) such as, but not limited to, k-nearest neighbor algorithm, cosine similarity calculations, other clustering techniques, and/or embedding spaces to compare the one or more image descriptors (based on the global coherence and the local coherence) to identify the similar image descriptors (e.g., to identify a cluster of similar images). For instance, the digital content contextual tagging system 106 can utilize a k-nearest neighbor algorithm to determine distance values (e.g., a Euclidean distance) between image descriptors within a higher dimensional space (e.g., a Euclidean space). Then, the digital content contextual tagging system 106 can utilize a "k" number of image descriptors (e.g., a number selected and/or configured by a neural network (selecting one of the sequences), user of the digital content contextual tagging system 106, and/or the digital content contextual tagging system 106) based on the determined distance values. [0084];). 
Kale does not explicitly teach generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence; the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Chavez teaches generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence (see Figs. 4-5, col. 2 ln 1-11, col. 16 ln 8-26, discloses generating a first and second collection of images associated with predetermined keywords found in received search query and each image is associated with an image identifier, the first and second collections of images are prioritized in a listing of prioritized images that are prioritized by a degree of relevancy to a search query).
Kale and Chavez are analogous art because each is from the same field of endeavor, namely database systems.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the system of Kale to arrange images according to an order, as disclosed by Chavez. The motivation to combine these references is disclosed by Chavez as “reducing the volume of the image search space, and hence decreasing the latency in identifying images with the relevant style class or classes” (Col. 16, lines 30-32). Moreover, arranging images according to an order is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have had good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.
Kale/Chavez do not explicitly teach the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Nguyen teaches the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement (see col. 3 ln 25-34, col. 17 ln 4-15, discloses ordering of terms in an ontology, mapping a sentence to a corresponding query, such as query “pasta sauce goes on pasta” to ontology ordered statement “sauce goes on pasta”, [food:noodle]); the local coherence being determined for each of the sequences in the statement order (see col. 17 ln 4-15, discloses determining local coherence in mapping child or sibling “sauce is put on pho” to parent “sauce goes on pasta” for query “pasta sauce goes on pasta”); and generating the visual representation where the images of the selected sequence are included in the statement sequence (see Fig. 3, col. 5 ln 65-67, col. 14 ln 45-51, discloses generating a pop-up of visualizations of selected sequences included in statement sequence).
Kale, Chavez, and Nguyen are analogous art because each is from the same field of endeavor, namely database systems.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the system of Kale/Chavez to include a statement sequence, as disclosed by Nguyen. The motivation to combine these references is disclosed by Nguyen as “permits better summarization by extracting the relevant portions of text and presenting the relevant section to the user in one or more presentations” (Col. 2, lines 54-56). Moreover, including a statement sequence is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have had good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.
	
Regarding claim 15, Kale teaches a system for determining and arranging images to include in a visual representation, the system comprising: receiving a textual statement (Kale; For example, as shown in FIG. 4, the digital content contextual tagging system 106 can receive a search query in an interface 402 (e.g., "person with animal on white shirt") (receiving a textual statement). Additionally, as illustrated in FIG. 4, the digital content contextual tagging system 106 can provide digital images as search results in response to the search query in the interface 402. Indeed, as shown in FIG. 4, the digital content contextual tagging system 106 provides digital images that include tags related to one or more terms in the search query. [0058];); identifying a plurality of terms in the textual statement that are to be visualized in the visual representation, (Kale; Furthermore, as used herein, the word "tag" can refer to a description (or information) including one or more terms and/or values. In particular, the word "tag" can refer to a description, that represents an object, scene, attribute, and/or another aspect (e.g., verbs, nouns, adjectives, etc.) portrayed in a digital content item (e.g., a digital image), with terms and/or values (e.g., a keyword). Indeed, the word "tag" can refer to conceptual labels (i.e., textual keywords) (identifying a plurality of terms in the textual statement) used to describe image attributes. As an example, a tag can include text within metadata for a digital media content item. Additionally, a tag can include text from a vocabulary (or dictionary). Moreover, as used herein, the word "tag characteristic" can refer to information indicating one or more attributes of a tag. In particular, the word "tag characteristic" can refer to information such as a tag size, tag complexity, and/or tag language. Furthermore, as used herein, the word "tag size" can refer to the length of a tag. In particular, the word "tag size" can refer to the length of a tag in regard to the number of characters of a tag and/or the number of terms in a tag. [0031]; As previously mentioned, the digital content contextual tagging system 106 utilizes a search query to determine and/or associate multi-term contextual tags to one or more digital images (e.g., the search query in the interface 402 in FIG. 4). Indeed, the digital content contextual tagging system 106 can receive a search query that includes any string of text. Furthermore, the digital content contextual tagging system 106 can utilize a search engine to provide one or more search results based on the string of text of the search query. For instance, the digital content contextual tagging system 106 can provide one or more images that include independent tags and/or multi-term contextual tags that match the search query. [0063]; Furthermore, the digital content contextual tagging system 106 can compare the tags of the digital image with the significant keywords of a search query. For instance, the digital content contextual tagging system 106 can determine, utilizing a natural language processing technique, significant keywords within a search query. Then, the digital content contextual tagging system 106 determines whether a selected digital image includes all of the significant keywords within the search query. Indeed, if the selected digital image does include all of the significant keywords determined in the search query, the digital content contextual tagging system 106 can associate a multi-term contextual tag (from the search query) to the selected digital image. [0070]);
determining a global coherence and a local coherence for each of the sequences based on the tags of the images (Kale; In one or more embodiments, the digital content contextual tagging system determines and associates multi-term contextual tags (and scores) with images using a correspondence between user search queries, tags of the images, and user selections of the images (determining a global coherence) in response to the user search queries. Furthermore, in some embodiments, the digital content contextual tagging system identifies additional images based on similarities with the images associated with the multi-term contextual tags (e.g., using a k-nearest neighbor algorithm to cluster the digital images (local coherence for each of the sequences based on the tags of the images)). Then, in one or more embodiments, the digital content contextual tagging system propagates the multi-term contextual tags to the additional images based on a combination of tag scores and image similarity scores. Moreover, in some embodiments, the digital content contextual tagging system provides images that are associated with multi-term contextual tags in response to receiving search queries that include such multi-term contextual tags. [0022];); 
selecting one of the sequences based on the global coherence and the local coherence (Kale; In one or more embodiments, the digital content contextual tagging system 106 utilizes a variety of methods to compare the one or more image descriptors to identify the similar image descriptors. For example, the digital content contextual tagging system 106 can utilize methods (or algorithms) such as, but not limited to, k-nearest neighbor algorithm, cosine similarity calculations, other clustering techniques, and/or embedding spaces to compare the one or more image descriptors (based on the global coherence and the local coherence) to identify the similar image descriptors (e.g., to identify a cluster of similar images). For instance, the digital content contextual tagging system 106 can utilize a k-nearest neighbor algorithm to determine distance values (e.g., a Euclidean distance) between image descriptors within a higher dimensional space (e.g., a Euclidean space). Then, the digital content contextual tagging system 106 can utilize a "k" number of image descriptors (e.g., a number selected and/or configured by a neural network (selecting one of the sequences), user of the digital content contextual tagging system 106, and/or the digital content contextual tagging system 106) based on the determined distance values. [0084];). 
Kale does not explicitly teach generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence; the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Chavez teaches generating a plurality of sequences of images where each image in a given one of the sequences is associated with one of the terms, each image being associated with at least one tag, each image in the given one of the sequences being arranged according to the statement sequence (see Figs. 4-5, col. 2 ln 1-11, col. 16 ln 8-26, discloses generating a first and second collection of images associated with predetermined keywords found in received search query and each image is associated with an image identifier, the first and second collections of images are prioritized in a listing of prioritized images that are prioritized by a degree of relevancy to a search query).
Kale and Chavez are analogous art because each is from the same field of endeavor, namely database systems.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the system of Kale to arrange images according to an order, as disclosed by Chavez. The motivation to combine these references is disclosed by Chavez as “reducing the volume of the image search space, and hence decreasing the latency in identifying images with the relevant style class or classes” (Col. 16, lines 30-32). Moreover, arranging images according to an order is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have had good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.
Kale/Chavez do not explicitly teach the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement; the local coherence being determined for each of the sequences in the statement order; and generating the visual representation where the images of the selected sequence are included in the statement sequence.
Nguyen teaches the plurality of terms being ordered in a statement sequence that corresponds to an input for the textual statement (see col. 3 ln 25-34, col. 17 ln 4-15, discloses ordering of terms in an ontology, mapping a sentence to a corresponding query, such as query “pasta sauce goes on pasta” to ontology ordered statement “sauce goes on pasta”, [food:noodle]); the local coherence being determined for each of the sequences in the statement order (see col. 17 ln 4-15, discloses determining local coherence in mapping child or sibling “sauce is put on pho” to parent “sauce goes on pasta” for query “pasta sauce goes on pasta”); and generating the visual representation where the images of the selected sequence are included in the statement sequence (see Fig. 3, col. 5 ln 65-67, col. 14 ln 45-51, discloses generating a pop-up of visualizations of selected sequences included in statement sequence).
Kale/Chavez/Nguyen are analogous arts as they are each from the same field of endeavor of database systems.
Before the effective filing date of the invention it would have been obvious to a person of ordinary skill in the art to modify the system of Kale/Chavez to include a statement sequence, as disclosed by Nguyen. The motivation to combine these arts is disclosed by Nguyen as “permits better summarization by extracting the relevant portions of text and presenting the relevant section to the user in one or more presentations” (Col. 2, lines 54-56). Including a statement sequence is well known to persons of ordinary skill in the art, and therefore one of ordinary skill would have good reason to pursue the known options within his or her technical grasp that would lead to anticipated success.

	
As for claims 2, 9 and 16, Kale/Chavez/Nguyen teaches the method, product and system of claims 1, 8 and 15, further comprising: identifying a plurality of image buckets having a respective associated term based on the identified terms in the textual statement, wherein the images for the given sequence are selected from the identified image buckets (Kale; One or more embodiments of the present disclosure include a digital content contextual tagging system that can determine multi-term contextual tags for digital content and propagate the multi-term contextual tags to additional digital content. In particular, the digital content contextual tagging system can utilize a multi-modal learning framework to mine relevant tag combinations from search engine user behavior data and propagate these across an image database (plurality of image buckets) (e.g., to other similar images). More specifically, based on the assumption that user behavior is a form of weak labeling, the digital content contextual tagging system utilizes user queries and their image click signal as one modality of signal and a deep neural network for image understanding as the other modality to perform visual grounding of tag correlations. [0019]; Moreover, the digital content contextual tagging system 106 can utilize the clusters to propagate multi-term contextual tags to additional digital images (identifying a plurality of image buckets having a respective associated term based on the identified terms in the textual statement). For instance, as shown in FIG. 3, the digital content contextual tagging system 106 can identify digital images with multi-term contextual tags (and corresponding tag scores and/or similarity scores) in act 304 from within a cluster of images (of act 302). Moreover, the digital content contextual tagging system 106 can generate, from the identified digital images of act 304, aggregated scores for the multi-term contextual tags in act 306. 
For instance, the digital content contextual tagging system 106 can provide weights to the aggregated scores (of act 306) utilizing various factors such as the tag scores and/or similarity scores corresponding to the identified digital images (of act 304). A more detailed description of the digital content contextual tagging system 106 identifying digital images that include multi-term contextual tags from a cluster of images and/or aggregating scores for multi-term contextual tags corresponding to the identified digital images is described in greater detail in FIG. 5B. [0053]; Moreover, in one or more embodiments, the digital content contextual tagging system 106 utilizes clustering algorithms (and/or techniques) to cluster digital images from a collection of digital images based on semantic and/or visual similarities of the digital images (wherein the images for the given sequence are selected from the identified image buckets) (e.g., the image descriptors corresponding to the digital images). For instance, the digital content contextual tagging system 106 can utilize clustering techniques such as, but not limited to, K-Means clustering and/or recursive K-Means clustering to cluster the digital images (or image descriptors) from the collection of digital images into clusters of a desirable size based on the similarity of the digital images. In particular, the digital content contextual tagging system 106 can analyze the one or more image descriptors generated from the collection of images to identify image descriptors that are similar. Indeed, the digital content contextual tagging system 106 can determine distance values between the image descriptors to identify similar image descriptors (e.g., to identify a cluster of similar images). [0083];).

As for claims 3, 10 and 17, Kale/Chavez/Nguyen teaches the method, product and system of claims 1, 8 and 15, wherein the global coherence is sequence agnostic (Kale; Additionally, the digital content contextual tagging system can identify images that have multi-term contextual tags (from the cluster). Then, the digital content contextual tagging system can utilize scores corresponding to such images and/or such multi-term contextual tags (e.g., tag scores and similarity scores) to generate aggregated scores for the multi-term contextual tags belonging to the cluster (e.g., by weighting the scores). Indeed, the digital content contextual tagging system can utilize the aggregated scores and/or other characteristics of the multi-term contextual tags (e.g., tag size) to rank the multi-term contextual tags. Furthermore, the digital content contextual tagging system can filter (or prune) the multi-term contextual tags based on the rankings to determine a final set (and/or list) of multi-term contextual tags for the cluster of images (global coherence is sequence agnostic). Indeed, the digital content contextual tagging system can associate the final set of multi-term contextual tags with images from the cluster (e.g., which includes the additional images). [0025];).

As for claims 4, 11 and 18, Kale/Chavez/Nguyen teaches the method, product and system of claims 1, 8 and 15, wherein the local coherence is based on adjacent images within the sequence of images (Kale; Furthermore, as used herein, the word "ranking score" (sometimes referred to as a "ranking") can refer to a value and/or ordering that represents a position of an item relative to other items. In particular, the word "ranking score" can refer to a value and/or ordering that represents a hierarchical position of a multi-term contextual tag in relation to other multi-term contextual tags based on the relevance of the multi-term contextual tags to a digital content item (local coherence is based on adjacent images within the sequence of images) and/or cluster of digital content. For instance, a ranking score can include a normalized score (from 0 to 1) for multi-term contextual tags that determines a hierarchical position for the multi-term contextual tags in a list or set (e.g., 0 being the lowest rank and 1 being the highest rank). [0042];).

As for claims 5, 12 and 19, Kale/Chavez/Nguyen teaches the method, product and system of claims 1, 8 and 15, wherein the selected sequence is selected based on whether the global coherence is at least a predetermined global threshold, whether the local coherence is at least a predetermined local threshold, or a combination thereof (Kale; Furthermore, the digital content contextual tagging system 106 can utilize search engine logs to determine multi-term contextual tags for digital images and/or tag scores for the digital images. In particular, the digital content contextual tagging system 106 can aggregate multi-term contextual tags determined from search queries and images selected in response to those search queries from a search engine log. Furthermore, the digital content contextual tagging system 106 can utilize such aggregated multi-term contextual tag and image combinations to determine a click frequency (e.g., a query-image frequency distribution) for each multi-term contextual tag that is determined to correspond to a digital image from the search engine logs. Moreover, the digital content contextual tagging system 106 can utilize the query-image frequency distribution to prune (e.g., select a final set of multi-term contextual tags for the digital images in the search queries) by using a frequency threshold (selected sequence is selected based on whether the global coherence is at least a predetermined global threshold, whether the local coherence is at least a predetermined local threshold, or a combination thereof). In particular, the digital content contextual tagging system can utilize the frequency threshold as a hyper-parameter utilized to control the quality and number of multi-term contextual tags that are associated with the digital images from the search queries and/or search query logs. [0076];).

As for claims 6, 13 and 20, Kale/Chavez/Nguyen teaches the method, product and system of claims 1, 8 and 15, wherein the selected sequence has a highest global coherence, a highest local coherence, or a combination thereof (Kale; As used herein, the word "tag score" (sometimes referred to as a "multi-term contextual tag score") can refer to a value that represents a confidence and/or relevance of tag. In particular, the word "tag score" can refer to a value that represents a confidence and/or relevance of a tag (e.g., a multi-term contextual tag) in relation to a digital content item (highest local coherence). For instance, a tag score can represent a confidence and/or relevance value that indicates the likelihood of a tag belonging to a digital content item. For example, a tag score can be a numerical value such as "0.95" for a multi-term contextual tag (e.g., "a woman wearing a red dress and a blue hat") for an image that portrays a woman wearing a red dress and a blue hat because the multi-term contextual tag represents the image. Furthermore, a tag score can be based on a selection frequency. [0034]; Additionally, as used herein, the word "aggregated score" (sometimes referred to as an "aggregated multi-term contextual tag score," "aggregated tag score," or "aggregate score") can refer to weighted and/or a combination one or more tag scores belonging to a tag (e.g., a multi-term contextual tag). In particular, the word "aggregated score" can refer to weighted and/or a combination of one or more tag scores belonging to a multi-term contextual tag from one or more digital content items that indicates an illustrative multi-term contextual tag score across the one or more digital items that include the one or more tag scores (highest global coherence) belonging to the multi-term contextual tag. [0036];).

As for claims 7 and 14, Kale/Chavez/Nguyen teaches the method and product of claims 1 and 8, wherein the global coherence and the local coherence is based on a tag popularity (Kale; Additionally, the digital content contextual tagging system can identify images that have multi-term contextual tags (from the cluster). Then, the digital content contextual tagging system can utilize scores corresponding to such images and/or such multi-term contextual tags (e.g., tag scores and similarity scores) to generate aggregated scores for the multi-term contextual tags belonging to the cluster (e.g., by weighting the scores). Indeed, the digital content contextual tagging system can utilize the aggregated scores and/or other characteristics of the multi-term contextual tags (e.g., tag size) to rank the multi-term contextual tags (global coherence and the local coherence is based on a tag popularity). Furthermore, the digital content contextual tagging system can filter (or prune) the multi-term contextual tags based on the rankings to determine a final set (and/or list) of multi-term contextual tags for the cluster of images. Indeed, the digital content contextual tagging system can associate the final set of multi-term contextual tags with images from the cluster (e.g., which includes the additional images). [0025];).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to COURTNEY HARMON whose telephone number is (571)270-5861. The examiner can normally be reached M-F 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached on 571-270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Courtney Harmon/Examiner, Art Unit 2159
/Mariela Reyes/Supervisory Patent Examiner, Art Unit 2159