Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office Action is in response to the claims entered for patent application 17/470,441, filed on September 9, 2021.


Claims 1-20 are pending.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-7, 9, 11, 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671).
Regarding claim 1, Boquet discloses a non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer system to: generate, utilizing a neural network, feature vectors for images (Fig. 1, elements 20 and 30, paras. [0028]-[0033], Figs. 2A-2C.); select one or more tagged feature vectors from a set of tagged feature vectors based on distances between the feature vector and the one or more tagged feature vectors from the set of tagged feature vectors (Fig. 1, elements 40 and 50, paras. [0040]-[0046]).
Boquet does not disclose wherein the instructions cause the computer system to extract a plurality of frames from a video; and thus does not disclose generating feature vectors for the plurality of frames from a video, nor generating a set of tags to associate with the video by selecting tags from the one or more tagged feature vectors and tagging the video with the set of tags. However, in analogous art, Curtis discloses that when generating recommended tags, the system will use “tags used by or recommended to other users for video items 34 in the video repository 24 that have audio and/or video content similar to that of the video item 34 (para. [0037]),” which teaches that tags may be selected from tagged related videos, which when combined with the teaching of Boquet, can be seen as tagged feature vectors. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet to allow for the computer to extract a plurality of frames from a video, generate feature vectors for the plurality of frames from a video, generate a set of tags to associate with the video by selecting tags from the one or more tagged feature vectors, and tag the video with the set of tags. This would have produced predictable and desirable results, in that it would allow for the improvements of Boquet to be used in a wider variety of situations, such as with video, as well as allowing for tags of related videos to be used to associate with a given video.
It could be argued that Boquet and Curtis do not explicitly disclose causing the computer system to combine a subset of the feature vectors to generate an aggregated feature vector; and thus do not disclose selecting tagged feature vectors based on distances between the aggregated feature vector and the one or more tagged feature vectors from the set of tagged feature vectors, nor generating a set of tags to associate with the video by aggregating the tags selected from the one or more tagged feature vectors. However, in analogous art, Bou discloses a dataset with “several training videos, each of which is labeled with one or more tags. However, the dataset does not contain information about where each tag occurs in the sequence. Our task is to classify whether an unknown test video contains each one of these tags. We use a weakly-supervised approach where a neural network predicts the tags of each frame independently and an aggregation layer computes the tags for the whole video based on the individual tags of each frame (para. [0028]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet and Curtis to allow for causing the computer system to combine a subset of the feature vectors to generate an aggregated feature vector, select tagged feature vectors based on distances between the aggregated feature vector and the one or more tagged feature vectors from the set of tagged feature vectors, and generate a set of tags to associate with the video by aggregating the tags selected from the one or more tagged feature vectors. This would have produced predictable and desirable results, in that it would allow for a video to have a plurality of tags related to the entire video, rather than only different tags related to different portions of the video, which could increase the effectiveness of the tags in terms of relaying information to interested parties.
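For illustration only, the claimed sequence discussed above (combining frame feature vectors into an aggregated feature vector, selecting tagged feature vectors by distance, and aggregating their tags) might be sketched as follows. This is a minimal sketch and not a characterization of the implementation of Boquet, Curtis, Bou, or the applicant. The element-wise mean, the Euclidean distance, the `top_n` parameter, and all function and variable names are assumptions introduced here.

```python
from math import dist


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors (assumed combination step)."""
    return [sum(column) / len(column) for column in zip(*vectors)]


def tags_for_video(frame_vectors, tagged_vectors, top_n=2):
    """Aggregate frame vectors, then collect tags from the nearest tagged vectors.

    tagged_vectors is a list of (vector, tags) pairs; the structure is hypothetical.
    """
    aggregated = mean_vector(frame_vectors)
    # Rank tagged vectors by Euclidean distance to the aggregated vector.
    ranked = sorted(tagged_vectors, key=lambda item: dist(aggregated, item[0]))
    tags = []
    for _, item_tags in ranked[:top_n]:
        for tag in item_tags:
            if tag not in tags:  # keep each tag once, in order of first appearance
                tags.append(tag)
    return tags


frames = [[0.1, 0.9], [0.2, 0.8], [0.0, 1.0]]
library = [
    ([0.2, 1.0], ["running"]),
    ([0.3, 0.9], ["jumping", "running"]),
    ([1.0, 0.0], ["sitting"]),
]
video_tags = tags_for_video(frames, library)
print(video_tags)
```

In this toy data the two tagged vectors closest to the aggregated vector contribute their tags, and the distant third vector contributes none.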
Regarding claim 4, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 1, and further discloses further comprising instructions that, when executed by the at least one processor, cause the computer system to: select the one or more tagged feature vectors from the set of tagged feature vectors by: determining distance values between the aggregated feature vector and the one or more tagged feature vectors from the set of tagged feature vectors; and selecting the one or more tagged feature vectors based on the one or more tagged feature vectors having distance values that meet a threshold distance value (Boquet, paras. [0008]-[0016], language of claim 4; Bou, para. [0028]. This claim is rejected on the same grounds as claim 1.).
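By way of a non-authoritative example, selecting tagged feature vectors whose distance values meet a threshold distance value might look like the sketch below. Treating "meet" as "at or below the threshold" is an assumption, as are the Euclidean distance and every name used.

```python
from math import dist


def select_within_threshold(query, tagged_vectors, threshold):
    """Return (distance, tags) for each tagged vector at or below the threshold.

    The "at or below" reading of the threshold is an assumption.
    """
    selected = []
    for vector, tags in tagged_vectors:
        distance = dist(query, vector)
        if distance <= threshold:
            selected.append((distance, tags))
    return selected


candidates = [([3.0, 4.0], ["far"]), ([0.0, 1.0], ["near"])]
selected = select_within_threshold([0.0, 0.0], candidates, threshold=2.0)
print(selected)
```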
Regarding claim 5, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 1, and further discloses further comprising instructions that, when executed by the at least one processor, cause the computer system to group the plurality of frames into a plurality of groups based on one or more characteristics of the frames of the plurality of frames; wherein subset of the feature vectors comprise feature vectors of the frames in a group of the plurality of groups (Boquet, Figs. 2A-2C, paras. [0050]-[0052]. This claim is rejected on the same grounds as claim 1.).
Regarding claim 6, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 5, and further discloses further comprising instructions that, when executed by the at least one processor, cause the computer system to group the plurality of frames into the plurality of groups based on the one or more characteristics of the frames of the plurality of frames by grouping the frames based on time stamps associated with the frames (Curtis, paras. [0018]-[0019]; Bou, para. [0057]. This claim is rejected on the same grounds as claim 1.).
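As an illustration only, grouping frames based on time stamps could be sketched as bucketing frames into fixed-length windows. The fixed window length and the (timestamp, frame identifier) pairs are assumptions made for this sketch and are not drawn from Curtis, Bou, or the claims.

```python
from collections import defaultdict


def group_by_time(frames, window_seconds):
    """Group (timestamp, frame_id) pairs into consecutive fixed-length windows."""
    groups = defaultdict(list)
    for timestamp, frame_id in frames:
        # Frames whose time stamps fall in the same window share a group index.
        groups[int(timestamp // window_seconds)].append(frame_id)
    return dict(groups)


sample_frames = [(0.0, "f0"), (1.5, "f1"), (2.0, "f2"), (4.2, "f3")]
frame_groups = group_by_time(sample_frames, window_seconds=2.0)
print(frame_groups)
```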
Regarding claim 7, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 5, and further discloses further comprising instructions that, when executed by the at least one processor, cause the computer system to group the plurality of frames into the plurality of groups based on the one or more characteristics of the frames of the plurality of frames by grouping the frames into delineated scenes within the video (Curtis, para. [0029]. This claim is rejected on the same grounds as claim 1.).
Regarding claim 9, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 1, and further discloses further comprising instructions that, when executed by the at least one processor, cause the computer system to associate the set of tags with a temporal segment of the video comprising the set of frames (Bou, paras. [0025] and [0057]. This claim is rejected on the same grounds as claim 1.).
Regarding claim 11, Boquet discloses a system comprising: memory comprising a neural network and a set of tagged feature vectors corresponding to a set of media content items, the set of tagged feature vectors comprising feature vectors generated from media content items and tagged with labels that correspond to content of the media content items (Fig. 1, elements 40 and 50, paras. [0040]-[0046]); and at least one server configured to cause the system to: generate, utilizing a neural network, feature vectors for images (Fig. 1, elements 20 and 30, paras. [0028]-[0033], Figs. 2A-2C.); generate feature vectors by combining subsets of the feature vectors; determine tags for the feature vectors by: selecting one or more tagged feature vectors from the set of tagged feature vectors based on distances between the feature vectors and the one or more tagged feature vectors (Fig. 1, elements 40 and 50, paras. [0040]-[0046]); and extracting the tags associated with the one or more tagged feature vectors (Fig. 1, elements 40 and 50, paras. [0040]-[0046]).
Boquet does not disclose causing the system to extract a plurality of frames from a video; and thus does not disclose generating feature vectors for frames of the plurality of frames, nor tagging the frames of the video associated with the feature vectors with the determined tags. However, in analogous art, Curtis discloses that when generating recommended tags, the system will use “tags used by or recommended to other users for video items 34 in the video repository 24 that have audio and/or video content similar to that of the video item 34 (para. [0037]),” which teaches that tags may be selected from tagged related videos, which when combined with the teaching of Boquet, can be seen as tagged feature vectors. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet to allow for causing the system to extract a plurality of frames from a video, generating feature vectors for frames of the plurality of frames, and tagging the frames of the video associated with the feature vectors with the determined tags. This would have produced predictable and desirable results, in that it would allow for the improvements of Boquet to be used in a wider variety of situations, such as with video, as well as allowing for tags of related videos to be used to associate with a given video.
It could be argued that Boquet and Curtis do not explicitly disclose causing the system to generate aggregated feature vectors by combining subsets of the feature vectors, nor determine tags for the aggregated feature vectors by selecting one or more tagged feature vectors from the set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors, nor tag the frames of the video associated with the aggregated feature vectors. However, in analogous art, Bou discloses a dataset with “several training videos, each of which is labeled with one or more tags. However, the dataset does not contain information about where each tag occurs in the sequence. Our task is to classify whether an unknown test video contains each one of these tags. We use a weakly-supervised approach where a neural network predicts the tags of each frame independently and an aggregation layer computes the tags for the whole video based on the individual tags of each frame (para. [0028]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet and Curtis to allow for causing the system to generate aggregated feature vectors by combining subsets of the feature vectors, determine tags for the aggregated feature vectors by selecting one or more tagged feature vectors from the set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors, and tag the frames of the video associated with the aggregated feature vectors. This would have produced predictable and desirable results, in that it would allow for a video to have a plurality of tags related to the entire video, rather than only different tags related to different portions of the video, which could increase the effectiveness of the tags in terms of relaying information to interested parties.
Regarding claim 14, the combination of Boquet, Curtis and Bou discloses the system of claim 11, and further discloses wherein the at least one server is further configured to cause the system to: receive a search request to identify videos associated with an action; identify that the video is tagged with a tag corresponding to the action; and returning the video in response to the search request (Curtis, para. [0028]. This claim is rejected on the same grounds as claim 11.).
Regarding claim 15, the combination of Boquet, Curtis and Bou discloses the system of claim 11, and further discloses wherein the at least one server is further configured to cause the system to: cluster the plurality of frames into a plurality of groups, each group of the plurality of groups corresponding to scene from the video; and wherein each subset of feature vectors comprises the feature vectors of the frames of a given group of the plurality of groups (Curtis, para. [0029]. This claim is rejected on the same grounds as claim 11.).


Claims 2, 3, 12, 17, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671), and further in view of Dal Mutto et al. (Pub. No.: US 2019/0108396).
Regarding claim 2, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 1, but does not explicitly disclose further comprising instructions that, when executed by the at least one processor, cause the computer system to combine the subset of the feature vectors to generate the aggregated feature vector by pooling feature values from the subset of feature vectors. However, in analogous art, Dal Mutto discloses the concept of max pooling (paras. [0133], [0141], [0142] and [0246]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to allow for combining the subset of the feature vectors to generate the aggregated feature vector by pooling feature values from the subset of feature vectors. This would have produced predictable and desirable results, in that it would allow for a well-known technique for combining feature vectors to be used.
Regarding claim 3, the combination as stated above discloses the non-transitory computer-readable medium of claim 2, and further discloses wherein pooling feature values from the subset of feature vectors comprises max pooling the feature values from the subset of feature vectors (Dal Mutto, paras. [0133], [0141], [0142] and [0246]. This claim is rejected on the same grounds as claim 2.).
Regarding claim 12, the combination of Boquet, Curtis and Bou discloses the system of claim 11, but does not explicitly disclose wherein the at least one server is further configured to cause the system to generate aggregated feature vectors by, for a given aggregated feature vector, utilizing averaging pooling or max pooling to combine feature vectors in a subset of feature vectors. However, in analogous art, Dal Mutto discloses the concept of max pooling (paras. [0133], [0141], [0142] and [0246]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to generate aggregated feature vectors by, for a given aggregated feature vector, utilizing averaging pooling or max pooling to combine feature vectors in a subset of feature vectors. This would have produced predictable and desirable results, in that it would allow for a well-known technique for combining feature vectors to be used.
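For reference, max pooling and average pooling over a subset of feature vectors are commonly understood as element-wise operations across the vectors. The sketch below is a generic illustration under that understanding; it is not taken from Dal Mutto, and the function names are hypothetical.

```python
def max_pool(vectors):
    """Element-wise maximum across equal-length feature vectors."""
    return [max(column) for column in zip(*vectors)]


def average_pool(vectors):
    """Element-wise arithmetic mean across equal-length feature vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]


subset = [[1.0, 4.0], [3.0, 2.0]]
print(max_pool(subset))      # largest value in each position
print(average_pool(subset))  # mean value in each position
```

Either function reduces a subset of per-frame feature vectors to a single vector of the same length, which could then serve as an aggregated feature vector.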
Regarding claim 17, Boquet discloses a computer-implemented method for automatic tagging of images, the computer-implemented method comprising: generating, utilizing an image classification neural network, feature vectors for images (Fig. 1, elements 20 and 30, paras. [0028]-[0033], Figs. 2A-2C.); determining tags for the feature vectors by: selecting one or more tagged feature vectors from a set of tagged feature vectors based on distances between the feature vectors and the one or more tagged feature vectors (Fig. 1, elements 40 and 50, paras. [0040]-[0046]); and extracting the tags from the one or more tagged feature vectors (Fig. 1, elements 40 and 50, paras. [0040]-[0046]).
Boquet does not disclose the automatic tagging of videos, nor extracting a plurality of frames from a video; and thus does not disclose generating feature vectors for frames of the plurality of frames, nor tagging the frames of the video associated with the feature vectors with the extracted tags. However, in analogous art, Curtis discloses that when generating recommended tags, the system will use “tags used by or recommended to other users for video items 34 in the video repository 24 that have audio and/or video content similar to that of the video item 34 (para. [0037]),” which teaches that tags may be selected from tagged related videos, which when combined with the teaching of Boquet, can be seen as tagged feature vectors. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet to allow for the automatic tagging of videos, extracting a plurality of frames from a video, generating feature vectors for frames of the plurality of frames, and tagging the frames of the video associated with the feature vectors with the extracted tags. This would have produced predictable and desirable results, in that it would allow for the improvements of Boquet to be used in a wider variety of situations, such as with video, as well as allowing for tags of related videos to be used to associate with a given video.
It could be argued that Boquet and Curtis do not explicitly disclose generating aggregated feature vectors by combining subsets of the feature vectors, nor determining tags for the aggregated feature vectors by selecting one or more tagged feature vectors from a set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors, nor tagging the frames of the video associated with the aggregated feature vectors with the extracted tags. However, in analogous art, Bou discloses a dataset with “several training videos, each of which is labeled with one or more tags. However, the dataset does not contain information about where each tag occurs in the sequence. Our task is to classify whether an unknown test video contains each one of these tags. We use a weakly-supervised approach where a neural network predicts the tags of each frame independently and an aggregation layer computes the tags for the whole video based on the individual tags of each frame (para. [0028]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet and Curtis to allow for generating aggregated feature vectors by combining subsets of the feature vectors, determining tags for the aggregated feature vectors by selecting one or more tagged feature vectors from a set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors, and tagging the frames of the video associated with the aggregated feature vectors with the extracted tags. This would have produced predictable and desirable results, in that it would allow for a video to have a plurality of tags related to the entire video, rather than only different tags related to different portions of the video, which could increase the effectiveness of the tags in terms of relaying information to interested parties.
The combination of Boquet, Curtis and Bou does not explicitly disclose generating aggregated feature vectors by combining subsets of the feature vectors utilizing pooling. However, in analogous art, Dal Mutto discloses the concept of max pooling (paras. [0133], [0141], [0142] and [0246]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to allow for generating aggregated feature vectors by combining subsets of the feature vectors utilizing pooling. This would have produced predictable and desirable results, in that it would allow for a well-known technique for combining feature vectors to be used.
Regarding claim 18, the combination as stated above discloses the computer-implemented method of claim 17, and further discloses wherein each subset of feature vectors corresponds to a temporal segment of the video; and tagging the frames of the video associated with the aggregated feature vectors with the extracted tags comprises tagging a given temporal segment of the video with tags determined for a given aggregated feature vector (Bou, paras. [0025] and [0057]. This claim is rejected on the same grounds as claim 17.).
Regarding claim 20, the combination as stated above discloses the computer-implemented method of claim 17, and further discloses further comprising clustering the plurality of frames into a plurality of groups, each group of the plurality of groups corresponding to scene from the video; and wherein each subset of feature vectors comprises the feature vectors of the frames of a given group of the plurality of groups (Curtis, para. [0029]. This claim is rejected on the same grounds as claim 17.).


Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671), and further in view of Verdejo et al. (Pub. No.: US 2018/0082122).
Regarding claim 8, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 1, but does not explicitly disclose further comprising instructions that, when executed by the at least one processor, cause the computer system to identify the set of tagged feature vectors by: identifying a media content item comprising text representing one or more verbs; generating, utilizing the neural network, a tagged feature vector for the media content item; assigning tags to the tagged feature vector by assigning the one or more verbs to the tagged feature vector; and associating the tagged feature vector with the set of tagged feature vectors. However, in analogous art, Verdejo discloses that “analytics system 205 may associate tags with words included in the first data (e.g., based on tag association rules). In some implementations, the tag association rules may specify a manner in which the tags are to be associated with words, or based on characteristics of the words. For example, a tag association rule may specify that a singular noun tag (“/NN”) is to be associated with words that are singular nouns (e.g., based on a language database or a context analysis). In some implementations, a tag may include a part-of-speech (POS) tag, such as NN (noun, singular or mass), NNS (noun, plural), NNP (proper noun, singular), NNPS (proper noun, plural), VB (verb, base form), VBD (verb, past tense), VBG (verb, gerund or present participle), and/or the like (para. [0066]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to allow for identifying the set of tagged feature vectors by identifying a media content item comprising text representing one or more verbs, generating, utilizing the neural network, a tagged feature vector for the media content item, assigning tags to the tagged feature vector by assigning the one or more verbs to the tagged feature vector, and associating the tagged feature vector with the set of tagged feature vectors. This would have produced predictable and desirable results, in that it would allow for more desired words and/or concepts to be found with greater specificity.


Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671), and further in view of Li et al. (Pub. No.: US 2017/0047096).
Regarding claim 10, the combination of Boquet, Curtis and Bou discloses the non-transitory computer-readable medium of claim 9, but does not explicitly disclose further comprising instructions that, when executed by the at least one processor, cause the computer system to: provide graphical user interface displaying the video; provide a timeline for the video in the graphical user interface; and place a tag indicator associated with a tag of the set of tags on the timeline at a position corresponding to the temporal segment of the video. However, in analogous art, Li discloses a GUI with tag indicators relating to video segments on a timeline (Figs. 5 and 6, paras. [0036]-[0044]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to allow for providing a graphical user interface displaying the video, providing a timeline for the video in the graphical user interface, and placing a tag indicator associated with a tag of the set of tags on the timeline at a position corresponding to the temporal segment of the video. This would have produced predictable and desirable results, in that it would allow for a more intuitive visual summary of videos.
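Purely as an illustration of placing a tag indicator at a position corresponding to a temporal segment, the sketch below maps a segment start time to a horizontal offset on a timeline. The linear mapping, the pixel units, and all names are assumptions; nothing here is taken from Li.

```python
def tag_indicator_position(segment_start, video_duration, timeline_width):
    """Map a segment start time (seconds) to a pixel offset on a timeline.

    Assumes a linear timeline where 0 is the left edge and timeline_width
    is the right edge.
    """
    if video_duration <= 0:
        raise ValueError("video_duration must be positive")
    fraction = segment_start / video_duration
    return round(fraction * timeline_width)


position = tag_indicator_position(segment_start=30, video_duration=120, timeline_width=800)
print(position)
```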


Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671), and further in view of Lee et al. (Pub. No.: US 2011/0205359).
Regarding claim 13, the combination of Boquet, Curtis and Bou discloses the system of claim 11, but does not explicitly disclose wherein selecting the one or more tagged feature vectors from the set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors comprises utilizing a k-nearest neighbor algorithm to identify a set of tagged feature vectors for each aggregated feature vector. However, in analogous art, Lee discloses that a k-nearest neighbor search can determine distance values between vectors (para. [0083]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou so that selecting the one or more tagged feature vectors from the set of tagged feature vectors based on distances between the aggregated feature vectors and the one or more tagged feature vectors comprises utilizing a k-nearest neighbor algorithm to identify a set of tagged feature vectors for each aggregated feature vector. This would have produced predictable and desirable results, in that it would allow for a well-known method of determining distance values to be used.
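As a generic, non-authoritative sketch, a k-nearest neighbor selection over tagged feature vectors might be written as below. The Euclidean distance, the brute-force search, and the names are assumptions and do not describe the search of Lee or of any other cited reference.

```python
import heapq
from math import dist


def k_nearest(query, tagged_vectors, k):
    """Return the k (vector, tags) pairs closest to query, nearest first."""
    return heapq.nsmallest(k, tagged_vectors, key=lambda item: dist(query, item[0]))


tagged = [
    ([1.0, 0.0], ["walk"]),
    ([5.0, 5.0], ["swim"]),
    ([0.0, 2.0], ["run"]),
]
neighbors = k_nearest([0.0, 0.0], tagged, k=2)
print(neighbors)
```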


Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726) and Bou et al. (Pub. No.: US 2019/0258671), and further in view of Sawhney et al. (Pub. No.: US 2016/0042252).
Regarding claim 16, the combination of Boquet, Curtis and Bou discloses the system of claim 11, but does not explicitly disclose wherein the at least one server is further configured to cause the system to generate, utilizing the neural network, feature vectors for frames of the plurality of frames by utilizing an image classification neural network to extract visual characteristics and latent attributes in different levels of abstractions from a frame of the plurality of frames. However, in analogous art, Sawhney discloses a system that “indexes the visual features and provides technologies for multi-dimensional content-based clustering, searching, and iterative exploration of the image collection using the visual features and/or the visual feature indices (Abstract),” wherein “[e]ach of the different types of similarity measures can relate to a different visual characteristic of the images 210 in the collection 150. Different similarity functions 238 can be defined and executed by the computing system 100 to capture different patterns of, for example, sameness and/or similarity of scenes, objects, locations, time of day, weather, etc. depicted in the images 210, and/or to capture similarity at different levels of abstraction (e.g., instance-based vs. category-based similarity, described further below). These different similarity measures can be used by the computing system 100 to supplement, or as an alternative to, more traditional “exact match”-type search and clustering techniques (para. [0016]; see also paras. [0022] and [0071]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis and Bou to allow for the at least one server to be further configured to cause the system to generate, utilizing the neural network, feature vectors for frames of the plurality of frames by utilizing an image classification neural network to extract visual characteristics and latent attributes in different levels of abstractions from a frame of the plurality of frames. This would have produced predictable and desirable results, in that it would allow for more robust feature vectors to be generated, which could improve the performance of the system.


Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Boquet et al. (Pub. No.: US 2019/0377823) in view of Curtis et al. (Pub. No.: US 2010/0088726), Bou et al. (Pub. No.: US 2019/0258671) and Dal Mutto et al. (Pub. No.: US 2019/0108396), and further in view of Verdejo et al. (Pub. No.: US 2018/0082122).
Regarding claim 19, the combination as stated above discloses the computer-implemented method of claim 17, but does not explicitly disclose wherein extracting the tags associated with the one or more tagged feature vectors comprises extracting action words. However, in analogous art, Verdejo discloses that “analytics system 205 may associate tags with words included in the first data (e.g., based on tag association rules). In some implementations, the tag association rules may specify a manner in which the tags are to be associated with words, or based on characteristics of the words. For example, a tag association rule may specify that a singular noun tag (“/NN”) is to be associated with words that are singular nouns (e.g., based on a language database or a context analysis). In some implementations, a tag may include a part-of-speech (POS) tag, such as NN (noun, singular or mass), NNS (noun, plural), NNP (proper noun, singular), NNPS (proper noun, plural), VB (verb, base form), VBD (verb, past tense), VBG (verb, gerund or present participle), and/or the like (para. [0066]).” Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Boquet, Curtis, Bou and Dal Mutto to allow for extracting the tags associated with the one or more tagged feature vectors to comprise extracting action words. This would have produced predictable and desirable results, in that it would allow for more desired words and/or concepts to be found with greater specificity.
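For illustration only, extracting action words from words that already carry part-of-speech tags of the kind quoted above (VB, VBD, VBG) might be sketched as follows. The input is assumed to be pre-tagged (word, tag) pairs; the function name and the sample data are hypothetical, and no tagging step of Verdejo is reproduced here.

```python
def extract_action_words(tagged_words):
    """Keep words whose part-of-speech tag begins with "VB" (a verb tag)."""
    return [word for word, pos in tagged_words if pos.startswith("VB")]


sample = [("dog", "NN"), ("ran", "VBD"), ("jumping", "VBG"), ("parks", "NNS")]
action_words = extract_action_words(sample)
print(action_words)
```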


Conclusion
Claims 1-20 are pending.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Joshua D Taylor whose telephone number is (571)270-3755. The examiner can normally be reached Monday - Friday 8 am - 6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nasser Goodarzi, can be reached on 571-272-4195. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Joshua D Taylor/
Primary Examiner, Art Unit 2426
September 30, 2022