DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 1-5, 8-13, and 16-20 are currently subject to nonstatutory double patenting rejections, but are otherwise not subject to any prior art rejections under either 35 U.S.C. § 102 or 35 U.S.C. § 103. If these rejections are overcome by the timely filing of a terminal disclaimer, these claims would be allowable.
Claims 6-7 and 14-15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
Independent claims 1, 8, and 16 recite the same patentable features as were found allowable in parent application no. 16/784005, which issued as United States patent no. 11,244,167. The present claims are found allowable for the same reasons as were provided in the parent application.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,244,167 in view of KATARIA et al (U.S. PG Pub. No. 20190034793). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application 
Claim 1
U.S. Patent No. 11,244,167
Claim 1
A non-transitory computer-readable medium comprising instructions that, when executed by at least one processor, cause a computing device to:
A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computing device to: 
extract a query vector from a question corresponding to a video segment;
extract a query vector from a question corresponding to a video segment; 
generate one or more context vectors representing at least one of a visual feature corresponding to the video segment or transcript text corresponding to the video segment;
extract multiple contextual modalities from the video segment by: generating visual-context vectors representing visual features corresponding to the video segment; and 
generate a query-context vector by combining the query vector and the one or more context vectors utilizing a neural network and one or more attention mechanisms;
generating textual-context vectors representing transcript text corresponding to the video segment; generate a query-context vector by combining the query vector, the visual-context vectors, and the textual-context vectors;
generate candidate-response vectors representing candidate responses to the question; 
generate candidate-response vectors representing candidate responses to the question; and
select a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.
select a response from the candidate responses by comparing the query-context vector to the candidate-response vectors. 


Claim 1 of U.S. Patent No. 11,244,167 does not recite generating the query-context vector by utilizing a neural network. However, this limitation was known in the art:
KATARIA et al (U.S. PG Pub. No. 20190034793) discloses generating the query-context vector by utilizing a neural network at ¶¶ [0054]-[0055]: “At operation 822, the job search query is passed into the query-based deep semantic similarity neural network to output a first query context vector.” At the time of the filing of the present application, it would have been obvious to a person of ordinary skill in the art to generate the query-context vector by utilizing a neural network, as taught by KATARIA, when generating a query-context vector according to claim 1 of U.S. Patent No. 11,244,167. The motivation for doing so comes from the prior art, wherein the benefits of neural networks were well known and include their accuracy and reliability in performing trained tasks. Therefore, it would have been obvious to combine KATARIA with claim 1 of U.S. Patent No. 11,244,167 to obtain the invention specified in this claim.
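For orientation only, the final limitation shared by both columns of the claim 1 table (selecting a response by comparing the query-context vector to the candidate-response vectors) can be sketched as follows. Neither claim specifies a comparison metric, so the use of cosine similarity here is an assumption, and the function names `cosine_similarity` and `select_response` are hypothetical and do not appear in the application, the patent, or KATARIA.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_response(query_context, candidate_vectors):
    """Return the index of the candidate-response vector that is most
    similar to the query-context vector."""
    if not candidate_vectors:
        raise ValueError("at least one candidate-response vector is required")
    scores = [cosine_similarity(query_context, c) for c in candidate_vectors]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the vectors are plain lists of floats; how they are produced (for example, by a neural network) is outside the illustration.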
Claim 2 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,244,167 in view of KATARIA et al (U.S. PG Pub. No. 20190034793). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 2
U.S. Patent No. 11,244,167
Claim 1
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the one or more context vectors by:
extract multiple contextual modalities from the video segment by: 
generating visual-context vectors representing visual features corresponding to the video segment; and
generating visual-context vectors representing visual features corresponding to the video segment; and 
generating textual-context vectors representing transcript text corresponding to the video segment.
generating textual-context vectors representing transcript text corresponding to the video segment;


Claim 3 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 7 of U.S. Patent No. 11,244,167 in view of KATARIA et al (U.S. PG Pub. No. 20190034793). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 3
U.S. Patent No. 11,244,167
Claim 7
The non-transitory computer-readable medium of claim 2, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by:
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing posterior layers from a query-response-neural network to:
generating, utilizing a spatial attention mechanism, a precursor query-context vector based on a combination of the visual-context vectors and the query vector; and
generate a precursor query-context vector based on a subset of the visual-context vectors for a video frame of the video segment and the query vector by utilizing a spatial-attention mechanism;
combining the precursor query-context vector and at least one of the textual-context vectors utilizing the neural network.
generate the query context vector based on the precursor query-context vector and a textual-context vector for the video frame from the textual-context vectors utilizing a recurrent neural network.


Claim 4 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 6 of U.S. Patent No. 11,244,167 in view of KATARIA et al (U.S. PG Pub. No. 20190034793). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 4
U.S. Patent No. 11,244,167
Claim 6
The non-transitory computer-readable medium of claim 2, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by:
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing posterior layers from a query-response-neural network to:  
generating, utilizing the neural network, hidden-feature vectors based on at least one of the textual-context vectors;
generate a hidden-feature vector based on the textual-context vectors utilizing a recurrent neural network;  
generating, utilizing a temporal attention mechanism, a precursor query-context vector based on a combination of the hidden-feature vectors and the query vector; and
generate a precursor query-context vector based on the query vector and the hidden-feature vector utilizing a temporal-attention mechanism; and 
combining the precursor query-context vector and at least one of the visual-context vectors utilizing a spatial attention mechanism.
generate the query-context vector based on the precursor query-context vector and the visual-context vectors utilizing a spatial-attention mechanism.


Claim 5 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 6 of U.S. Patent No. 11,244,167 in view of KATARIA et al (U.S. PG Pub. No. 20190034793). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 5
U.S. Patent No. 11,244,167
Claim 6
The non-transitory computer-readable medium of claim 2, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by:
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing posterior layers from a query-response-neural network to:  
generating, utilizing the neural network, hidden-feature vectors based on the visual-context vectors and the textual-context vectors; and
generate a hidden-feature vector based on the textual-context vectors utilizing a recurrent neural network; generate a precursor query-context vector based on the query vector and the hidden-feature vector utilizing a temporal-attention mechanism; 
combining the hidden-feature vectors and the query vector utilizing a temporal attention mechanism.
generate the query-context vector based on the precursor query-context vector and the visual-context vectors utilizing a spatial-attention mechanism.


Claim 8 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table: 
Present Application
Claim 8
U.S. Patent No. 11,244,167
Claim 10
A system comprising: 
A system comprising: 

one or more memory devices comprising a video and dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism; and one or more processors configured to cause the system to:
one or more memory devices comprising a video and a query-response-neural network; and at least one server configured to cause the system to: 

extract a query vector from a question corresponding to the video;
extract a query vector from a question corresponding to a video segment of the video utilizing question-network layers from the query-response-neural network; extract multiple contextual modalities from the video segment by: 

generate one or more context vectors representing at least one of a visual feature corresponding to the video or transcript text corresponding to the video;
generating visual-context vectors representing visual features corresponding to the video segment by utilizing visual-feature layers from the query-response-neural network; generating textual-context vectors representing transcript text corresponding to the video segment by utilizing transcript layers from the query-response-neural network;
generate a query-context vector by combining the query vector and the one or more context vectors utilizing the dual-attention mechanisms;
generate a query-context vector based on the query vector, the visual-context vectors, and the textual-context vectors by utilizing posterior layers from the query-response-neural network;
generate candidate-response vectors representing candidate responses to the question; and
generate candidate-response vectors representing candidate responses to the question utilizing response-network layers from the query-response-neural network;
select a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.
select a response from the candidate responses based on a comparison of the query-context vector to the candidate-response vectors.


Claim 10 of U.S. Patent No. 11,244,167 does not recite dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism. However, this limitation was known in the art:
PALANISAMY et al (U.S. PG Pub. No. 2020/0139973) discloses dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism at ¶ [0071] (“The disclosed embodiments integrate both temporal and spatial attention mechanisms…”). At the time of the filing of the present application, it would have been obvious to a person of ordinary skill in the art to use dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism when generating a query-context vector based on the visual-context vectors, as recited by claim 10 of U.S. Patent No. 11,244,167. The motivation for doing so comes from PALANISAMY, which discloses, “[T]he temporal attention module that learns to weigh the importance of previous frames of image data at any given frame of the image data, and spatial attention at the spatial attention module that learns the importance of different locations in the any given frame of the image data. The spatial attention module and the temporal attention module collectively improve lane-change policy selection of the actor network.” (PALANISAMY, ¶¶ [0014]-[0015]). Therefore, it would have been obvious to combine PALANISAMY with claim 10 of U.S. Patent No. 11,244,167 to obtain the invention specified in this claim.
Claim 9 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 9
U.S. Patent No. 11,244,167
Claim 10
The system of claim 8, wherein the one or more processors are configured to cause the system to generate the one or more context vectors by: 
system comprising: one or more memory devices comprising a video and a query-response-neural network; and at least one server configured to cause the system to: … extract multiple contextual modalities from the video segment by:
generating visual-context vectors representing visual features corresponding to video frames of the video; and
generating visual-context vectors representing visual features corresponding to the video segment by utilizing visual-feature layers from the query-response-neural network; 
generating textual-context vectors representing transcript text corresponding to the video frames of the video.
generating textual-context vectors representing transcript text corresponding to the video segment by utilizing transcript layers from the query-response-neural network;


Claim 10 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 10
U.S. Patent No. 11,244,167
Claim 10
The system of claim 8, wherein the one or more processors are configured to cause the system to generate the query-context vector utilizing a neural network.
generate a query-context vector based on the query vector, the visual-context vectors, and the textual-context vectors by utilizing posterior layers from the query-response-neural network


Claim 11 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 8 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application

U.S. Patent No. 11,244,167

Claim 11
The system of claim 8, wherein the one or more processors are configured to cause the system to generate the query-context vector by:
generating, utilizing the spatial attention mechanism, a precursor query-context vector based on a combination of visual-context vectors and the query vector; and
Incorporated from parent claim 7
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing posterior layers from a query-response-neural network to: generate a precursor query-context vector based on a subset of the visual-context vectors for a video frame of the video segment and the query vector by utilizing a spatial-attention mechanism; and generate the query context vector based on the precursor query-context vector and a textual-context vector for the video frame from the textual-context vectors utilizing a recurrent neural network.
combining the precursor query-context vector and textual-context vectors utilizing gated recurrent units (GRUs) of a recurrent neural network.
Claim 8
The non-transitory computer-readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing the recurrent neural network comprising one or more gated recurrent units.
Incorporated from parent claim 8
A system comprising: 
Incorporated from parent claim 1
A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computing device to:
one or more memory devices comprising a video and dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism; and one or more processors configured to cause the system to:

extract a query vector from a question corresponding to the video;
extract a query vector from a question corresponding to a video segment;
generate one or more context vectors representing at least one of a visual feature corresponding to the video or transcript text corresponding to the video;
extract multiple contextual modalities from the video segment by: generating visual-context vectors representing visual features corresponding to the video segment; and generating textual-context vectors representing transcript text corresponding to the video segment;
generate a query-context vector by combining the query vector and the one or more context vectors utilizing the dual-attention mechanisms;
generate a query-context vector by combining the query vector, the visual-context vectors, and the textual-context vectors;
generate candidate-response vectors representing candidate responses to the question; and
generate candidate-response vectors representing candidate responses to the question;
select a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.
select a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.


Claim 1 of U.S. Patent No. 11,244,167 does not recite dual-attention mechanisms comprising a spatial attention mechanism and a temporal attention mechanism. However, this limitation was known in the art as evidenced by PALANISAMY et al (U.S. PG Pub. No. 2020/0139973), discussed above with respect to the rejection of claim 8. The motivation for this combination is the same as was previously presented.
Claim 12 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 8 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
U.S. Patent No. 11,244,167
Claim 12
The system of claim 8, wherein the one or more processors are configured to cause the system to generate the query-context vector by: 
generating, utilizing GRUs of a recurrent neural network, hidden-feature vectors based on textual-context vectors; 
Claim 8
The non-transitory computer-readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing the recurrent neural network comprising one or more gated recurrent units.
generating, utilizing the temporal attention mechanism, a precursor query-context vector based on a combination of the hidden-feature vectors and the query vector; and 
Incorporated from parent claim 7
The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the query-context vector by utilizing posterior layers from a query-response-neural network to: generate a precursor query-context vector based on a subset of the visual-context vectors for a video frame of the video segment and the query vector by utilizing a spatial-attention mechanism; and 
combining the precursor query-context vector and visual-context vectors utilizing the spatial attention mechanism.
generate the query context vector based on the precursor query-context vector and a textual-context vector for the video frame from the textual-context vectors utilizing a recurrent neural network.



Claim 13 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 15 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 13
U.S. Patent No. 11,244,167
Claim 15
The system of claim 8, wherein the one or more processors are configured to cause the system to generate the query-context vector by: 
generating, utilizing GRUs of a recurrent neural network, hidden-feature vectors based on visual-context vectors and textual-context vectors; and 
The system of claim 10, wherein the at least one server is further configured to cause the system to generate the query-context vector utilizing the posterior layers from the query-response-neural network by: generating a hidden-feature vector based on the textual-context vectors utilizing a recurrent neural network; 
combining the hidden-feature vectors and the query vector utilizing the temporal attention mechanism.
generating a precursor query-context vector based on the query vector and the hidden-feature vector utilizing a temporal-attention mechanism; and generating the query-context vector based on the precursor query-context vector and the visual-context vectors utilizing a spatial-attention mechanism.


Claim 16 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 16
U.S. Patent No. 11,244,167
Claim 18
A computer-implemented method comprising: 
A computer-implemented method comprising: 
extracting a query vector from a question corresponding to a video segment; 
extracting a query vector from a question corresponding to a video segment by utilizing question-network layers from a query-response-neural network; 
generating visual-context vectors representing visual features corresponding to the video segment; 
generating visual-context vectors representing visual features corresponding to the video segment; and generating textual-context vectors representing transcript text corresponding to the video segment; 
generating a query-context vector by combining the query vector and the visual-context vectors utilizing a spatial attention mechanism to spatially weight one or more of the visual features corresponding to the video segment; 
performing a step for combining the query vector, the visual-context vectors, and the textual-context vectors from the video segment to form a query-context vector; 
generating candidate-response vectors representing candidate responses to the question; and 
generating candidate-response vectors representing candidate responses to the question utilizing response-network layers from the query-response-neural network;
selecting a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.
selecting a response from the candidate responses based on a comparison of the query-context vector to the candidate-response vectors.


Claim 18 of U.S. Patent No. 11,244,167 does not recite a spatial attention mechanism to spatially weight one or more of the visual features corresponding to the video segment. However, this limitation was known in the art as evidenced by PALANISAMY et al (U.S. PG Pub. No. 2020/0139973), which discloses a spatial attention mechanism to spatially weight one or more of the visual features corresponding to the video segment at ¶ [0107]: “The spatial attention module 140 is applied after the feature extraction layers 130 and before the recurrent layers 150. The spatial attention module 140 can apply spatial attention to learn weights for different areas in an image, and a spatial context vector (ZT) 135-3 used by LSTM network 150-3 will be a weighted sum combination of spatial features multiplied by the learned weights. This allows the spatial attention module 140 to add importance to different locations or regions within the image data 129-3.” The motivation for this combination is the same as was previously presented.
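As an illustration only of the passage quoted from PALANISAMY ¶ [0107], a spatial context vector formed as a weighted sum of spatial features can be sketched as below. This is a minimal sketch under stated assumptions: the per-region scores are passed in directly as stand-ins for weights that a network would learn, the softmax normalization is an assumption not stated in the quoted passage, and the names `softmax` and `spatial_context_vector` are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1 (assumed)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_context_vector(region_features, region_scores):
    """Weighted sum of per-region feature vectors.

    region_features: one feature vector per image region.
    region_scores:   one raw score per region (stand-in for learned weights).
    Returns the context vector and the normalized weights.
    """
    if len(region_features) != len(region_scores):
        raise ValueError("one score is required per region")
    weights = softmax(region_scores)
    dim = len(region_features[0])
    context = [0.0] * dim
    for weight, feature in zip(weights, region_features):
        for i in range(dim):
            context[i] += weight * feature[i]
    return context, weights
```

Regions with higher scores contribute more to the resulting vector, which corresponds to the quoted description of adding importance to different locations within the image data.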
Claim 17 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim  of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 17
U.S. Patent No. 11,244,167
Claim 18
The computer-implemented method of claim 16, further comprising generating textual-context vectors representing transcript text corresponding to the video segment.
generating textual-context vectors representing transcript text corresponding to the video segment;


Claim 18 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 16 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
U.S. Patent No. 11,244,167
Claim 18
The computer-implemented method of claim 17, wherein generating the query context vector comprises: 
Claim 16
The system of claim 10, wherein the at least one server is further configured to cause the system to generate the query-context vector utilizing the posterior layers from the query-response-neural network by: 

generating, utilizing the spatial attention mechanism, a precursor query-context vector based on a combination of the visual-context vectors and the query vector; and 
generating a precursor query-context vector based on a subset of visual-context vectors for a video frame of the video segment and the query vector by utilizing a spatial-attention mechanism; and

combining the precursor query-context vector and at least one of the textual-context vectors utilizing a neural network.
generating the query-context vector based on the precursor query-context vector and a textual-context vector for the video frame from the textual-context vectors utilizing a recurrent neural network.
Incorporated from parent claim 16
extracting a query vector from a question corresponding to a video segment; 
Incorporated from parent claim 10
extract a query vector from a question corresponding to a video segment of the video utilizing question-network layers from the query-response-neural network;
generating visual-context vectors representing visual features corresponding to the video segment; 
extract multiple contextual modalities from the video segment by: generating visual-context vectors representing visual features corresponding to the video segment by utilizing visual-feature layers from the query-response-neural network; 
generating a query-context vector by combining the query vector and the visual-context vectors utilizing a spatial attention mechanism to spatially weight one or more of the visual features corresponding to the video segment; 
generating textual-context vectors representing transcript text corresponding to the video segment by utilizing transcript layers from the query-response-neural network; generate a query-context vector based on the query vector, the visual-context vectors, and the textual-context vectors by utilizing posterior layers from the query-response-neural network;
generating candidate-response vectors representing candidate responses to the question; and 
generate candidate-response vectors representing candidate responses to the question utilizing response-network layers from the query-response-neural network;
selecting a response from the candidate responses by comparing the query-context vector to the candidate-response vectors.
select a response from the candidate responses based on a comparison of the query-context vector to the candidate-response vectors.


Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 15 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
Claim 19
U.S. Patent No. 11,244,167
Claim 15
The computer-implemented method of claim 17, wherein generating the query context vector comprises: 
The system of claim 10, wherein the at least one server is further configured to cause the system to generate the query-context vector utilizing the posterior layers from the query-response-neural network by: 
generating, utilizing a neural network, hidden-feature vectors based on at least one of the textual-context vectors; 
generating a hidden-feature vector based on the textual-context vectors utilizing a recurrent neural network;
generating, utilizing a temporal attention mechanism, a precursor query-context vector based on a combination of the hidden-feature vectors and the query vector; and 
generating a precursor query-context vector based on the query vector and the hidden-feature vector utilizing a temporal-attention mechanism; and
combining the precursor query-context vector and at least one of the visual-context vectors utilizing the spatial attention mechanism.
generating the query-context vector based on the precursor query-context vector and the visual-context vectors utilizing a spatial-attention mechanism


Claim 20 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 11,244,167 in view of PALANISAMY et al (U.S. PG Pub. No. 2020/0139973). Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application
U.S. Patent No. 11,244,167
Claim 20
The computer-implemented method of claim 16, further comprising: 
Incorporated from parent claim 12
The system of claim 10, wherein the at least one server is further configured to cause the system to generate a visual-context vector of the visual-context vectors by: extracting textual-feature embeddings from inner objects that correspond to detected outer objects; 
detecting, utilizing a detection neural network, an object portrayed within the video segment; 

extracting, utilizing a graphical-object-matching engine, a textual-feature embedding based on textual elements inside the object; 
generating training-sample-textual-feature embeddings representing visual-feature categories for training-sample objects visible within videos;

comparing the textual-feature embedding with feature embeddings of training-sample objects by generating similarity scores indicating a similarity between the textual-feature embedding and a particular feature embedding associated with a training-sample object; and 
comparing the textual-feature embeddings with the training-sample-textual-feature embeddings; 

Claim 13
compare the textual-feature embeddings with the training-sample-textual-feature embeddings by generating similarity scores indicating a similarity between particular textual-feature embeddings and particular training-sample-textual-feature embeddings; and 

generating, based on the similarity scores, the visual-context vectors indicating a visual feature category for the object.
identify the visual-feature category from among the visual-feature categories for the textual-feature embedding based on the similarity scores.

Incorporated from parent claim 12
based on comparing the textual-feature embeddings with the training-sample-textual-feature embeddings, generating the visual-context vector indicating a visual-feature category from among the visual-feature categories for a textual-feature embedding from the textual-feature embeddings.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID F DUNPHY whose telephone number is (571)270-1230. The examiner can normally be reached 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DAVID F DUNPHY/Primary Examiner, Art Unit 2668