DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.


Information Disclosure Statement
The references listed in the Information Disclosure Statement filed on May 25, 2021 have been considered by the examiner (see attached PTO-1449 form).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-5, 7-14, 17, 19 and 21-24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 1, 17 and 19 each recite the limitation “the user date” in lines 8, 10 and 10, respectively.  There is insufficient antecedent basis for this limitation in the claims.  It is unclear how “the user date” relates to the rest of the claim language.  Since claims 2-14 and 21-24 depend directly or indirectly on claims 1, 17 and 19, they inherit the same problem.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 6, 8-10, 13, 14, 17, 22 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Wexler (U.S. Pub. No. 2015/0193540) in view of Germano et al. (U.S. Patent No. 11,153,655).
Regarding claim 1, Wexler discloses a video method, performed by a computer device, the method comprising (see paragraph 0093):
inputting a content item to a first feature extraction network (see paragraph 0056 and fig. 3 (302, 304));
inputting user data of a user to a second feature extraction network (see paragraphs 0032-0033, fig. 2 (202, 204, 206));
performing user feature extraction on the user data with the second feature extraction network to generate a user feature of the user, the user date being discrete (see paragraphs 0032-0033, fig. 2 (202, 204, 206); user demographic data (e.g., age, gender, date of birth, birthplace, address, etc.) is unique and distinct to a particular user);
performing first feature fusion based at least on the content item feature and the user feature to obtain a first commonality used to generate a prioritized list of articles for the user (see paragraphs 0060-0061 and fig. 4 (402); intersection logic 402 analyzes the commonality between the consolidated user features and the consolidated content features in order to identify features common to both sets).
However, Wexler is silent as to performing video feature extraction on at least one consecutive video frame in the video with the first feature extraction network to generate a video feature of the video; and determining, according to the first recommendation probability, whether to recommend the video to the user.
Germano et al. discloses performing video feature extraction on at least one consecutive video frame in the video with the first feature extraction network to generate a video feature of the video (see col. 2, lines 8-23; video feature extracted from frames 112 of the video); and 
determining, according to the first recommendation probability, whether to recommend the video to the user (see col. 2, lines 24-34; comparison of the features of video title 110 and the features of video title 120 yields a high enough degree of similarity to confidently recommend video title 120 to viewer as another action film in which they might have an interest).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the system of Wexler with the teachings of Germano et al., the motivation being to predict videos that are appealing to a user.

Regarding claim 17, claim 17 is rejected for the same reason set forth in the rejection of claim 1.

Regarding claim 19, claim 19 is rejected for the same reason set forth in the rejection of claim 1.

Regarding claims 5, 22 and 24, Wexler and Germano et al. disclose everything claimed as applied above (see claims 1 and 17).  Wexler discloses wherein performing the user feature extraction on the user data with the second feature extraction network comprises:
performing general linear combination on the user data by using a wide component in the second feature extraction network to obtain a wide feature of the user (see paragraphs 0037, 0065-0066 and figs. 2 and 4);
performing embedding and third convolution on the user data by using a deep component in the second feature extraction network to obtain a deep feature of the user (see paragraphs 0037, 0065-0066 and figs. 2 and 4); and
performing third feature fusion on the wide feature of the user and the deep feature of the user to obtain the user feature of the user (see paragraphs 0037, 0065-0066 and figs. 2 and 4).

Regarding claim 6, Wexler and Germano et al. disclose everything claimed as applied above (see claim 5).  Wexler discloses wherein performing the third feature fusion on the wide feature of the user and the deep feature of the user to obtain the user feature of the user comprises:
cascading the wide feature of the user and the deep feature of the user by using a fully-connected layer to obtain the user feature of the user (see paragraphs 0051, 0060, 0065-0066, figs. 2 and 4).

Regarding claim 8, Wexler and Germano et al. disclose everything claimed as applied above (see claim 1).  Germano et al. discloses wherein the method further comprises:
inputting at least one text corresponding to the video to a third feature extraction network (see col. 2, lines 8-34 and fig. 1 (114)); and
performing text feature extraction on the at least one text with the third feature extraction network to generate a text feature of the video, the at least one text being discrete (see col. 2, lines 8-34 and fig. 1 (114)).

Regarding claim 9, Wexler and Germano et al. disclose everything claimed as applied above (see claim 8).  Germano et al. discloses wherein performing the text feature extraction on the at least one text with the third feature extraction network comprises:
performing general linear combination on the at least one text by using a wide component in the third feature extraction network to obtain a wide feature of the at least one text (see col. 2, lines 8-34 and fig. 1 (114));
performing embedding and fourth convolution on the at least one text by using a deep component in the third feature extraction network to obtain a deep feature of the at least one text (see col. 2, lines 8-34 and fig. 1 (114)); and
performing fourth feature fusion on the wide feature of the at least one text and the deep feature of the at least one text to obtain the text feature of the video (see col. 2, lines 8-34 and fig. 1 (114)).

Regarding claim 10, Wexler and Germano et al. disclose everything claimed as applied above (see claim 9).  Germano et al. discloses wherein performing the fourth feature fusion on the wide feature of the at least one text and the deep feature of the at least one text to obtain the text feature of the video comprises:
cascading the wide feature of the at least one text and the deep feature of the at least one text by using a fully-connected layer to obtain the text feature of the video (see col. 2, lines 8-34 and fig. 1 (114)).

Regarding claim 13, Wexler and Germano et al. disclose everything claimed as applied above (see claim 1).  Germano et al. discloses wherein determining, according to the first recommendation probability, whether to recommend the video to the user comprises:
determining, when the first recommendation probability is greater than a probability threshold, to recommend the video to the user (see col. 4, lines 32-45, col. 7, lines 34-45); and
determining, when the first recommendation probability is less than or equal to the probability threshold, not to recommend the video to the user (see col. 4, line 64 to col. 5, line 11).
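
For illustration only, the threshold comparison recited in claim 13 can be sketched as follows. The function name and the default threshold of 0.5 are hypothetical; neither is taken from the application or from Germano et al., and the sketch is not offered as either party's implementation.

```python
def should_recommend(probability: float, threshold: float = 0.5) -> bool:
    """Return True only when the probability strictly exceeds the threshold.

    A probability equal to the threshold yields False, mirroring the
    "less than or equal to" branch of the claim language.
    The 0.5 default is an assumed value chosen for illustration.
    """
    return probability > threshold
```

Under this sketch, a probability exactly equal to the threshold falls on the "not to recommend" side of the comparison.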

Regarding claim 14, Wexler and Germano et al. disclose everything claimed as applied above (see claim 1).  Wexler discloses obtaining two or more extra recommendation probabilities respectively for two or more extra videos (see paragraphs 0010-0012 and fig. 4);
obtaining a probability ranking of the two or more extra recommendation probabilities and the first recommendation probability (see paragraphs 0010-0012 and fig. 4); and
determining whether to recommend a certain video according to the ranking (see paragraphs 0061, 0063-0064).
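
For illustration only, the ranking step recited in claim 14 can be sketched as sorting candidate videos by their recommendation probabilities and keeping the highest-ranked entries. The function name, the video identifiers, and the `top_k` cutoff are hypothetical and do not come from the application or from Wexler.

```python
from typing import Dict, List


def rank_videos(probabilities: Dict[str, float], top_k: int = 1) -> List[str]:
    """Return the identifiers of the top_k videos, highest probability first.

    `probabilities` maps a (hypothetical) video identifier to its
    recommendation probability. The top_k cutoff is an assumed selection
    rule; the claim only requires deciding "according to the ranking".
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return ranked[:top_k]
```

For example, `rank_videos({"a": 0.2, "b": 0.9, "c": 0.5}, top_k=2)` selects videos "b" and "c" in that order.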


Claims 7, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Wexler and Germano et al. as applied to claims 1, 17 and 19 above, and further in view of Kounine et al. (U.S. Pub. No. 2019/0236680).

Regarding claim 7, Wexler and Germano et al. disclose everything claimed as applied above (see claim 1).  However, Wexler and Germano et al. are silent as to wherein performing the first feature fusion based at least on the video feature and the user feature to obtain the first recommendation probability of recommending the video to the user comprises:
performing dot multiplication on the video feature and the user feature to obtain the first recommendation probability of recommending the video to the user.
Kounine et al. discloses wherein performing the first feature fusion based at least on the video feature and the user feature to obtain the first recommendation probability of recommending the video to the user comprises:
performing dot multiplication on the video feature and the user feature to obtain the first recommendation probability of recommending the video to the user (see paragraph 0087).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the system of Wexler and Germano et al. with the teachings of Kounine et al., the motivation being to provide personalized content. 
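
For illustration only, the dot multiplication recited in claim 7 can be sketched as an inner product of two equal-length feature vectors. Mapping the resulting score to a value between 0 and 1 with a logistic (sigmoid) function is an assumption made here so that the output reads as a probability; it is not stated in the claim or attributed to Kounine et al. All names are hypothetical.

```python
import math
from typing import Sequence


def dot_product_probability(video_feature: Sequence[float],
                            user_feature: Sequence[float]) -> float:
    """Dot-multiply two feature vectors and squash the score into (0, 1).

    The sigmoid is an assumed final step, used only so the result can be
    read as a probability.
    """
    if len(video_feature) != len(user_feature):
        raise ValueError("feature vectors must have the same length")
    score = sum(v * u for v, u in zip(video_feature, user_feature))
    return 1.0 / (1.0 + math.exp(-score))
```

Under this sketch, orthogonal or all-zero features give a score of 0 and therefore a probability of 0.5.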

Regarding claim 11, Wexler and Germano et al. disclose everything claimed as applied above (see claim 8).  Germano et al. discloses wherein performing the first feature fusion based at least on the video feature and the user feature to obtain the first recommendation probability of recommending the video to the user comprises:
performing video-user feature fusion on the video feature and the user feature to obtain a first associated feature between the video and the user (see col. 2, lines 8-34 and fig. 1 (114));
performing text-user feature fusion on the text feature and the user feature to obtain a second associated feature between the at least one text and the user (see col. 2, lines 8-34 and fig. 1 (114)).
Kounine et al. discloses performing dot multiplication on the first associated feature and the second associated feature to obtain the first recommendation probability of recommending the video to the user (see paragraph 0087).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the system of Wexler and Germano et al. with the teachings of Kounine et al., the motivation being to provide personalized content. 

Regarding claim 12, Wexler, Germano et al. and Kounine et al. disclose everything claimed as applied above (see claim 11).  Germano et al. discloses wherein performing the video-user feature fusion on the video feature and the user feature to obtain the first associated feature between the video and the user comprises performing video-user bilinear pooling on the video feature and the user feature to obtain the first associated feature between the video and the user (see col. 1, line 60 to col. 2, line 7; col. 2, lines 24-34); and
wherein performing the text-user feature fusion on the text feature and the user feature to obtain the second associated feature between the text and the user comprises performing text-user bilinear pooling on the text feature and the user feature to obtain the second associated feature between the text and the user (see col. 1, line 60 to col. 2, line 7; col. 2, lines 24-34).
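
For illustration only, bilinear pooling of two feature vectors is commonly understood as forming their outer product and flattening it into a single vector, so that every pairwise product of elements is represented. The sketch below shows that basic, unlearned form; it is an assumption about what the claim term covers and is not asserted to be the operation described in the cited passages of Germano et al. The function name is hypothetical.

```python
from typing import List, Sequence


def bilinear_pool(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Return the flattened outer product of two feature vectors.

    The result has len(first) * len(second) entries, ordered row by row:
    first[0]*second[0], first[0]*second[1], ..., first[-1]*second[-1].
    """
    return [x * y for x in first for y in second]
```

For example, pooling a two-element feature with a three-element feature yields a six-element associated feature.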

Claims 2-4, 21 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Wexler and Germano et al. as applied to claims 1, 17 and 19 above, and further in view of Mac et al. (U.S. Pub. No. 2019/0354835).

Regarding claims 2, 21 and 23, Wexler and Germano et al. disclose everything claimed as applied above (see claims 1, 17 and 19).  Germano et al. discloses wherein inputting the video to the first feature extraction network comprises: separately inputting the at least one consecutive video frame in the video to a convolutional neural network in the first feature extraction network (see col. 5, lines 24-36; convolutional neural network),
wherein performing the video feature extraction on the at least one consecutive video frame in the video with the first feature extraction network to generate the video feature of the video comprises:
extracting the video feature of the video through performing first convolution on the at least one consecutive video frame by using the convolutional neural network (see col. 5, lines 24-36; convolutional neural network).
However, Wexler and Germano et al. are silent as to a temporal convolutional network.
Mac et al. discloses a temporal convolutional network (see paragraph 0028; temporal convolutional network).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the system of Wexler and Germano et al. with the teachings of Mac et al., the motivation being to aggregate one or more motion vectors. 

Regarding claim 3, Wexler, Germano et al. and Mac et al. disclose everything claimed as applied above (see claim 2).  Germano et al. discloses wherein performing the first convolution on the at least one consecutive video frame by using the temporal convolutional network and the convolutional neural network to generate the video feature of the video comprises:
performing audio convolution on at least one audio frame in the at least one consecutive video frame using the convolutional neural network to obtain an audio feature of the video (see col. 2, lines 8-23, col. 5, lines 24-36); and
performing second feature fusion on the image feature of the video and the audio feature of the video to obtain the video feature of the video (see col. 2, lines 8-34, col. 5, lines 24-36).
Mac et al. discloses performing causal convolution on at least one image frame in the at least one consecutive video frame using the temporal convolutional network to obtain an image feature of the video (see paragraph 0028).

Regarding claim 4, Wexler, Germano et al. and Mac et al. disclose everything claimed as applied above (see claim 3).  Germano et al. discloses wherein performing the second feature fusion on the image feature and the audio feature to obtain the video feature comprises:
performing bilinear pooling on the image feature and the audio feature to obtain the video feature (see col. 2, lines 8-34, col. 5, lines 24-36).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NNENNA NGOZI EKPO whose telephone number is (571)270-1663. The examiner can normally be reached M-W 10:00am - 6:30pm, TH-F 8:00am - 4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Pendleton, can be reached at 571-272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

NNENNA EKPO
Primary Examiner
Art Unit 2425



/NNENNA N EKPO/
Primary Examiner, Art Unit 2425
May 6, 2022