DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/7/2022 has been entered.
 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8-10, 13, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524) and further in view of Drees (Pub. No. US 20210048785).
Regarding claims 1, 8, 13, and 17, TANG-1 teaches accessing a first set of images of a plurality of images of a scene, wherein the first set of images show the scene during a time period [Para. 32-34; fig. 2 steps 101-102]; generating, by processing the first set of images using a first machine-learning model, one or more attributes (feature representation result) representing observed actions performed in the scene during the time period [Para. 38; fig. 2 step 103; fig. 3 shows a to-be-processed video depicting a sport action. Para. 29 states “The feature representation result is video feature representation in a time scale”. Para. 41 states “in one embodiment, the server may separately input the feature representation result corresponding to each video frame feature sequence into the second neural network model, and then after processing each input feature representation result by using the second neural network model, the server outputs the prediction result corresponding to each feature representation result. Finally, the server may determine the category of the to-be-processed video according to the prediction result”. Para. 42 states “It may be understood that the category of the to-be-processed video may be "sports", "news", "music", "animation", "game", or the like, and is not limited herein”. Therefore, it is clear that the feature representation result includes a feature/attribute representing an observed action]; and predicting, by processing the generated one or more attributes using a second machine-learning model, a category to which the video belongs [Para. 41; fig. 2 step 104].
However, TANG-1 does not explicitly teach the remaining claim limitations.
	YI teaches wherein the first machine-learning model (encoder + first sub CNN) receives the first set of images as input and wherein the one or more attributes represent at least one of observed characteristics (walking behavior) of the scene during the time period [Para. 64 and fig. 3 unit 302]; predicting one or more actions (future walking behavior) that would happen in the scene after the time period based on the content of the previous image [fig. 3 step 306 and Para. 168]; and wherein the first machine-learning model (encoder + first sub CNN) and the second machine-learning model (second sub CNN + decoder) are different learning models [Para. 33, 50, and 51].
	It would have been obvious to one of ordinary skill in the art, before the effective filing date, to include in the video classification system of TANG-1 the ability to predict actions that happen in the scene after the time period in order to quickly determine the walking behavior of a pedestrian, as taught by YI, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
	TANG-1 in view of YI doesn’t explicitly teach that the first and second learning models are different types.
	Drees teaches combining the prediction of two different types of models [Para. 85; fig. 5 steps 510 and 516; fig. 6 and related description].
It would have been obvious to one of ordinary skill in the art, before the effective filing date, to include in the video classification system of TANG-1 in view of YI the ability to predict actions using two different types of models, as taught by Drees, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claim 9, TANG-1 in view of YI teaches all claim limitations above. Furthermore, YI teaches wherein processing the generated one or more attributes (characteristics) using the second machine-learning model comprises correlating the generated one or more attributes with potential predicted actions for the scene after the time period [fig. 2-4 and related description].
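For illustration only, the two-stage arrangement addressed in the rejection of claims 1, 8, 13, and 17 (a first model that turns images of a time period into attributes, and a second model of a different type that turns those attributes into predicted later actions) can be pictured with the minimal sketch below. Nothing in it is taken from TANG-1, YI, or Drees: the function names, the intensity threshold, and the rule table are hypothetical placeholders that stand in for trained models.

```python
from typing import List, Sequence

Image = Sequence[float]  # hypothetical stand-in for the pixel data of one frame


def extract_attributes(images: List[Image]) -> List[str]:
    """Toy 'first model': derive one attribute per frame from mean intensity."""
    attributes = []
    for image in images:
        mean = sum(image) / len(image)
        attributes.append("moving" if mean > 0.5 else "still")
    return attributes


def predict_actions(attributes: List[str]) -> List[str]:
    """Toy 'second model' of a different type: a rule table over attributes."""
    rules = {"moving": "continues walking", "still": "remains standing"}
    # Predict from the attribute observed most often during the time period.
    dominant = max(set(attributes), key=attributes.count)
    return [rules[dominant]]


# Assumed example input: three tiny "frames" of the scene during the time period.
frames = [[0.9, 0.8], [0.7, 0.6], [0.1, 0.2]]
observed = extract_attributes(frames)
predicted = predict_actions(observed)
```

The only point of the sketch is the data flow: the second stage never sees the images, only the attributes produced by the first stage.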
Regarding claim 10, TANG-1 teaches wherein the plurality of images are frames of a video recording of the scene during the time period [Abstract].
Claims 2, 4, 14, 16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524) further in view of Drees (Pub. No. US 20210048785) and further in view of TANG et al. (Pub. No. US 20190138798, hereinafter TANG-2).
Regarding claims 2, 14, and 18, TANG-1 teaches having multiple neural networks in order to predict a scene. However, TANG-1 in view of YI further in view of Drees does not explicitly teach having two neural networks (“second” and “third”) processing the same input for predicting actions and combining the predicted actions in order to determine a composite predicted action.
However, TANG-2 teaches predicting, by processing the first set of images using a third machine-learning model (second CNN of TANG-2), one or more actions that would happen in the scene after the time period [Para. 118, 122, fig. 2 and related description]; and determining one or more composite predicted actions based on the actions predicted by processing the generated one or more attributes using the second machine-learning model (first CNN of TANG-2) and the actions predicted by processing the first set of images using the third machine-learning model [Para. 123-124, fig. 2 and related description].
It would have been obvious to one of ordinary skill in the art, before the effective filing date, to include in the system of TANG-1 in view of YI the ability to predict actions using a third machine-learning model and to combine the predicted actions of the second and third machine-learning models to output a composite predicted action, by replacing the deep learning of YI with the first and second CNN, as taught by TANG-2, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claims 4, 16, and 20, TANG-1 in view of YI further in view of Drees further in view of TANG-2 teaches all claim limitations above. Furthermore, TANG-2 teaches wherein the third machine-learning model was trained using loss minimization (error minimization) on a set of training images and corresponding potential predicted actions [Para. 121-122].

Claims 3, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524) further in view of Drees (Pub. No. US 20210048785), further in view of TANG et al. (Pub. No. US 20190138798, hereinafter TANG-2) and further in view of Ma et al. (Pub. No. US 20200099790).
Regarding claims 3, 15, and 19, TANG-1 in view of YI further in view of Drees further in view of TANG-2 teaches all claim limitations above. Furthermore, TANG-2 teaches assigning/determining a score/value for each action predicted by processing the generated one or more attributes using the second machine-learning model [Para. 123-124, fig. 2 and related description]; assigning a score for each action predicted by processing the first set of images using the third machine-learning model [Para. 118, 122, fig. 2 and related description].
However, TANG-1 in view of YI further in view of Drees further in view of TANG-2 doesn’t explicitly teach determining a weighted average of the assigned scores for each action for the scene.
Ma teaches determining a weighted average of the assigned/determined scores for each human predicted expertise/action [Para. 3-5, claim 1 and related description].
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify TANG-1 in view of YI further in view of Drees further in view of TANG-2 to determine a weighted average of the assigned scores, as taught by Ma, because the modification enables the system to determine the right agent in order to save resources.
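For illustration only, the limitation addressed for claims 3, 15, and 19 (a score per action from each of two models, combined by a weighted average) can be sketched as follows. The weights, the action names, and the function `composite_scores` are hypothetical and are not drawn from TANG-2 or Ma; an action missing from one model's output is assumed here to score 0.0.

```python
from typing import Dict


def composite_scores(
    scores_second: Dict[str, float],
    scores_third: Dict[str, float],
    weight_second: float = 0.6,  # assumed weight for the second model
    weight_third: float = 0.4,   # assumed weight for the third model
) -> Dict[str, float]:
    """Weighted average of the per-action scores assigned by two models."""
    total = weight_second + weight_third
    actions = set(scores_second) | set(scores_third)
    return {
        action: (
            weight_second * scores_second.get(action, 0.0)
            + weight_third * scores_third.get(action, 0.0)
        ) / total
        for action in actions
    }


# Assumed example scores for two candidate actions.
second_model = {"cross street": 0.8, "stop": 0.2}
third_model = {"cross street": 0.5, "stop": 0.5}
combined = composite_scores(second_model, third_model)
best_action = max(combined, key=combined.get)
```

With these assumed inputs the composite score for "cross street" is 0.6 × 0.8 + 0.4 × 0.5 = 0.68 and for "stop" is 0.6 × 0.2 + 0.4 × 0.5 = 0.32, so "cross street" is the composite predicted action.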

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524) further in view of Drees (Pub. No. US 20210048785), further in view of FUKUMOTO et al. (Pub. No. US 20190272750, hereinafter “FUKU”) further in view of CHONG et al. (Pub. No. US 20190373264).
Regarding claim 5, TANG-1 in view of YI further in view of Drees does not explicitly teach the claim limitations.
FUKU teaches accessing a set of potential actions for the scene, wherein each action of the set of potential actions is associated with a set of attributes considered predictive of the action occurring [Para. 35-38, 70-77 fig. 1 unit 2c and related description]; and selecting one or more actions from the set of potential actions for the scene after the time period based on the determined probabilities corresponding to the attributes and the actions considered predictive of the action occurring [Para. 35-38, 70-77 fig. 1 unit 2c and related description].
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify TANG-1 in view of YI further in view of Drees to include the claimed feature, as taught by FUKU, because the modification enables the self-driving system to improve its decision-making process in order to reduce accidents and travel time.
TANG-1 in view of YI further in view of Drees further in view of FUKU does not explicitly teach determining a probability that each of the one or more attributes accurately represent the actions performed in the scene observed during the time period.
CHONG teaches determining a probability that each of the one or more attributes accurately represent the actions performed (should be taken) in the scene observed during the time period [Para. 33 “Given all this information, the neural network can output a probability that the high-level features [attribute] represent a particular object or scene. For example, the neural network can output whether an image contains a cat or does not contain a cat”. Para. 34, Para. 36 “In image and video, neural networks have been used for image classification, object localization and detection, image segmentation, and action recognition”].
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify TANG-1 in view of YI further in view of Drees and further in view of FUKU to include the claimed feature, as taught by CHONG, because the modification improves the efficiency of operating a neural network by limiting the number of weights that contribute to the output.
Regarding claim 6, TANG-1 in view of YI further in view of Drees further in view of FUKU and further in view of CHONG teaches all claim limitations as stated above. Furthermore, FUKU teaches wherein the set of potential actions for the scene have been pre-generated by a machine-learning model [Para. 35-38, 70-77, fig. 1 unit 2c and related description. It is clear that the predictions are pre-generated and stored in the database].
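For illustration only, the selection step addressed for claims 5 and 6 (each attribute carries a probability, each potential action is tied to a set of attributes considered predictive of it, and actions are selected from those probabilities) can be sketched as below. The names, the threshold, and the choice of the minimum as the combining rule are all assumptions made for the example; they are not taken from FUKU or CHONG.

```python
from typing import Dict, List


def select_actions(
    attribute_probs: Dict[str, float],
    potential_actions: Dict[str, List[str]],
    threshold: float = 0.5,  # assumed cut-off, not from any cited reference
) -> List[str]:
    """Select actions whose predictive attributes are all sufficiently probable."""
    selected = []
    for action, predictive_attributes in potential_actions.items():
        # Assumed rule: an action is only as well supported as its weakest attribute.
        support = min(
            (attribute_probs.get(name, 0.0) for name in predictive_attributes),
            default=0.0,
        )
        if support >= threshold:
            selected.append(action)
    return selected


# Assumed probabilities that each attribute accurately represents the observed scene.
attribute_probs = {"approaching curb": 0.9, "looking left": 0.7, "sitting": 0.1}
# Assumed pre-generated set of potential actions and their predictive attributes.
potential_actions = {
    "cross street": ["approaching curb", "looking left"],
    "stay seated": ["sitting"],
}
chosen = select_actions(attribute_probs, potential_actions)
```

Here "cross street" is supported at 0.7 and is selected, while "stay seated" is supported at only 0.1 and is not.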

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524) further in view of Drees (Pub. No. US 20210048785) and further in view of FUKUMOTO et al. (Pub. No. US 20190272750, hereinafter “FUKU”).
Regarding claim 7, TANG-1 in view of YI further in view of Drees does not explicitly teach the claim limitations.
However, FUKU teaches accessing a second set of images of the plurality of the images of the scene, wherein the second set of images show the scene before the time period [Para. 35-38, 70-77, fig. 1 unit 5, 1b, 2c and related description]; generating, by processing the second set of images using the first machine-learning model, one or more attributes representing observed actions performed in the scene before the time period [Para. 35-38, 70-77, fig. 1 unit 5, 1b, 2c and related description]; and wherein predicting one or more actions that would happen in the scene after the time period is performed by processing the generated one or more attributes representing observed actions performed in the scene before the time period and the generated one or more attributes representing observed actions performed in the scene during the time period [Para. 35-38, 70-77, fig. 1 unit 5, 1b, 2c and related description].
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify TANG-1 in view of YI further in view of Drees to include the claimed feature, as taught by FUKU, because the modification improves the efficiency of operating a neural network by limiting the number of weights that contribute to the output.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over TANG et al. (US 20190384985, hereinafter TANG-1) in view of YI et al. (Pub. No. US 20190073524), further in view of Drees (Pub. No. US 20210048785), and further in view of Mei et al. (Pub. No. US 20180285689).
Regarding claim 11, TANG-1 in view of YI further in view of Drees does not explicitly teach the claim limitation.
 Mei teaches predicting a label for the scene after the time period based on the generated attributes representing actions performed in the scene during the time period [Para. 2; Para. 17 “By sharing memory, each modality may not only possess its own property but may also possess the attributes of other modalities, and thus becomes more discriminative to distinguish pixels and more accurate predict scene labels”; Para. 31 “A prediction label 450 may be applied and then the final prediction 370 may be output”].
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify TANG-1 in view of YI further in view of Drees to include the claimed feature, as taught by Mei, because the modification enables the system to predict a scene more accurately.




Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SOLOMON G BEZUAYEHU whose telephone number is (571)270-7452.  The examiner can normally be reached Monday-Friday, 10 AM-7 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at 571-270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-0101 (IN USA OR CANADA) or 571-272-1000.

/SOLOMON G BEZUAYEHU/           Primary Examiner, Art Unit 2666