DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
All amendments to the claims as filed on 8/25/2022 have been entered and the action follows:

Response to Arguments
Applicant’s arguments with respect to claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-12 and 14-19 are rejected under 35 U.S.C. 103 as being unpatentable over Koivisto et al (US Pub. 2019/0258878) in view of Early intent prediction of Vulnerable road users from visual attributes using multi-task learning network, by Saleh (an IDS document) and Shet et al (US Pub. 2010/0008540).
With respect to claim 1, Koivisto discloses a method for tracking a vulnerable road user (VRU) regardless of occlusion, the method comprising (see Abstract and paragraph 0002, operator safety for autonomous vehicles):
capturing a series of images comprising a plurality of human VRUs including the VRU, the VRU at least partially occluded in at least some images of the series of images; inputting each of the images into a detection model, (see paragraph 0053, wherein an object detector “a detection model” analyzes image data to detect objects in captured sensor data; also see paragraph 0177 for training the object detector using the occlusion data; paragraph 0066, …objects may be occluded from one or multiple sides…; and paragraph 0002, objects – such as vehicles, people “humans”… “a plurality of human VRUs including the VRU”);
receiving a bounding box for each of the series of images of the VRU as output from the detection model, (see paragraph 0063, detected object region may be a bounding box around the detected object);
[inputting each bounding box into a multi-task model;
receiving as output from the multi-task model an embedding for each bounding box, the embedding produced from a shared layer of the multi-task model, the multi-task model comprising the shared layer and a plurality of branches each trained to predict a different activity; and
determining, using the embeddings for each bounding box across the series of images, an indication of which of the embeddings correspond to the VRU as opposed to a different VRU of the plurality of human VRUs despite the partial occlusion of the VRU], (see paragraph 0078 for obtaining the confidence score for the objects detected in the image, after which the object is tracked by the object tracker, see figure 1B, 114), wherein determining the indication of which of the embeddings correspond to the VRU comprises:
inputting each embedding into an unsupervised learning model; and receiving as output, from the unsupervised learning model, an indication of a cluster of embeddings to which each embedding corresponds, each cluster corresponding to a different VRU, (see paragraph 0053, and paragraph 0043 for the unsupervised learning model; also see paragraph 0072, wherein various objects are detected, …classes may include, without limitation, cars, motorcycles, pedestrians and/or cyclists… “each cluster corresponding to a different VRU”), as claimed.
However, Koivisto fails to explicitly disclose inputting each bounding box into a multi-task model; receiving as output from the multi-task model an embedding for each bounding box, the embedding produced from a shared layer of the multi-task model, the multi-task model comprising the shared layer and a plurality of branches each trained to predict a different activity; and determining, using the embeddings for each bounding box across the series of images, an indication of which of the embeddings correspond to the VRU as opposed to a different VRU of the plurality of human VRUs despite the partial occlusion of the VRU, as claimed.
Saleh, in the same field, teaches inputting each bounding box into a multi-task model;
receiving as output from the multi-task model an embedding for each bounding box, the embedding produced from a shared layer of the multi-task model, the multi-task model comprising the shared layer and a plurality of branches each trained to predict a different activity; and determining, using the embeddings for each bounding box across the series of images, an indication of which of the embeddings correspond to the VRU as opposed to a different VRU of the plurality of human VRUs despite the partial occlusion of the VRU, (see figure 3; page 3367, right hand column, last five lines, …given an input bounding box…we utilize a multi-task learning based on CNN “multi-task model”…jointly reason about the two classification tasks (head and body posture) “an embedding” simultaneously; also page 3369, left hand column, …lower layers are shared “a shared layer of the multi-task model”…and top layers…has specific unique outputs… “a plurality of branches each trained to predict a different activity”) as claimed.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the two references, as they are analogous art addressing a similar problem of object detection in an image using image analysis. The teaching of Saleh to use a multi-task learning network can be incorporated into Koivisto’s system, as it uses a neural network for the object detection (see figure 1) for suggestion, and modification of the system yields one joint network rather than two networks for each task (see Saleh page 3368, left hand column, first four lines) for motivation.
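
For illustration only, the multi-task arrangement discussed above (a shared layer whose output serves as the embedding, followed by task-specific branches) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the network of Saleh or of the application: the class name MultiTaskModel, the layer sizes, the random untrained weights, and the 8-value feature vector standing in for a cropped bounding box are all hypothetical. The branch names follow the head and body posture tasks cited from Saleh.

```python
import random

def make_linear(n_in, n_out, rng):
    # Weight matrix with n_out rows and n_in columns (random, untrained).
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def apply_linear(weights, x):
    # Matrix-vector product: one output value per weight row.
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

class MultiTaskModel:
    """Hypothetical model: one shared layer feeding several task branches."""

    def __init__(self, n_in, n_embed, branch_sizes, seed=0):
        rng = random.Random(seed)
        self.shared = make_linear(n_in, n_embed, rng)
        self.branches = {
            name: make_linear(n_embed, n_out, rng)
            for name, n_out in branch_sizes.items()
        }

    def forward(self, crop_features):
        # The shared-layer activation is taken as the embedding.
        embedding = relu(apply_linear(self.shared, crop_features))
        # Every branch reads the same embedding and emits its own scores.
        logits = {
            name: apply_linear(weights, embedding)
            for name, weights in self.branches.items()
        }
        return embedding, logits

# Assumed input: 8 features standing in for one cropped bounding box.
model = MultiTaskModel(n_in=8, n_embed=4,
                       branch_sizes={"head_pose": 3, "body_pose": 3})
embedding, logits = model.forward([0.1 * i for i in range(8)])
```

In this sketch the embedding is returned alongside the per-branch outputs, so a downstream step can consume the embedding without using any branch prediction.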

Koivisto and Saleh fail to disclose an indication of which of the embeddings correspond to the VRU as opposed to a different VRU of the plurality of human VRUs despite the partial occlusion, (emphasis added), as claimed.  
Shet, in a method for object detection, teaches an indication of which of the embeddings correspond to the VRU as opposed to a different VRU of the plurality of human VRUs despite the partial occlusion, (see figure 4, query for the presence of a human…; based on the evaluation of the evidence or lack thereof, decide whether a human is or is not in the region), as claimed.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the references, as they are analogous art addressing a similar problem of object detection in an image using image analysis. The teaching of Shet to find the presence of the person in the region of interest in the image can be incorporated into Koivisto’s system, as it uses a neural network for the object detection (see figure 1) for suggestion, and modification of the system yields a system to detect the presence of a person (see paragraph 0003).
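
For illustration only, the limitation of claim 1 in which each embedding is assigned to a cluster, with each cluster corresponding to a different VRU, can be sketched as a simple unsupervised grouping. This is a minimal sketch under stated assumptions and is not the method of any cited reference or of the application: the function names, the greedy cosine-similarity rule, the 0.9 threshold, and the two-dimensional sample embeddings are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 if either has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_embeddings(embeddings, threshold=0.9):
    """Greedy clustering: join the most similar cluster, else start a new one."""
    centroids = []  # running mean embedding per cluster
    counts = []     # number of members per cluster
    labels = []     # cluster index assigned to each input embedding
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No cluster is similar enough: treat this as a new VRU.
            centroids.append(list(emb))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Update the running mean of the matched cluster.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], emb)]
            counts[best] = n + 1
        labels.append(best)
    return labels

# Assumed toy embeddings for five bounding boxes across a series of images.
sample = [[1.0, 0.0], [0.98, 0.05], [0.0, 1.0], [0.02, 0.99], [0.97, 0.1]]
labels = cluster_embeddings(sample)
```

Here the first, second, and fifth embeddings fall into one cluster and the third and fourth into another, so a VRU that reappears after an intervening detection keeps the same cluster index.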

With respect to claim 2, Koivisto further discloses wherein capturing the series of images comprises receiving images captured by a camera installed on a vehicle, and wherein auxiliary data captured by sensors installed on the vehicle is received with the images, (see figure 15B, car with cameras/sensors; auxiliary data are captured by sensors such as LIDAR, see figure 15C, 1564) as claimed.

With respect to claim 3, Koivisto further discloses receiving coordinates in the context of each of the series of images of each bounding box, wherein determining the indication of which of the embeddings correspond to the VRU comprises using the coordinates in addition to the embeddings, (see paragraph 0041 for the object features detected within the bounding box and paragraph 0043 for the training of the CNN “model” for the detection of the object) as claimed.

With respect to claim 4, Koivisto further discloses wherein determining the indication of which of the embeddings correspond to the VRU further comprises using the auxiliary data, (see paragraph 0041 for the LIDAR sensor data used to find the location of the objects), as claimed.

With respect to claim 5, Koivisto further discloses wherein each embedding acts as a fingerprint that tracks the VRU without assigning an identity to the VRU, (see paragraph 0053, wherein the detected object is tracked through the frames or images), as claimed.

With respect to claim 7, Koivisto further discloses wherein determining the indication of which of the embeddings correspond to the VRU further comprises receiving, as part of the output, a confidence score corresponding to a confidence that each given embedding corresponds to its indicated cluster, (see paragraph 0053), as claimed.

Claims 8-12, 14 and 15-19 are rejected for the same reasons as set forth in the rejections of claims 1-5, and 7, because claims 8-12, 14 and 15-19 are claiming subject matter of similar scope as claimed in claims 1-5 and 7.  

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
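
For illustration only, the reply-period rule stated above can be worked through as date arithmetic. This is a minimal sketch under stated assumptions and is not an official calculator: the function names and the sample dates are hypothetical, month-end dates are clamped to the last day of the target month, and weekends, holidays, and extension fees are not handled.

```python
import calendar
from datetime import date

def add_months(d, months):
    # Add calendar months, clamping to the last day of the target month.
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def reply_period_end(mailing, first_reply=None, advisory_mailed=None):
    """End of the shortened statutory period under the rule quoted above."""
    two_months = add_months(mailing, 2)
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    replied_early = first_reply is not None and first_reply <= two_months
    advisory_late = (advisory_mailed is not None
                     and advisory_mailed > three_months)
    if replied_early and advisory_late:
        # Period runs to the advisory action date, never past six months.
        return min(advisory_mailed, six_months)
    return three_months

# Hypothetical mailing date, chosen to exercise the month-end clamp.
mailing = date(2022, 11, 30)
default_end = reply_period_end(mailing)
extended_end = reply_period_end(mailing,
                                first_reply=date(2023, 1, 15),
                                advisory_mailed=date(2023, 3, 10))
```

With these assumed dates, the default period ends on 2023-02-28, and the early reply followed by a late advisory action moves the end of the period to 2023-03-10.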

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIKKRAM BALI whose telephone number is (571)272-7415. The examiner can normally be reached Monday-Friday 7:00AM-3:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VIKKRAM BALI/Primary Examiner, Art Unit 2663