DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1-20, as originally filed, are currently pending and have been considered below.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4, 6-11, 13-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu, Yuwei, et al. "Robust discriminative tracking via landmark-based label propagation." IEEE Transactions on Image Processing 24.5 (2015): 1510-1523, hereinafter, “Wu”, and further in view of Zhang, Peng, et al. "Online tracking based on efficient transductive learning with sample matching costs." Neurocomputing 175 (2016): 166-176, hereinafter, “Zhang”.

As per claim 1, Wu discloses a method for training a detector model of a detector system, the method comprising the steps of: 
obtaining a first training set that includes images having pixels that form one or more objects, the one or more objects being annotated with a known object location and a known class label (Wu, Abstract, a limited amount of initial labels; Wu, page 1512, III. Landmark-Based Label Propagation, we have l labeled samples ... and u unlabeled samples ... where xi and yi ... is the label vector ... classification task to separate the object from its surrounding background … known labels Yl); 
training the detector model using the first training set and a first loss function, the first loss function expresses a difference between the known object location and the known class label for the one or more objects and a predicted object location and a predicted class label for the one or more objects as predicted by the detector model (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels … fl ... is the sub-matrix corresponding to the labeled samples; Wu, page 1515, A. Object Representation, used to train a single classifier);
label propagating a second training set by the detector model after the detector model is trained with the first training set, the second training set includes images having pixels that form one or more objects, the images of the second training set are sequentially associated with at least one image of the first training set (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) … After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation model; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames); and 
training the detector model using the first training set, the second training set, and the first loss function, wherein the detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set, the instance identifier expressing a temporal consistency of the one or more objects along a temporal axis (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels; Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames).
Wu does not explicitly disclose the following limitations as further recited; however, Zhang discloses:
training the detector model using the first loss function and a discriminative loss function, wherein the detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set using the discriminative loss function, wherein the detector model is trained through an intermediate multidimensional feature predicted at each pixel location of the one or more objects of the first training set and the second training set, the intermediate multidimensional feature being the instance identifier (Zhang, page 167, 3. Proposed large margin learning with sample matching costs, there are limited labeled data pairs ... and sufficient unlabeled ones ... By extending the semi-supervised learning ... to learn the optimal hyperplane and the labels for unlabeled instances simultaneously ... C1 and C2 are the regularization parameters that balance the computational cost and the empirical error on labeled as well as unlabeled data. The h(·) and h|·| denote the hinge loss for labeled data and symmetric hinge loss for unlabeled data; Zhang, page 169, 4. Tracking framework, We build our tracking framework by referring to the tracking-learning-detection scheme, which is shown in Fig. 1. In our framework, the online tracking consists of an appearance model and a motion model. After initial location of the target is specified, motion model will be used to generate the consistent feature (e.g., brightness, SIFT or SURF) of a pixel or patch between consecutive frames ... Initialization: During the stage of initialization, the target object is represented using HoG Features obtained from a normalized image patch (normally using 32 x 32 bounding box) based on the manually ground truth).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wu to include the loss function for the unlabeled data as taught by Zhang in order to balance the computational cost while evaluating the model on labeled as well as unlabeled data (Zhang, page 168, 3. Proposed large margin learning with sample matching costs).
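For reference, the two loss terms named in the cited Zhang passage (the hinge loss for labeled data and the symmetric hinge loss for unlabeled data, weighted by C1 and C2) can be sketched using their standard textbook definitions. The following Python sketch is illustrative only. The function names, the label convention y in {-1, +1}, and the omission of the regularization term on the hyperplane are assumptions made here and are not taken from Zhang.

```python
def hinge_loss(label, score):
    """Standard hinge loss for a labeled sample: max(0, 1 - y * f(x))."""
    return max(0.0, 1.0 - label * score)


def symmetric_hinge_loss(score):
    """Standard symmetric hinge loss for an unlabeled sample: max(0, 1 - |f(x)|)."""
    return max(0.0, 1.0 - abs(score))


def empirical_error(labeled, unlabeled_scores, c1, c2):
    """Weighted sum of the labeled and unlabeled error terms.

    labeled: list of (label, score) pairs with label in {-1, +1} (assumed convention).
    unlabeled_scores: list of classifier scores for unlabeled samples.
    c1, c2: weights on the labeled and unlabeled terms, respectively.
    """
    labeled_term = sum(hinge_loss(y, s) for y, s in labeled)
    unlabeled_term = sum(symmetric_hinge_loss(s) for s in unlabeled_scores)
    return c1 * labeled_term + c2 * unlabeled_term
```

Under these definitions, an unlabeled sample is penalized only when its score falls inside the margin on either side of the decision boundary, which is what allows unlabeled data to contribute to training without a known label.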

As per claim 2, Wu and Zhang disclose the method of claim 1, wherein, after the detector model is trained with the first training set, the second training set, the first loss function, and the discriminative loss function, the detector model outputs, for a detected object within an input image, a detected object location, a detected class label, and a detected instance identifier indicating a consistency of the detected object along the temporal axis (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable); Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames;  Zhang, page 169, 3. Proposed large margin learning with sample matching costs, matching cost function EΦ is defined for grid cell correspondence between data samples, which is a multi-scale extension of descriptor-wise matching with a pyramid graph model ... Let d(p) be the feature descriptor extracted at location p and n be the total number of the descriptors, and z denote correspondence vector obtained by descriptor matching at point p [Equations 8 and 9] ... where d1(p) and d2(p) denote the descriptors of pixel p).

As per claim 3, Wu and Zhang disclose the method of claim 2, further comprising the step of outputting the instance identifier to an object tracking system (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).

As per claim 4, Wu and Zhang disclose the method of claim 3, further comprising the step of determining by the object tracking model system an instance similarity based on the instance identifier (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).
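The cited Wu passage describes scoring each candidate in a new frame and comparing the score of the located object against a predefined threshold to decide whether the new samples are treated as labeled or unlabeled. A minimal Python sketch of that decision follows. Selecting the located object as the single highest-scoring candidate is a simplifying assumption made here (Wu's Bayesian state inference is not reproduced), and the function and variable names are hypothetical.

```python
def select_and_update(candidate_scores, threshold):
    """Pick the best-scoring candidate and decide how new samples are treated.

    candidate_scores: classification score of each candidate in the new frame.
    threshold: predefined reliability threshold.
    Returns (index of the selected candidate, "labeled" or "unlabeled").
    """
    if not candidate_scores:
        raise ValueError("at least one candidate is required")
    # Assumption: the located object is the candidate with the highest score.
    best_index = max(range(len(candidate_scores)), key=lambda i: candidate_scores[i])
    # A score above the threshold is treated as a reliable tracking result.
    reliable = candidate_scores[best_index] > threshold
    return best_index, ("labeled" if reliable else "unlabeled")
```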

As per claim 6, Wu and Zhang disclose the method of claim 1, wherein the intermediate multidimensional feature is one of an eight-dimensional feature vector or a twelve-dimensional feature vector (Wu, page 1517, A. Experimental Setup, 144 dimensional gray scale feature and 128 dimensional HOG feature are extracted from each image patch, and they are concatenated into a single feature vector).
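The feature construction quoted from Wu (a 144 dimensional gray scale feature and a 128 dimensional HOG feature concatenated into a single feature vector) can be sketched as follows. The Python sketch uses placeholder values and hypothetical names; only the two dimensions are taken from the cited passage.

```python
GRAY_DIM = 144  # gray scale feature length, per the cited passage
HOG_DIM = 128   # HOG feature length, per the cited passage


def concatenate_features(gray, hog):
    """Concatenate the two per-patch features into a single feature vector."""
    if len(gray) != GRAY_DIM or len(hog) != HOG_DIM:
        raise ValueError("unexpected feature length")
    return list(gray) + list(hog)


# Placeholder values only; real features would be extracted from an image patch.
feature = concatenate_features([0.0] * GRAY_DIM, [0.0] * HOG_DIM)
```

The concatenated vector has 144 + 128 = 272 entries per image patch.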

As per claim 7, Wu and Zhang disclose the method of claim 1, wherein the detector model is trained in a semi-supervised manner (Wu, page 1512, III. Landmark-Based Label Propagation, the objective function of semi-supervised learning).
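As general background on semi-supervised label propagation, the following Python sketch propagates soft labels from labeled to unlabeled samples over a similarity graph. It is a generic, textbook-style iterative scheme with labeled samples held fixed, and it is not Wu's landmark-based formulation (Eq. (2) of Wu is not reproduced here). The function name, the weight matrix, and the label convention are assumptions made for illustration.

```python
def propagate_labels(weights, labels, iterations=50):
    """Generic iterative label propagation over a similarity graph.

    weights: n x n matrix (list of lists) of nonnegative similarities.
    labels: length-n list with +1.0 or -1.0 for labeled samples, None for unlabeled.
    Returns a list of soft labels, one per sample.
    """
    n = len(weights)
    soft = [0.0 if y is None else float(y) for y in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                # Labeled samples keep their known label.
                updated.append(float(labels[i]))
                continue
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total == 0:
                # An isolated sample keeps its current soft label.
                updated.append(soft[i])
                continue
            # Unlabeled samples take the similarity-weighted mean of their neighbors.
            weighted = sum(weights[i][j] * soft[j] for j in range(n) if j != i)
            updated.append(weighted / total)
        soft = updated
    return soft
```

For example, an unlabeled sample with similarity 3 to a positive sample and similarity 1 to a negative sample receives the soft label (3 - 1) / 4 = 0.5, and the sign of the soft label can then serve as the predicted class.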

As per claim 8, Wu discloses a detector system having a detector model, the detector system comprising: 
one or more processors; and a memory in communication with the one or more processors (Wu, page 1522, G. Computational Complexity, The proposed approach was implemented in MATLAB on an Intel Core2 2.5 GHz processor with 4GB RAM), the memory having: 
an image acquisition module, the image acquisition module having instructions that, when executed by the one or more processors, cause the one or more processors to obtain a first training set that includes images each having pixels that form one or more objects, the one or more objects being annotated with a known object location and a known class label (Wu, Abstract, a limited amount of initial labels; Wu, page 1512, III. Landmark-Based Label Propagation, we have l labeled samples ... and u unlabeled samples ... where xi and yi ... is the label vector ... classification task to separate the object from its surrounding background … known labels Yl), 
a training module, the training module having instructions that, when executed by the one or more processors, cause the one or more processors to train the detector model using the first training set and a first loss function, the first loss function expresses a difference between the known object location and the known class label for the one or more objects and a predicted object location and a predicted class label for the one or more objects as predicted by the detector model (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels … fl ... is the sub-matrix corresponding to the labeled samples; Wu, page 1515, A. Object Representation, used to train a single classifier), 
a label propagating module, the label propagating module having instructions that, when executed by the one or more processors, cause the one or more processors to label propagate a second training set by the detector model after the detector model is trained with the first training set, the second training set includes images having pixels that form one or more objects, the images of the second training set are sequentially associated with at least one image of the first training set (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) … After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation model; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames), and 
the training module further having instructions that, when executed by the one or more processors, cause the one or more processors to train the detector model using the first training set, the second training set, and the first loss function, wherein the object detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set, the instance identifier expressing a temporal consistency of the one or more objects along a temporal axis (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels; Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames).
Wu does not explicitly disclose the following limitations as further recited; however, Zhang discloses:
train the detector model using the first loss function and a discriminative loss function, wherein the object detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set using the discriminative loss function, wherein the detector model is trained through an intermediate multidimensional feature predicted at each pixel location of the one or more objects of the first training set and the second training set, the intermediate multidimensional feature being the instance identifier (Zhang, page 167, 3. Proposed large margin learning with sample matching costs, there are limited labeled data pairs ... and sufficient unlabeled ones ... By extending the semi-supervised learning ... to learn the optimal hyperplane and the labels for unlabeled instances simultaneously ... C1 and C2 are the regularization parameters that balance the computational cost and the empirical error on labeled as well as unlabeled data. The h(·) and h|·| denote the hinge loss for labeled data and symmetric hinge loss for unlabeled data; Zhang, page 169, 4. Tracking framework, We build our tracking framework by referring to the tracking-learning-detection scheme, which is shown in Fig. 1. In our framework, the online tracking consists of an appearance model and a motion model. After initial location of the target is specified, motion model will be used to generate the consistent feature (e.g., brightness, SIFT or SURF) of a pixel or patch between consecutive frames ... Initialization: During the stage of initialization, the target object is represented using HoG Features obtained from a normalized image patch (normally using 32 x 32 bounding box) based on the manually ground truth).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wu to include the loss function for the unlabeled data as taught by Zhang in order to balance the computational cost while evaluating the model on labeled as well as unlabeled data (Zhang, page 168, 3. Proposed large margin learning with sample matching costs).

As per claim 9, Wu and Zhang disclose the system of claim 8, wherein, after the detector model is trained with the first training set, the second training set, the first loss function, and the discriminative loss function, the detector model is configured to output, for a detected object within an input image, a detected object location, a detected class label, and a detected instance identifier indicating a consistency of the detected object along the temporal axis (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable); Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames;  Zhang, page 169, 3. Proposed large margin learning with sample matching costs, matching cost function EΦ is defined for grid cell correspondence between data samples, which is a multi-scale extension of descriptor-wise matching with a pyramid graph model ... Let d(p) be the feature descriptor extracted at location p and n be the total number of the descriptors, and z denote correspondence vector obtained by descriptor matching at point p [Equations 8 and 9] ... where d1(p) and d2(p) denote the descriptors of pixel p).

As per claim 10, Wu and Zhang disclose the system of claim 9, wherein, after the detector model is trained with the first training set, the second training set, the first loss function, and the discriminative loss function, the detector model is configured to output the instance identifier to an object tracking system (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).

As per claim 11, Wu and Zhang disclose the system of claim 10, further comprising an object tracking system configured to determine an instance similarity based on the instance identifier (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).

As per claim 13, Wu and Zhang disclose the system of claim 8, wherein the intermediate multidimensional feature is one of an eight-dimensional feature vector or a twelve-dimensional feature vector (Wu, page 1517, A. Experimental Setup, 144 dimensional gray scale feature and 128 dimensional HOG feature are extracted from each image patch, and they are concatenated into a single feature vector).

As per claim 14, Wu and Zhang disclose the system of claim 8, wherein the detector model is trained in a semi-supervised manner (Wu, page 1512, III. Landmark-Based Label Propagation, the objective function of semi-supervised learning).

As per claim 15, Wu discloses a non-transitory computer-readable medium storing instructions that, when executed by one or more processors (Wu, page 1522, G. Computational Complexity, The proposed approach was implemented in MATLAB on an Intel Core2 2.5 GHz processor with 4GB RAM), cause the one or more processors to: 
obtain a first training set that includes images having pixels that form one or more objects, the one or more objects being annotated with a known object location and a known class label (Wu, Abstract, a limited amount of initial labels; Wu, page 1512, III. Landmark-Based Label Propagation, we have l labeled samples ... and u unlabeled samples ... where xi and yi ... is the label vector ... classification task to separate the object from its surrounding background … known labels Yl); 
train a detector model using the first training set and a first loss function, the first loss function expresses a difference between the known object location and the known class label for the one or more objects and a predicted object location and a predicted class label for the one or more objects as predicted by the detector model (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels … fl ... is the sub-matrix corresponding to the labeled samples; Wu, page 1515, A. Object Representation, used to train a single classifier); 
label propagate a second training set by the detector model after the detector model is trained with the first training set, the second training set includes images having pixels that form one or more objects, the images of the second training set are sequentially associated with at least one image of the first training set (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) … After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation model; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames); and 
train the detector model using the first training set, the second training set, and the first loss function, wherein the detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set, the instance identifier expressing a temporal consistency of the one or more objects along a temporal axis (Wu, page 1514, C. Solving Label Prediction Matrix A, The second term L(·, ·) in Eq. (13) is an empirical loss function, which requires that the prediction f should be consistent with the known class labels; Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515, A. Object Representation, an object is represented by five image feature vectors inside the object region. The first patch is the entire object. Then the object is partitioned into 2 × 2 subsets which constitute the 4 remaining patches ... image patches corresponding to the same part of all samples construct a sub-sample set ... Each sub-sample set X(τ ) is used to train a single classifier f (τ ) using the label propagation; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool; Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames).
Wu does not explicitly disclose the following limitations as further recited; however, Zhang discloses:
train the detector model using the first loss function and a discriminative loss function, wherein the detector model learns an instance identifier from the known object location of the one or more objects of the first training set and the second training set using the discriminative loss function, wherein the detector model is trained through an intermediate multidimensional feature predicted at each pixel location of the one or more objects of the first training set and the second training set, the intermediate multidimensional feature being the instance identifier (Zhang, page 167, 3. Proposed large margin learning with sample matching costs, there are limited labeled data pairs ... and sufficient unlabeled ones ... By extending the semi-supervised learning ... to learn the optimal hyperplane and the labels for unlabeled instances simultaneously ... C1 and C2 are the regularization parameters that balance the computational cost and the empirical error on labeled as well as unlabeled data. The h(·) and h|·| denote the hinge loss for labeled data and symmetric hinge loss for unlabeled data; Zhang, page 169, 4. Tracking framework, We build our tracking framework by referring to the tracking-learning-detection scheme, which is shown in Fig. 1. In our framework, the online tracking consists of an appearance model and a motion model. After initial location of the target is specified, motion model will be used to generate the consistent feature (e.g., brightness, SIFT or SURF) of a pixel or patch between consecutive frames ... Initialization: During the stage of initialization, the target object is represented using HoG Features obtained from a normalized image patch (normally using 32 x 32 bounding box) based on the manually ground truth).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wu to include the loss function for the unlabeled data as taught by Zhang in order to balance the computational cost against the empirical error on labeled as well as unlabeled data (Zhang, page 168, 3. Proposed large margin learning with sample matching costs).

As per claim 16, Wu and Zhang disclose the non-transitory computer-readable medium of claim 15, wherein, after the detector model is trained with the first training set, the second training set, the first loss function, and the discriminative loss function, the detector model is configured to output, for a detected object within an input image, a detected object location, a detected class label, and a detected instance identifier indicating a consistency of the detected object along the temporal axis (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, pages 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable); Wu, page 1516, D. Bayesian State Inference, the observation set of the object ... which models the temporal correlation of the tracking results in consecutive frames; Zhang, page 169, 3. Proposed large margin learning with sample matching costs, matching cost function EΦ is defined for grid cell correspondence between data samples, which is a multi-scale extension of descriptor-wise matching with a pyramid graph model ... Let d(p) be the feature descriptor extracted at location p and n be the total number of the descriptors, and z denote correspondence vector obtained by descriptor matching at point p [Equations 8 and 9] ... where d1(p) and d2(p) denote the descriptors of pixel p).

As per claim 17, Wu and Zhang disclose the non-transitory computer-readable medium of claim 16, further comprising instructions that, when executed by one or more processors, cause the one or more processors to output the instance identifier to an object tracking system (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).

As per claim 18, Wu and Zhang disclose the non-transitory computer-readable medium of claim 17, further comprising instructions that, when executed by one or more processors, cause the one or more processors to determine, by the object tracking system, an instance similarity based on the instance identifier (Wu, page 1515, D. Soft Label Propagation, Through applying the label propagation model Eq. (2), we are able to predict the soft label for any sample xi (unlabeled training samples) ... After deriving the soft label prediction (i.e., classification) of each sample, the classification score can be utilized as the similarity measure; Wu, page 1515-1516, C. Updating the Samples and Landmarks, For each new frame, candidates predicted by the particle filter are considered as unlabeled samples X. According to Eq. (19), we can get the classification score of each candidate. A candidate with higher classification score indicates that it is more likely to be generated from the target class ... If the classification score of the located object is higher than the predefined threshold E (i.e., the current tracking result is reliable), samples in XC are regarded as labeled ones ... candidates are considered as unlabeled samples and utilized to train the classifier together with collected samples stored in the sample pool).

As per claim 20, Wu and Zhang disclose the non-transitory computer-readable medium of claim 15, wherein the intermediate multidimensional feature is one of an eight-dimensional feature vector or a twelve-dimensional feature vector (Wu, page 1517, A. Experimental Setup, 144 dimensional gray scale feature and 128 dimensional HOG feature are extracted from each image patch, and they are concatenated into a single feature vector).


Claim(s) 5, 12 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu, Yuwei, et al. "Robust discriminative tracking via landmark-based label propagation." IEEE Transactions on Image Processing 24.5 (2015): 1510-1523, hereinafter, “Wu”, in view of Zhang, Peng, et al. "Online tracking based on efficient transductive learning with sample matching costs." Neurocomputing 175 (2016): 166-176, hereinafter, “Zhang” as applied to claims 1, 8 and 15 above, and further in view of Hajizadeh, Siamak, Alfredo Núnez, and David MJ Tax. "Semi-supervised rail defect detection from imbalanced image data." IFAC-PapersOnLine 49.3 (2016): 78-83, hereinafter, “Hajizadeh”.

As per claim 5, Wu and Zhang disclose the method of claim 1, but do not explicitly disclose the following limitations as further recited; however, Hajizadeh discloses wherein the images of the first training set and the second training set are RGB images captured by a camera mounted to a vehicle (Hajizadeh, page 78, 1. Introduction, use semi-supervised learning for finding new candidate samples; Hajizadeh, page 80, 2. Problems and Methods, we use label propagation to identify only new positive data and we use them for re-training our classifier; Hajizadeh, pages 80-81, 3. Data and Features, Our data comes from a high frame rate camera that is mounted on a measurement vehicle. It consists of 37 high frame rate videos ... In total 21979 objects are labeled manually from 7 selected videos. The other 30 videos are used to extract 718520 unlabeled objects).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Wu and Zhang to include the acquisition of data via a camera mounted to a moving vehicle as taught by Hajizadeh in order to be able to provide sufficient training and testing data for evaluation of the model (Hajizadeh, pages 80-81, 3. Data and Features).

Regarding claim(s) 12 and 19: 
Reasoning corresponding to that given earlier (see the rejection of claim 5) applies, mutatis mutandis, to the subject matter of claims 12 and 19, which are therefore also rejected on the grounds given in the rejection of claim 5.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Sun L, Zhao C, Stolkin R. Weakly-supervised DCNN for RGB-D object recognition in real-world applications which lack large-scale annotated training data. arXiv preprint arXiv:1703.06370. 2017 Mar 19, discloses propagating labels to unlabeled data in order to train a detector / object recognition model.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY MANGIALASCHI whose telephone number is (571)270-5189. The examiner can normally be reached M-F, 9:30AM TO 6:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/TRACY MANGIALASCHI/Examiner, Art Unit 2668                            
/VU LE/Supervisory Patent Examiner, Art Unit 2668