DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1-20, as originally filed, are currently pending and have been considered below.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3, 10-12, 19, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201 (hereinafter “Zheng”).

As per claim 1, Zheng discloses an object association method, comprising: 
obtaining a first image and a second image (Zheng, page 184, 9.1 Introduction, different camera views at different locations); and 
determining an association relationship between a plurality of objects in the first image and a plurality of objects in the second image based on surrounding information of the plurality of objects in the first image and surrounding information of the plurality of objects in the second image (Zheng, page 185, 9.1 Introduction, utilising associated group of people as visual context to improve the matching of individuals across camera views), 
wherein surrounding information of one object is determined according to pixels within a set range around a bounding box of the object in an image where the object is located (Zheng, pages 193-194, 9.5.2 Re-identification with Group Context, we expand the rectangular ring structure surrounding each person. This makes the group context person specific ... the most inner rectangular region P1 is the bounding box of a person, for other outer rings, they are max {M − a1 − 0.5 · M1, a1 − 0.5 · M1}/(l − 1) and max {N −b1−0.5 · N1, b1−0.5 · N1}/(l −1) thick along the horizontal and vertical directions, where (a1, b1) is the centre of region P1, M and N are width and height of the group image, and M1 and N1 are width and height of P1 … combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed).
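The ring geometry quoted above from Zheng, Section 9.5.2, can be sketched as follows. This is a minimal illustration only: the function name `ring_thickness` is hypothetical, and reading `l` as the total number of rectangular regions (the bounding box P1 plus the outer rings) is an assumption, since the quoted passage does not define it.

```python
def ring_thickness(M, N, a1, b1, M1, N1, l):
    """Return (horizontal, vertical) thickness of each outer ring.

    M, N   : width and height of the group image
    a1, b1 : centre of the innermost region P1 (the person's bounding box)
    M1, N1 : width and height of P1
    l      : number of rectangular regions (assumed meaning; must be >= 2)
    """
    if l < 2:
        raise ValueError("at least two regions are needed to define outer rings")
    # Distance from the bounding-box edge to the farther image border,
    # divided evenly among the (l - 1) outer rings, per the quoted formula.
    horizontal = max(M - a1 - 0.5 * M1, a1 - 0.5 * M1) / (l - 1)
    vertical = max(N - b1 - 0.5 * N1, b1 - 0.5 * N1) / (l - 1)
    return horizontal, vertical


# Example: 100x80 group image, 20x40 bounding box centred at (40, 40), 3 regions.
print(ring_thickness(M=100, N=80, a1=40, b1=40, M1=20, N1=40, l=3))
```

Under these assumptions, the pixels covered by the outer rings are the pixels within a range around the bounding box that the rejection maps to the claimed surrounding information.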

As per claim 2, Zheng discloses the method according to claim 1, wherein determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on the surrounding information of the plurality of objects in the first image and the surrounding information of the plurality of objects in the second image comprises: 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on the surrounding information and appearance information of the plurality of objects in the first image, and the surrounding information and appearance information of the plurality of objects in the second image, wherein appearance information of one object is determined according to pixels within a bounding box of the object in an image where the object is located (Zheng, page 187, 9.3.1, we first assign a label to each pixel of a given group image I ... we extract SIFT features for each RGB channel at each pixel with a surrounding support region (12×12 in our experiment). We also obtain an average RGB colour vector of pixel over a support region (3×3) ... The SIFT vector and colour vector are then concatenated for each pixel for representation ... an appearance label image is built by assigning a visual word index to the corresponding SIFT+RGB feature at each pixel of the group image; Zheng, pages 193-194, 9.5.2 Re-identification with Group Context, we expand the rectangular ring structure surrounding each person. This makes the group context person specific ... the most inner rectangular region P1 is the bounding box of a person, for other outer rings, they are max {M − a1 − 0.5 · M1, a1 − 0.5 · M1}/(l − 1) and max {N −b1−0.5 · N1, b1−0.5 · N1}/(l −1) thick along the horizontal and vertical directions, where (a1, b1) is the centre of region P1, M and N are width and height of the group image, and M1 and N1 are width and height of P1 … combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed).

As per claim 3, Zheng discloses the method according to claim 2, wherein determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on the surrounding information and appearance information of the plurality of objects in the first image, and the surrounding information and appearance information of the plurality of objects in the second image comprises: 
determining a plurality of first feature distances based on the appearance information of the plurality of objects in the first image and the appearance information of the plurality of objects in the second image, wherein a first feature distance represents a degree of similarity between one object of the plurality of objects in the first image and one object of the plurality of objects in the second image (Zheng, page 192, 9.5.1 Re-identification by Ranking, Person re-identification can be casted as a ranking problem, by which the problem is further addressed either in terms of feature selection or matching distance metric learning. This approach aims to learn a set of most discriminant and robust features, based on which a weighted L1 norm distance is used to measure the similarity between a pair of person images); 
determining a plurality of second feature distances based on the surrounding information of the plurality of objects in the first image and the surrounding information of the plurality of objects in the second image, wherein a second feature distance represents a degree of similarity between surrounding information of one object of the plurality of objects in the first image and surrounding information of one object of the plurality of objects in the second image (Zheng, page 194, 9.5.2 Re-identification with Group Context, combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed [Equation 9.11; Equation 9.12]); 
for one object in the first image and one object in the second image, determining, according to a first feature distance and a second feature distance between the object in the first image and the object in the second image, a feature distance between the object in the first image and the object in the second image (Zheng, page 192, 9.5.1 Re-identification by Ranking, Person re-identification can be casted as a ranking problem, by which the problem is further addressed either in terms of feature selection or matching distance metric learning. This approach aims to learn a set of most discriminant and robust features, based on which a weighted L1 norm distance is used to measure the similarity between a pair of person images); and 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on a plurality of determined feature distances (Zheng, page 194, 9.5.2 Re-identification with Group Context, combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed [Equation 9.11; Equation 9.12]).

As per claim 10, Zheng discloses an object association apparatus, comprising: 
a processor; and a memory configured to store computer instructions executable by the processor (Zheng, page 184, Introduction, computer vision; Zheng, page 192, 9.5.1 Re-identification by Ranking, memory usage … memory cost; Zheng, page 193, 9.5.2 Re-identification with Group Context, computing the group descriptors), wherein the processor is configured to: 
obtain a first image and a second image (Zheng, page 184, 9.1 Introduction, different camera views at different locations); and 
determine an association relationship between a plurality of objects in the first image and a plurality of objects in the second image based on surrounding information of the plurality of objects in the first image and surrounding information of the plurality of objects in the second image (Zheng, page 185, 9.1 Introduction, utilising associated group of people as visual context to improve the matching of individuals across camera views), 
wherein surrounding information of one object is determined according to pixels within a set range around a bounding box of the object in an image where the object is located (Zheng, pages 193-194, 9.5.2 Re-identification with Group Context, we expand the rectangular ring structure surrounding each person. This makes the group context person specific ... the most inner rectangular region P1 is the bounding box of a person, for other outer rings, they are max {M − a1 − 0.5 · M1, a1 − 0.5 · M1}/(l − 1) and max {N −b1−0.5 · N1, b1−0.5 · N1}/(l −1) thick along the horizontal and vertical directions, where (a1, b1) is the centre of region P1, M and N are width and height of the group image, and M1 and N1 are width and height of P1 … combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed).

Regarding claims 11 and 12:
The reasoning given above in the rejection of claims 2 and 3 applies, mutatis mutandis, to the subject matter of claims 11 and 12, respectively. Claims 11 and 12 are therefore rejected on the grounds given in the rejection of claims 2 and 3, respectively.

As per claim 19, Zheng discloses an object association system, comprising: a first image acquisition device, configured to acquire one scene at a first view to obtain a first image (Zheng, page 184, 9.1 Introduction, different camera views at different locations); a second image acquisition device, configured to acquire the scene at a second view to obtain a second image, wherein the first view is different from the second view (Zheng, page 184, 9.1 Introduction, different camera views at different locations); and a processor, configured to perform the object association method according to claim 1 (Zheng, page 184, Introduction, computer vision; Zheng, page 192, 9.5.1 Re-identification by Ranking, memory usage … memory cost; Zheng, page 193, 9.5.2 Re-identification with Group Context, computing the group descriptors).

As per claim 20, Zheng discloses a non-transitory computer readable storage medium, having a computer program stored thereon, wherein the computer program, when being executed by a processor (Zheng, page 184, Introduction, computer vision; Zheng, page 192, 9.5.1 Re-identification by Ranking, memory usage … memory cost; Zheng, page 193, 9.5.2 Re-identification with Group Context, computing the group descriptors), enables the processor to implement operations of: 
obtaining a first image and a second image (Zheng, page 184, 9.1 Introduction, different camera views at different locations); and 
determining an association relationship between a plurality of objects in the first image and a plurality of objects in the second image based on surrounding information of the plurality of objects in the first image and surrounding information of the plurality of objects in the second image (Zheng, page 185, 9.1 Introduction, utilising associated group of people as visual context to improve the matching of individuals across camera views), 
wherein surrounding information of one object is determined according to pixels within a set range around a bounding box of the object in an image where the object is located (Zheng, pages 193-194, 9.5.2 Re-identification with Group Context, we expand the rectangular ring structure surrounding each person. This makes the group context person specific ... the most inner rectangular region P1 is the bounding box of a person, for other outer rings, they are max {M − a1 − 0.5 · M1, a1 − 0.5 · M1}/(l − 1) and max {N −b1−0.5 · N1, b1−0.5 · N1}/(l −1) thick along the horizontal and vertical directions, where (a1, b1) is the centre of region P1, M and N are width and height of the group image, and M1 and N1 are width and height of P1 … combine the distance metric dp of a pair of person descriptors and the distance metric dr of the corresponding group context descriptors computed from a probe and gallery image pair to be matched. More specifically, denote the person descriptors of person image I1p and I2p as P1 and P2 respectively and denote their corresponding group context descriptors as T1 and T2 respectively. Then the distance between two people is computed).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201 (hereinafter “Zheng”), as applied to claims 3 and 12 above, and further in view of Chen, Yiqiang, et al. "Person Re-identification using group context." International Conference on Advanced Concepts for Intelligent Vision Systems. Springer, Cham, 2018 (hereinafter “Chen”).

As per claim 4, Zheng discloses the method according to claim 3 but does not explicitly disclose the following further limitations. However, Chen discloses wherein determining, according to the first feature distance and the second feature distance between the object in the first image and the object in the second image, the feature distance between the object in the first image and the object in the second image comprises: 
performing weighted summation on the first feature distance and the second feature distance between the object in the first image and the object in the second image to obtain the feature distance between the object in the first image and the object in the second image, wherein in condition that the degree of similarity between the object in the first image and the object in the second image is higher, a weight coefficient of the second feature distance between the object in the first image and the object in the second image is larger during weighted summation (Chen, page 396, Figure 2, group context distance and single-person distance are computed and summed to obtain the final distance; Chen, pages 395-396, 3.1 Group Association, W and b are weights and bias of the last fully-connected layer and P(yj = 1|x) is the predicted probability that the input x corresponds to identity j … The distance between two images is measured with the cosine distance between the feature vectors; Chen, page 397, 3.2 Group Assisted Person Re-identification, input data is composed of group images with annotated individual identities and corresponding bounding boxes ... a query person image P is obtained from the raw group image by using the given annotated bounding box. Second, its group context image G is obtained from the raw group image ... extract the feature embeddings for respectively the group context input image and the person image ... For a given query and candidate image, the cosine distance is used to separately compute a group context distance Dgr between the two group images and a person distance Did between the two single-person images. The final distance measure is simply the sum [Equation 4] This equation can also be formulated as a weighted sum).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng to include the distance calculations as taught by Chen in order to enhance the re-identification accuracy (Chen, page 393, Introduction).
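The distance combination cited from Chen (a person distance Did and a group context distance Dgr, each a cosine distance, summed or weighted-summed) can be sketched as follows. This is an illustrative sketch only: Chen's Equation 4 is not reproduced in this Office action, so the function names and the fixed weights `w_person` and `w_group` are assumptions, and the sketch does not model the claimed dependence of the weight on the degree of similarity.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def combined_distance(person_q, person_c, group_q, group_c,
                      w_person=1.0, w_group=1.0):
    """Weighted sum of person distance and group context distance.

    With both weights equal to 1 this reduces to a plain sum of the two
    distances; the weight parameters themselves are hypothetical.
    """
    d_id = cosine_distance(person_q, person_c)   # single-person distance
    d_gr = cosine_distance(group_q, group_c)     # group context distance
    return w_person * d_id + w_group * d_gr


# Query vs. candidate: orthogonal person features, identical group features.
print(combined_distance([1, 0], [0, 1], [1, 1], [1, 1]))
```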

Regarding claim 13:
The reasoning given above in the rejection of claim 4 applies, mutatis mutandis, to the subject matter of claim 13. Claim 13 is therefore rejected on the grounds given in the rejection of claim 4.

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201 (hereinafter “Zheng”), as applied to claims 3 and 12 above, and further in view of Xu, Yuanlu, et al. "Cross-view people tracking by scene-centered spatio-temporal parsing." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017 (hereinafter “Xu”).

As per claim 5, Zheng discloses the method according to claim 3 but does not explicitly disclose the following further limitations. However, Xu discloses wherein the method further comprises: 
determining a plurality of geometric distances between the plurality of objects in the first image and the plurality of objects in the second image (Xu, page 3, Semantic Attributes, Besides the identity label ʅ(·), a tracklet Ƭi is enriched with four kinds of attributes: [Equation 5] where f(Ƭi) denotes the appearance attribute, h(Ƭi) denotes the geometry attribute … We also define the geometry attribute h(Ƭi) as the 2D object bounding boxes and projected footprints on the 3D ground plane ... Given the camera calibration, the foot point of each 2D bounding box is calculated and projected back onto the 3D ground; Xu, page 4, Bayesian Formulation, Given two tracklets, we consider both traditional visual relations (i.e., appearance and geometry) and leveraged semantic attribute relations (i.e., motion and pose/action) ... Appearance similarity. This constraint assumes that the same person should share similar appearance across time and cameras ... Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames); and 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on the plurality of determined feature distances comprises: 
for one object in the first image and one object in the second image, determining, according to a feature distance and a geometric distance between the object in the first image and the object in the second image, a distance between the object in the first image and one object in the second image (Xu, page 4, Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames); and 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image according to a plurality of distances between the plurality of objects in the first image and the plurality of objects in the second image (Xu, page 4, Bayesian Formulation, Given two tracklets, we consider both traditional visual relations (i.e., appearance and geometry) and leveraged semantic attribute relations (i.e., motion and pose/action) ... Appearance similarity. This constraint assumes that the same person should share similar appearance across time and cameras ... Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng to include the geometric distances as taught by Xu in order to impose proximity consistency constraints between images, thereby improving re-identification between images (Xu, page 2, Introduction).
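The geometric distance D(·, ·) cited from Xu (the averaged Euclidean distance between ground-plane foot points over all overlapped frames) can be sketched as follows. This is a hedged illustration: Xu's Equation 10 is not reproduced in this Office action, so only D is shown. Taking the foot point as the bottom-centre of the 2D bounding box, representing a tracklet as a frame-indexed dictionary, and all function names are assumptions.

```python
import math


def foot_point(box):
    """Bottom-centre of a box (x_min, y_min, x_max, y_max); an assumed convention."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)


def project_to_ground(H, point):
    """Apply a 3x3 image-to-ground homography H (nested lists) to a 2D point."""
    x, y = point
    X = H[0][0] * x + H[0][1] * y + H[0][2]
    Y = H[1][0] * x + H[1][1] * y + H[1][2]
    W = H[2][0] * x + H[2][1] * y + H[2][2]
    if W == 0:
        raise ValueError("point maps to infinity under this homography")
    return (X / W, Y / W)


def mean_ground_distance(track_a, track_b, H_a, H_b):
    """Average Euclidean ground-plane distance over frames present in both tracklets.

    track_a, track_b : dict mapping frame index -> bounding box
    H_a, H_b         : homographies of the cameras that observed each tracklet
    """
    common = sorted(set(track_a) & set(track_b))
    if not common:
        raise ValueError("tracklets share no frames")
    total = 0.0
    for frame in common:
        ax, ay = project_to_ground(H_a, foot_point(track_a[frame]))
        bx, by = project_to_ground(H_b, foot_point(track_b[frame]))
        total += math.hypot(ax - bx, ay - by)
    return total / len(common)
```

A smaller value indicates that the two tracklets lie closer together on the common ground plane.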

Regarding claim 14:
The reasoning given above in the rejection of claim 5 applies, mutatis mutandis, to the subject matter of claim 14. Claim 14 is therefore rejected on the grounds given in the rejection of claim 5.

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201 (hereinafter “Zheng”), in view of Chen, Yiqiang, et al. "Person Re-identification using group context." International Conference on Advanced Concepts for Intelligent Vision Systems. Springer, Cham, 2018 (hereinafter “Chen”), as applied to claims 4 and 13 above, and further in view of Xu, Yuanlu, et al. "Cross-view people tracking by scene-centered spatio-temporal parsing." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017 (hereinafter “Xu”).

As per claim 6, Zheng and Chen disclose the method according to claim 4 but do not explicitly disclose the following further limitations. However, Xu discloses wherein the method further comprises: 
determining a plurality of geometric distances between the plurality of objects in the first image and the plurality of objects in the second image (Xu, page 3, Semantic Attributes, Besides the identity label ʅ(·), a tracklet Ƭi is enriched with four kinds of attributes: [Equation 5] where f(Ƭi) denotes the appearance attribute, h(Ƭi) denotes the geometry attribute … We also define the geometry attribute h(Ƭi) as the 2D object bounding boxes and projected footprints on the 3D ground plane ... Given the camera calibration, the foot point of each 2D bounding box is calculated and projected back onto the 3D ground; Xu, page 4, Bayesian Formulation, Given two tracklets, we consider both traditional visual relations (i.e., appearance and geometry) and leveraged semantic attribute relations (i.e., motion and pose/action) ... Appearance similarity. This constraint assumes that the same person should share similar appearance across time and cameras ... Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames); and 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image based on the plurality of determined feature distances comprises: 
for one object in the first image and one object in the second image, determining, according to a feature distance and a geometric distance between the object in the first image and the object in the second image, a distance between the object in the first image and one object in the second image (Xu, page 4, Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames); and 
determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image according to a plurality of distances between the plurality of objects in the first image and the plurality of objects in the second image (Xu, page 4, Bayesian Formulation, Given two tracklets, we consider both traditional visual relations (i.e., appearance and geometry) and leveraged semantic attribute relations (i.e., motion and pose/action) ... Appearance similarity. This constraint assumes that the same person should share similar appearance across time and cameras ... Geometric proximity measures how far two tracklets are located. We project the foot points of two tracklets onto the scene 3D ground plane using the given 2D to 3D homograph, and then compute the proximity of two tracklets [Equation 10] D(·, ·) denotes the averaged Euclidean distance between foot points of Ƭi and Ƭj over all overlapped frames).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng and Chen to include the geometric distances as taught by Xu in order to impose proximity consistency constraints between images, thereby improving re-identification between images (Xu, page 2, Introduction).

Regarding claim 15:
The reasoning given above in the rejection of claim 6 applies, mutatis mutandis, to the subject matter of claim 15. Claim 15 is therefore rejected on the grounds given in the rejection of claim 6.

Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201 (hereinafter “Zheng”), in view of Xu, Yuanlu, et al. "Cross-view people tracking by scene-centered spatio-temporal parsing." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017 (hereinafter “Xu”), as applied to claims 5 and 14 above, and further in view of Phillips S, Daniilidis K. All graphs lead to rome: Learning geometric and cycle-consistent representations with graph convolutional networks. arXiv preprint arXiv:1901.02078. 2019 Jan 7 (hereinafter “Phillips”).

As per claim 7, Zheng and Xu disclose the method according to claim 5 but do not explicitly disclose the following further limitations. However, Phillips discloses wherein determining the plurality of geometric distances between the plurality of objects in the first image and the plurality of objects in the second image comprises: 
obtaining a first position of a first image acquisition device which acquires the first image, and a second position of a second image acquisition device which acquires the second image, and obtaining a first intrinsic parameter of the first image acquisition device and a second intrinsic parameter of the second image acquisition device (Phillips, page 5, 3.4. Geometric Consistency Loss, Given a relative pose (Rij, Tij) between two cameras i and j (transforms j to i) the epipolar on corresponding feature locations Xi and Xj: [Equation 9] ... we use the two pose epipolar constraint [Equation 10] ... The constraint assumes that the Xk are calibrated i.e. the camera intrinsics are known); 
determining a third position of a center point of one object in the first image (Phillips, page 4, Figure 4); 
determining a polar line in the second image based on the first position, the second position, the third position, the first intrinsic parameter, and the second intrinsic parameter, wherein the polar line represents a straight line formed by projecting a connection line between a center point of one object in the first image and an image point of the object in an imaging plane of the first image acquisition device to the second image (Phillips, page 4, Figure 4, Errors are computed via absolute distance from the epipolar line, as expressed by (10) via the epipolar constraint. The epipolar line is the line of projection of the feature on the first image, projected onto to the second. The distance to this line on the second image indicates how likely that point is to correspond geometrically to the original feature; Phillips, page 5, 3.4. Geometric Consistency Loss, Given a relative pose (Rij, Tij) between two cameras i and j (transforms j to i) the epipolar on corresponding feature locations Xi and Xj: [Equation 9] ... we use the two pose epipolar constraint [Equation 10] ... The constraint assumes that the Xk are calibrated i.e. the camera intrinsics are known); 
determining a vertical pixel distance between one object in the second image and the polar line (Phillips, page 4, Figure 4, Errors are computed via absolute distance from the epipolar line, as expressed by (10) via the epipolar constraint. The epipolar line is the line of projection of the feature on the first image, projected onto to the second. The distance to this line on the second image indicates how likely that point is to correspond geometrically to the original feature); and 
determining the plurality of geometric distances between the plurality of objects in the first image and the plurality of objects in the second image according to a plurality of determined vertical pixel distances (Phillips, pages 3-4, 3.1. Correspondence Graph, we do pairwise feature matching between the images, creating putative correspondences for each of the features; Phillips, page 4, 3.3. Cycle Consistency, Let M be the noiseless set of matches between our features, with Mij being the matches between image i and image j; Phillips, page 5, 3.4. Geometric Consistency Loss, to add geometric consistency losses ... is to use the epipolar constraint. The epipolar constraint describes how the positions of features in different images corresponding to the same point should be related. Given a relative pose (Rij, Tij) between two cameras i and j (transforms j to i) the epipolar on corresponding feature locations Xi and Xj: [Equations 9 and 10]).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng and Xu to include the epipolar geometric calculations as taught by Phillips in order to impose proximity consistency constraints between images, as this distance calculation indicates the likelihood that a point corresponds geometrically to the original feature, thereby improving re-identification between images (Phillips, page 4, Section 3.2).
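For illustration only, the epipolar line and point-to-line distance discussed above can be sketched as follows. This is a minimal sketch and is not taken from Phillips or from the application. It assumes zero-skew pinhole intrinsics given as (fx, fy, cx, cy), a relative pose (R, t) that maps camera-1 coordinates to camera-2 coordinates, and that the "vertical pixel distance" is the perpendicular distance to the line. All function and parameter names are hypothetical.

```python
import math

def epipolar_line(center_px, intr1, intr2, R, t):
    """Return (a, b, c) such that a*u + b*v + c = 0 is the epipolar line in image 2.

    center_px: (u, v) center point of an object in image 1.
    intr1, intr2: assumed zero-skew intrinsics (fx, fy, cx, cy) of each camera.
    R, t: assumed relative pose with X2 = R * X1 + t.
    """
    fx1, fy1, cx1, cy1 = intr1
    fx2, fy2, cx2, cy2 = intr2
    # Back-project the pixel to a normalized ray in camera 1.
    x = ((center_px[0] - cx1) / fx1, (center_px[1] - cy1) / fy1, 1.0)
    # Rotate the ray into the camera-2 frame.
    rx = [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]
    # Essential-matrix product E x = t x (R x), written as a cross product.
    a = t[1] * rx[2] - t[2] * rx[1]
    b = t[2] * rx[0] - t[0] * rx[2]
    c = t[0] * rx[1] - t[1] * rx[0]
    # Map the line from normalized to pixel coordinates: l = K2^(-T) * l_n.
    return (a / fx2, b / fy2, c - a * cx2 / fx2 - b * cy2 / fy2)

def point_line_distance(point_px, line):
    """Perpendicular pixel distance from a point in image 2 to the line."""
    a, b, c = line
    return abs(a * point_px[0] + b * point_px[1] + c) / math.hypot(a, b)

def geometric_distances(centers1, centers2, intr1, intr2, R, t):
    """Matrix whose entry [i][j] is the distance of object j in image 2
    from the epipolar line of object i in image 1."""
    rows = []
    for c1 in centers1:
        line = epipolar_line(c1, intr1, intr2, R, t)
        rows.append([point_line_distance(c2, line) for c2 in centers2])
    return rows
```

In a purely horizontal two-camera arrangement (identity rotation, translation along the x axis), the line returned for an object centered at row 60 is the horizontal line v = 60, so an object at row 65 in the second image lies 5 pixels from it. The sketch does not handle the degenerate case in which the line coefficients a and b are both zero.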

Regarding claim 16: 
The reasoning given above in the rejection of claim 7 applies, mutatis mutandis, to the subject matter of claim 16, which is therefore rejected on the same grounds.

Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201, hereinafter “Zheng”, in view of Xu, Yuanlu, et al. "Cross-view people tracking by scene-centered spatio-temporal parsing." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017, hereinafter “Xu”, as applied to claims 5 and 14 above, and further in view of Nithin, Kanishka, and François Bremond. "Multi-camera tracklet association and fusion using ensemble of visual and geometric cues." IEEE Transactions on Circuits and Systems for Video Technology 27.3 (2017): 431-440, hereinafter “Nithin”.

As per claim 8, Zheng and Xu disclose the method according to claim 5 but do not explicitly disclose the following limitations. However, Nithin discloses wherein determining, according to the feature distance and the geometric distance between the object in the first image and the object in the second image, the distance between the object in the first image and the object in the second image comprises: 
performing weighted summation on the feature distance and the geometric distance between the object in the first image and the object in the second image to obtain the distance between the object in the first image and the object in the second image (Nithin, page 1, Introduction, real time multi-camera data association, fusion based on geometry and visual cues ... Association and fusion is performed based on weighted combination of local and global features such as geometry, appearance and motion; Nithin, page 5, 4.1. Ensemble feature combination using online learnt discriminative weights, Local tracklet similarity. At local stage, importance is given to local frame to frame geometric information ... we derive Euclidean Distance metric for each set of trajectories ... Global tracklet similarity. At global stage, information pertaining to overall appearance of the object is taken into account for determining the similarity between tracklets ... A global matching score quantified from such features represent global tracklet similarity. Each element of trajectory association cost matrix represents weighted sum of Euclidean distance and Global Matching Score(GMS) between two trajectories ... Where ωm is appearance descriptor weight learned to specify if more importance should be given to appearance cues or geometric cue).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng and Xu to include the geometric and appearance distances as taught by Nithin in order to improve similarity determinations between images obtained at different times from different viewpoints (Nithin, Introduction).
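For illustration only, a weighted summation of a feature distance and a geometric distance can be sketched as follows. This is a minimal sketch and is not Nithin's formulation: the single weight in [0, 1] applied as a convex combination is an assumption, and the function and parameter names are hypothetical.

```python
def combined_distance(feature_distance, geometric_distance, feature_weight=0.5):
    """Weighted sum of the two distances for one pair of objects.

    feature_weight is an assumed parameter in [0, 1]; the geometric
    distance receives the remaining weight (1 - feature_weight).
    """
    if not 0.0 <= feature_weight <= 1.0:
        raise ValueError("feature_weight must lie in [0, 1]")
    return (feature_weight * feature_distance
            + (1.0 - feature_weight) * geometric_distance)

def combine_matrices(feature_matrix, geometric_matrix, feature_weight=0.5):
    """Apply the weighted sum element-wise to two equally sized distance matrices."""
    return [
        [combined_distance(f, g, feature_weight) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(feature_matrix, geometric_matrix)
    ]
```

A larger feature_weight gives more importance to appearance cues and a smaller one gives more importance to geometric cues. In practice the two distances would likely need to be scaled to comparable ranges before being summed.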

Regarding claim 17: 
The reasoning given above in the rejection of claim 8 applies, mutatis mutandis, to the subject matter of claim 17, which is therefore rejected on the same grounds.

Claims 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Wei-Shi, Shaogang Gong, and Tao Xiang. "Group association: Assisting re-identification by visual context." Person Re-Identification. Springer, London, 2014. 183-201, hereinafter “Zheng”, in view of Xu, Yuanlu, et al. "Cross-view people tracking by scene-centered spatio-temporal parsing." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. No. 1. 2017, hereinafter “Xu”, as applied to claims 5 and 14 above, and further in view of Cai Z, Zhang J, Ren D, Yu C, Zhao H, Yi S, Yeo CK, Loy CC. MessyTable: Instance Association in Multiple Camera Views. arXiv preprint arXiv:2007.14878. 2020 Jul 29, hereinafter “Cai”.

As per claim 9, Zheng and Xu disclose the method according to claim 5 but do not explicitly disclose the following limitations. However, Cai discloses wherein determining the association relationship between the plurality of objects in the first image and the plurality of objects in the second image according to the plurality of distances between the plurality of objects in the first image and the plurality of objects in the second image comprises: 
forming a distance matrix based on the plurality of distances between the plurality of objects in the first image and the plurality of objects in the second image, wherein a value of one element in the distance matrix represents a distance between one object in the first image and one object in the second image (Cai, page 20, G Additional Details on the Framework, Figure 13, compute pair-wise distances between instances. KM stands for Kuhn-Munkres algorithm, which globally optimizes the matches such that the total loss (the sum of distances of matched pairs) is the minimum. An additional thresholding step further rejects matches with large distances); and 
determining an adjacency matrix between the first image and the second image according to the distance matrix, wherein a value of an element in the adjacency matrix represents that one object in the first image is associated with or unassociated with one object in the second image (Cai, page 20, G Additional Details on the Framework, Figure 13, compute pair-wise distances between instances. KM stands for Kuhn-Munkres algorithm, which globally optimizes the matches such that the total loss (the sum of distances of matched pairs) is the minimum. An additional thresholding step further rejects matches with large distances, View #1, View #2, Distance Matrix, KM and Thresholding, Adjacency Matrix).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the teachings of Zheng and Xu to include the distance and adjacency matrices as taught by Cai in order to optimize correspondence between objects in images so that the sum of distances of matched pairs is minimized (Cai, page 20, G Additional Details on the Framework).
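For illustration only, the path from a distance matrix to an adjacency matrix described above (minimum-total-distance matching followed by thresholding) can be sketched as follows. This is a minimal sketch and is not Cai's implementation: a brute-force search over assignments stands in for the Kuhn-Munkres algorithm and is practical only for small matrices, and the function name and the max_distance threshold are hypothetical.

```python
from itertools import permutations

def associate(distance_matrix, max_distance):
    """Build a 0/1 adjacency matrix from a distance matrix.

    distance_matrix[i][j] is the distance between object i in the first
    image and object j in the second image. The sketch assumes there are
    no more rows than columns.
    """
    n_rows = len(distance_matrix)
    n_cols = len(distance_matrix[0]) if n_rows else 0
    if n_rows > n_cols:
        raise ValueError("sketch assumes rows <= columns; transpose the input")

    # Brute-force stand-in for Kuhn-Munkres: try every one-to-one
    # assignment of rows to distinct columns and keep the cheapest.
    best_cols, best_cost = (), float("inf")
    for cols in permutations(range(n_cols), n_rows):
        cost = sum(distance_matrix[r][c] for r, c in enumerate(cols))
        if cost < best_cost:
            best_cols, best_cost = cols, cost

    # Thresholding step: reject matched pairs whose distance is too large.
    adjacency = [[0] * n_cols for _ in range(n_rows)]
    for r, c in enumerate(best_cols):
        if distance_matrix[r][c] <= max_distance:
            adjacency[r][c] = 1
    return adjacency
```

An element equal to 1 marks a pair of objects as associated and an element equal to 0 marks the pair as unassociated. For larger inputs, a polynomial-time solver such as SciPy's scipy.optimize.linear_sum_assignment would be the usual replacement for the brute-force loop.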

Regarding claim 18: 
The reasoning given above in the rejection of claim 9 applies, mutatis mutandis, to the subject matter of claim 18, which is therefore rejected on the same grounds.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY MANGIALASCHI whose telephone number is (571)270-5189. The examiner can normally be reached M-F, 9:30AM TO 6:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/TRACY MANGIALASCHI/Examiner, Art Unit 2668                                 
/VU LE/Supervisory Patent Examiner, Art Unit 2668