DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 27 January 2021 has been entered.

Response to Amendment
Applicant’s response, filed 18 December 2020, to the last Office action has been entered and made of record.
The cancellation of claim 14 is acknowledged and made of record.
The amendments to the claims are acknowledged; they are supported by the original disclosure, and no new matter is added.
The addition of new claim 24 is acknowledged and made of record.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in an interview with Johnny Lam (Reg. No. 66,279) on 12 February 2021.
The application has been amended as follows: 
Claim 1. (Currently amended) 	A computer-implemented method of object identification, the computer-implemented method comprising:
capturing, with a first camera at a first point in time, a first image depicting a first scene;
capturing, with a second camera at a second point in time different from the first point in time, a second image depicting a second scene that does not overlap with the first scene;
extracting a first patch from the first image;
extracting a second patch from the second image;
extracting a first patch descriptor from the first patch;
extracting a second patch descriptor from the second patch;
mapping the first and second patch descriptors to a concatenated codeword in a clustered codebook learned via coupled clustering of a set of concatenated features, wherein at least one concatenated feature of the set of concatenated features is generated by concatenating corresponding patch descriptors from the first and second cameras, the concatenated codeword comprising dimensions of cluster centers of coupled clusters to which the first and second patch descriptors map;
	computing an appearance cost comprising a measure of visual dissimilarity between the first and second images, based on the concatenated codeword and by operation of one or more computer processors;

	combining the appearance cost and the temporal cost into a single cost function; and 
	determining whether the first and second images depict a common object, using the single cost function, after which an indication of whether the first and second images depict a common object is output.

Claim 20. (Currently Amended)	A non-transitory computer-readable medium containing computer program code executable to perform an operation for object identification, the operation comprising:
capturing, with a first camera at a first point in time, a first image depicting a first scene;
capturing, with a second camera at a second point in time different from the first point in time, a second image depicting a second scene that does not overlap with the first scene;
extracting a first patch from the first image;
extracting a second patch from the second image;
extracting a first patch descriptor from the first patch;
extracting a second patch descriptor from the second patch;
mapping the first and second patch descriptors to a concatenated codeword in a clustered codebook learned via coupled clustering of a set of concatenated features, wherein at least one concatenated feature of the set of concatenated features is generated by concatenating corresponding patch descriptors from the first and second cameras, the concatenated codeword comprising dimensions of cluster centers of coupled clusters to which the first and second patch descriptors map;

	computing a temporal cost between the first and second images, using a temporal context model and based on the first and second points in time;
	combining the appearance cost and the temporal cost into a single cost function; and
	determining whether the first and second images depict a common object, using the single cost function, after which an indication of whether the first and second images depict a common object is output.
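For illustration only, the cost-combination steps recited above can be sketched as follows. This is a minimal sketch under stated assumptions and is not part of the claim language: the L1 histogram distance, the Gaussian transit-time model, the weight, the threshold, and all function and parameter names are hypothetical choices made for the example, since the claim specifies neither the dissimilarity measure nor the form of the temporal context model.

```python
def appearance_cost(hist_a, hist_b):
    """Visual dissimilarity between two images.

    Assumption: each image is summarized as a normalized histogram over
    codewords, and dissimilarity is the L1 distance between histograms.
    """
    return sum(abs(x - y) for x, y in zip(hist_a, hist_b))


def temporal_cost(t_a, t_b, expected_transit=30.0, spread=10.0):
    """Cost of the time gap between the two captures.

    Assumption: the temporal context model is a Gaussian over the transit
    time between cameras; the cost is its negative log-likelihood up to an
    additive constant. Both parameters are hypothetical.
    """
    gap = abs(t_b - t_a)
    return (gap - expected_transit) ** 2 / (2.0 * spread ** 2)


def combined_cost(hist_a, hist_b, t_a, t_b, weight=0.5):
    """Single cost function: a weighted sum of the two costs (assumed form)."""
    return (weight * appearance_cost(hist_a, hist_b)
            + (1.0 - weight) * temporal_cost(t_a, t_b))


def depicts_common_object(hist_a, hist_b, t_a, t_b, threshold=0.5):
    """Decide by thresholding the single cost (hypothetical threshold)."""
    return combined_cost(hist_a, hist_b, t_a, t_b) <= threshold


# Similar appearance, plausible transit time: low cost, match indicated.
same = depicts_common_object([0.5, 0.5, 0.0], [0.4, 0.6, 0.0], 0.0, 32.0)
# Dissimilar appearance, implausible transit time: high cost, no match.
different = depicts_common_object([0.5, 0.5, 0.0], [0.0, 0.1, 0.9], 0.0, 300.0)
print(same, different)
```

A weighted sum is only one way to merge the two terms; any monotone combination that lets either a large appearance cost or a large temporal cost veto a match would illustrate the same step.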

Claim 21. (Currently Amended)	A computer-implemented method to facilitate object identification, the computer-implemented method comprising:
extracting a first patch descriptor from a first image depicting a first scene captured by a first camera device;
extracting a second patch descriptor from a second image depicting a second scene captured by a second camera device, wherein the first and second scenes are non-overlapping, wherein each of the first and second images has a respective timestamp;
concatenating corresponding patch descriptors from the first and second camera devices into at least a concatenated patch descriptor of a set of concatenated patch descriptors;
generating, by operation of one or more computer processors, a clustered codebook based on coupled clustering of the set of concatenated patch descriptors, the clustered codebook including a concatenated codeword, the concatenated codeword comprising dimensions of cluster centers of coupled clusters that the concatenated patch descriptor maps to; and


Allowable Subject Matter
Claims 1-13, 15-16, and 20-24 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Regarding the subject matter of independent claim 8, a previous examiner’s statement of reasons for indicating allowable subject matter of claim 8 was given in the previous Office action, dated 27 October 2020. 
Regarding the subject matter of the amended independent claims 1, 20, and 21, the prior art of record, alone or in combination, fails to fairly teach or suggest the limitations:
“mapping the first and second patch descriptors to a concatenated codeword in a clustered codebook learned via coupled clustering of a set of concatenated features, wherein at least one concatenated feature of the set of concatenated features is generated by concatenating corresponding patch descriptors from the first and second cameras, the concatenated codeword comprising dimensions of cluster centers of coupled clusters to which the first and second patch descriptors map” (claims 1 and 20); and
“concatenating corresponding patch descriptors from the first and second camera devices into at least a concatenated patch descriptor of a set of concatenated patch descriptors;
generating, by operation of one or more computer processors, a clustered codebook based on coupled clustering of the set of concatenated patch descriptors, the clustered codebook including a concatenated codeword, the concatenated codeword comprising dimensions of cluster centers of coupled clusters that the concatenated patch descriptor maps to” (claim 21).
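
For illustration only, the quoted limitations can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: plain k-means over the concatenated vectors stands in for "coupled clustering", the farthest-point initialization and the toy two-dimensional descriptors are hypothetical, and all function names are invented for the example.

```python
def concat(desc_a, desc_b):
    """Concatenate corresponding patch descriptors from the two cameras."""
    return list(desc_a) + list(desc_b)


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def init_centers(points, k):
    """Deterministic farthest-point initialization (an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    return centers


def learn_codebook(points, k, iters=20):
    """Plain k-means over concatenated features; each center is a codeword."""
    centers = init_centers(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the previous center if a cluster is empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def map_to_codeword(desc_a, desc_b, codebook):
    """Index of the concatenated codeword nearest to a descriptor pair."""
    joint = concat(desc_a, desc_b)
    return min(range(len(codebook)), key=lambda j: sq_dist(joint, codebook[j]))


# Toy corresponding pairs: (camera-1 descriptor, camera-2 descriptor).
pairs = [([0.0, 0.1], [1.0, 1.1]), ([0.1, 0.0], [1.1, 1.0]),
         ([5.0, 5.1], [9.0, 9.1]), ([5.1, 5.0], [9.1, 9.0])]
features = [concat(a, b) for a, b in pairs]
codebook = learn_codebook(features, k=2)

# Each concatenated codeword carries the dimensions of both coupled
# cluster centers: the first half for camera 1, the second for camera 2.
dim_a = 2
split_codebook = [(c[:dim_a], c[dim_a:]) for c in codebook]
print(split_codebook)
```

Because clustering runs on the concatenated vectors, every codeword holds a camera-1 center and a camera-2 center that were learned together, which is the coupling the quoted limitations describe between the two views.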

The previously applied combination of the Tariq, Perronnin, and Zhao references suggested to one of ordinary skill in the art associating and combining features corresponding to the same object from multiple cameras and viewpoints as a merged cluster in a multi-view gallery, and learning a visual word vocabulary, represented by a Gaussian mixture model, from the associated and merged clusters (see Tariq [0036] and [0086]; see Perronnin sect. 3. Fisher Kernels on Visual Vocabularies; and see Zhao sect. 3.2. Multi-view gallery learning and Fig. 4).
The merged clusters of Zhao’s multi-view gallery may read upon a broadest reasonable interpretation of the claimed feature of concatenating corresponding patch descriptors from first and second cameras. The combined teachings of Tariq, Perronnin, and Zhao may also suggest creating a Gaussian mixture model (GMM) representing a visual word vocabulary, where the visual words represent the merged clusters of the associated features of the multi-view gallery, which reads upon a broadest reasonable interpretation of generating a clustered codebook based on coupled clustering using a concatenated patch descriptor.  However, the combined teachings of Tariq, Perronnin, and Zhao fail to fairly teach or suggest to one of ordinary skill in the art additionally performing clustering upon a set of merged clusters to generate a codebook of codewords. Thus, the combined teachings of Tariq, Perronnin, and Zhao fail to provide a fair teaching of the combined features in which the clustered codebook is learned via coupled clustering of a set of concatenated patch descriptors, wherein a concatenated feature / patch descriptor is generated by concatenating corresponding patch descriptors from the first and second cameras.
Further search and consideration of the prior art yielded the following pertinent prior art references:
(see Black [0149]), and that silhouette representation may be reduced and a codebook is learned by performing k-means clustering upon a combined set of shape context vectors from the training silhouettes (see Black [0151]). However, Black fails to provide a fair teaching that the concatenated features associated with each view are corresponding pairs of features from a first and second camera. 
The combination of Zhao and Black may suggest to one of ordinary skill in the art concatenating corresponding features from across multiple views into a merged cluster represented by one feature vector. However, a codebook learned by clustering the set of feature vectors, each representing a merged cluster corresponding to a regular person, would comprise codewords representing cluster centers to which features from multiple different persons would map, and would therefore defeat Zhao’s intended purpose of re-identifying specific regular persons (see Zhao sect. 1. Introduction). As the proposed modification would render Zhao’s teachings unsatisfactory for their intended purpose, there is no fair suggestion or motivation to make the proposed modification. See MPEP 2143.01 V. As a fair suggestion or motivation to combine the teachings of Zhao and Black is lacking, the combination of Zhao and Black fails to provide a fair suggestion to one of ordinary skill in the art that the clustered codebook is learned via coupled clustering of a set of concatenated patch descriptors, wherein a concatenated feature / patch descriptor is generated by concatenating corresponding patch descriptors from the first and second cameras.
Pedagadi et al. (“Local Fisher Discriminant Analysis for Pedestrian Re-identification”), cited in the previous Advisory action, is pertinent in teaching a metric learning method for person re-identification where the output from a first stage is the concatenation of two sets of principal component descriptors (see Pedagadi sect. 2. Proposed Method). However, the feature vectors concatenated together are of different color spaces of the same observation (see Pedagadi sect. 2.2 Unsupervised Dimensionality Reduction). Thus, Pedagadi fails to provide a fair teaching, alone or in combination, for performing clustering upon a set of concatenated features or patch descriptors, where a concatenated feature / patch descriptor is generated by concatenating corresponding patch descriptors from the first and second camera devices.
Ma et al. (“BiCov: a novel image representation for person re-identification and face verification”), cited in the previous Advisory action, is pertinent in teaching a method for person/face re-identification which proposes an image representation based on the concatenation of image descriptors into a single signature (see Ma sect. 3.2. BiCov Descriptor), which is applied to a data set of images comprising two non-overlapping views per pedestrian (see Ma sect. 4.1. Pedestrian re-identification on VIPeR Dataset). However, the BiCov descriptor computes the similarity of covariance descriptors within the same image between two consecutive scales (see Ma sect. 3.3. Analysis of BiCov). Thus, Ma fails to provide a fair teaching, alone or in combination, for performing clustering upon a set of concatenated features or patch descriptors, where a concatenated feature / patch descriptor is generated by concatenating corresponding patch descriptors from the first and second camera devices.
Zheng et al. (“Person Re-identification by Probabilistic Relative Distance Comparison”) is pertinent in teaching person re-identification matching across non-overlapping camera views based on a probabilistic relative distance comparison model (see Zheng Abstract), where a pairwise set is defined based on a difference vector computed between a pair of relevant samples and a difference vector from a pair of related irrelevant samples (see Zheng sect. 2. Probabilistic relative distance comparison for person re-identification). However, Zheng fails to provide a fair teaching, alone or in combination, for performing clustering upon a set of concatenated features or patch descriptors, where a concatenated feature / patch descriptor is generated by concatenating corresponding patch descriptors from the first and second camera devices.

Claims 2-13, 15-16, and 22-24 are dependent claims of independent claims 1 and 21, incorporate the allowable subject matter of the respective independent claims from which they depend, and are therefore allowed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIMOTHY WING HO CHOI whose telephone number is (571)270-3814.  The examiner can normally be reached on 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT RUDOLPH, can be reached on (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.




/TIMOTHY CHOI/Examiner, Art Unit 2661

/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2661