DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Response to Amendments
The amendments filed 04/06/2021 have been entered. Claims 1-5 and 8-21 remain pending in the application. 
Applicant’s arguments, with respect to the rejection(s) of claim(s) 21 under 35 U.S.C. 112(a) have been fully considered and are persuasive. Therefore, the previous rejection set forth in the previous office action mailed 01/06/2021 has been withdrawn. 
Applicant’s arguments, with respect to the rejection(s) of claim(s) 1-10, 2, 7, 12, 16, 19, and 21 under 35 U.S.C. 112(b) have been fully considered and are persuasive. Therefore, the previous rejection set forth in the previous office action mailed 01/06/2021 has been withdrawn. 
Applicant’s arguments, with respect to the rejection(s) of claim(s) 7 under 35 U.S.C. 112(d) have been fully considered and are persuasive. Therefore, the previous rejection set forth in the previous office action mailed 01/06/2021 has been withdrawn. 





Response to Arguments
Applicant's arguments regarding the rejection under 35 U.S.C. 103 have been fully considered but they are not persuasive. 
	First and foremost, the applicant has amended the claims with subject matter that has not been previously examined. Therefore, arguments regarding such language are rendered moot. The examiner refers to the rejection under 35 U.S.C. 103 below. 
	Second, the applicant has not cited or referenced the applied art in any way but rather makes the conclusory statement (Pg. 20 of remarks filed 04/06/2021): 
	It is respectfully submitted each cited reference analyzed individually, or an analysis of any combination of the cited references…fails to teach or suggest the presented claimed in amended claim 1, when analyzed as a whole and as described in the overview summary above.
	
This argument cannot be found persuasive as the applicant’s remarks do not comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
	The examiner reminds the applicant of MPEP 2141(IV) which describes how an applicant should respond to a factual finding (as is the case here). Specifically, the applicant’s above argument is nothing more than:
A mere statement or argument that the Office has not established a prima facie case of obviousness or that the Office’s reliance on common knowledge is unsupported by documentary evidence

The MPEP is clear on this and says that such a statement “…will not be considered substantively adequate to rebut the rejection or an effective traverse of the rejection…”


Examiner’s Remarks
Due at least to the current amendments, prior art of record Abu-Mostafa et al. (US 2015/0206067, hereinafter “Abu”) is no longer required to teach the claims for which Abu was relied upon in at least the previous action. Therefore, this reference has been removed. 
	This should NOT be understood as a concession regarding Abu’s teachings or the subject matter of the amended claim language. Abu is simply no longer necessary.
	The examiner may re-introduce Abu in subsequent actions if deemed necessary. The examiner notes the above for clarity of record. 

Claim Interpretation
	The examiner notes the language of representative Claim 1 and specifically the two “extracting” limitations. 
	Under the broadest reasonable interpretation (BRI) of these limitations, the output of the first “extracting” limitation is a training data histogram and the output of the second “extracting” limitation is a test data histogram. The claim requires nothing more. 
	That is, and specifically addressing the amended limitation (“wherein a value…”), the specific features allegedly required by the “extracting” limitations are inherent features that necessarily define a histogram. 

	The examiner notes with importance that the NIST reference quoted below is NOT relied upon in the rejection under 35 U.S.C. 103, but rather is simply used to provide evidence of the well-known understanding and definition of a histogram. 
	In particular, NIST recites that: 
“The purpose of a histogram…is to graphically summarize the distribution of a univariate data set...”
	
NIST continues: 
“…The most common form of the histogram is obtained by splitting the range of the data into equal-sized bins (called classes). Then for each bin, the number of points from the data set that fall into each bin are counted. That is…Vertical axis: Frequency (i.e., counts for each bin)…Horizontal Axis: Response Variable…”
	
Returning to the claim language, the above citation from NIST (e.g., frequency) shows that at least the claimed feature of: 
wherein a value in the each feature value bin represents a number of times the respective feature has attained a value in that each feature value bin’s feature value range from analysis of the entire collection of training data by the each of the at least one classifier

is an inherent feature of histograms and thus if the applied art teaches the use of histograms, the applied art necessarily teaches at least the above feature.  
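
For illustration only, the bin counting described in the NIST passage can be sketched in Python as follows. The function name `histogram_counts`, the value range, and the sample values are hypothetical and are not taken from NIST or from any reference of record; the sketch assumes equal-width bins and ignores values outside the range.

```python
def histogram_counts(values, lo, hi, num_bins):
    """Count how many values fall into each of num_bins equal-width bins on [lo, hi]."""
    if num_bins <= 0 or hi <= lo:
        raise ValueError("need num_bins > 0 and hi > lo")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        if v < lo or v > hi:
            continue  # assumption: out-of-range values are simply skipped
        # The upper edge hi is placed in the last bin so that it is still counted.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical sample data: six observations of a single response variable.
sample = [0.05, 0.10, 0.45, 0.50, 0.55, 0.95]
counts = histogram_counts(sample, 0.0, 1.0, 4)
```

Each entry of `counts` is the number of times the variable attained a value within that bin's range, which is the frequency shown on the vertical axis in the NIST description.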

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:


Claims 1-5, 8-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ijiri et al. ("Human Re-Identification Through Distance Metric Learning Based on Jensen-Shannon Kernel", NPL 2012) in view of Jin et al. ("Landmark selection for scene matching with knowledge of color histogram", NPL 2014).

With respect to Claim 1, Ijiri teaches a computer-implemented method comprising: collecting by at least one processor of at least one computing device of a machine learning system, training data having meta-data information used for training the machine learning system, resulting in a collection of training data (Pg. 605 Col. 1 "The training datasets from many surveillance cameras under different conditions and corresponding subject labels are assumed available for metric learning…the optimal distance metric is learned with a training dataset in the proposed work."). 
Ijiri also teaches collecting by the at least one processor, test data lacking meta-data information, resulting in a collection of test data (Pg. 605 Col. 2 "In the re-identification process, color histograms of test images, T…are obtained, where K is the number of images to be matched. Histograms M and T are matched using the learned distance metric." The examiner notes that the purpose of the Ijiri paper is to re-identify a person based on clothing color, skin color, etc. The first image (i.e. training 
Ijiri further teaches training at least one classifier of the machine learning system with the collection of training data, the at least one classifier including a set of features for classifying the training data (Pg. 605 Col. 2 “To learn the optimal metric, color histograms X... are firstly computed from the training dataset…” Pg. 606 Col. 2 describes the use of training Large Margin Component Analysis and X, the training dataset, is described as being used. The examiner notes that Large Margin Component Analysis (LMCA) teaches “at least one classifier of the machine learning system.” Further, the color histograms, the colors themselves, and/or any or all of the metrics used as input into the classifier teach “a set of features.”). 
Ijiri further teaches extracting feature response values for each feature in the set of features over the entire collection of training for each of the at least one classifier, from analysis of the entire collection of training data by each of the at least one classifier resulting in a training data extraction (Pg. 605 Col. 2 “For each person, color histograms mc are extracted as models…” Further note the description of what a clothing color histogram is made up of. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels 
Ijiri further teaches quantizing the feature response values into individual feature value bins in a set of feature value bins in a training data feature response histogram for each feature in the set of features associated with the training data in which each feature value bin has a feature value range, and aggregating all quantized feature response values in each feature value bin in the set of feature value bins in the training data feature response histogram for the each feature in the set of features associated with the training data from the analysis of all of the collection training data and wherein a value in the each feature value bin represents a number of times the respective feature has attained a value in that each feature value bin’s feature value range from analysis of the entire collection of training data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The examiner notes that this limitation, under BRI, is creating a histogram for the training data set.). 
Ijiri further teaches extracting feature response values for the each feature in the set of features over the entire collection of test data for each of the at least classifier, from analysis of the entire collection of test data by the each of the at least one classifier resulting in a test data extraction, quantizing the feature response values in to individual feature value bins in a set of feature value bins in a test data feature response histogram associated with the each feature in the set of features associated with the test data in which each feature value bin has a feature value range, and aggregating all quantized feature responses value in each feature value bin in the set of feature value bins in the test data feature response histogram associated with the each feature in the set of features associated with the test data from the analysis of all of the collection of test data and wherein a value in the each feature value bin represents a number of time the respective each feature attained a value in that each feature value bin’s feature value range from analysis of the entire collection of test data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…"). 
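
For illustration only, the quantization of HSV values into a joint histogram with five bins for each of the H and S channels and three bins for the V channel can be sketched in Python as follows. The function name `hsv_joint_histogram`, the scaling of each channel to the interval from 0 to 1, and the sample pixels are assumptions made for this sketch; they are not taken from Ijiri, and the vertical segmentation into P sub-regions is not modeled.

```python
def hsv_joint_histogram(pixels, bins=(5, 5, 3)):
    """Build a joint H-S-V histogram.

    pixels: iterable of (h, s, v) tuples, each channel assumed scaled to [0, 1].
    bins:   number of bins for the H, S, and V channels (bh, bs, bv).
    """
    bh, bs, bv = bins
    hist = [0] * (bh * bs * bv)
    for h, s, v in pixels:
        # Quantize each channel; a value of exactly 1.0 goes into the last bin.
        ih = min(int(h * bh), bh - 1)
        i_s = min(int(s * bs), bs - 1)
        iv = min(int(v * bv), bv - 1)
        # Flatten the three bin indices into one joint-bin index and count it.
        hist[(ih * bs + i_s) * bv + iv] += 1
    return hist


# Hypothetical pixels: two dark, unsaturated pixels and two bright, saturated ones.
pixels = [(0.0, 0.0, 0.0), (0.05, 0.1, 0.2), (0.99, 0.99, 0.99), (1.0, 1.0, 1.0)]
hist = hsv_joint_histogram(pixels)
```

With these bin counts the joint histogram has 5 x 5 x 3 = 75 bins, and each bin holds the number of pixels whose quantized H, S, and V values fall into that bin's range.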
Ijiri further teaches performing a low-dimensional comparison of the training data feature response histograms with the test data feature response histograms for the same respective features in the training data and in the test data using a Jensen-Shannon Divergence technique (Pg. 608 Col. 1 describes the use of a 
Ijiri does not explicitly disclose presenting in a user interface communicatively coupled with the at least one computing device an indication of a similarity between the collection of training data to the collection of test data for each feature of the same respective features in the training data and in the test data, based on the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms using the Jensen-Shannon Divergence technique. 
Jin, however, does teach presenting in a user interface communicatively coupled with the at least one computing device an indication of a similarity between the collection of training data to the collection of test data for each feature of the same respective features in the training data and in the test data, based on the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms using the Jensen-Shannon Divergence technique (Jin Pg. 3 Col. 2 “A probabilistic divergence measure is used to measure the population similarity between the landmark H1_training and the one Hi_testing on the test image for each patch in the test image. In this work…Jensen-Shannon Divergence are considered.” Pg. 4 Col. 1 “Finally population similarity between [the pixels] is given by [Equation 10]. Hence, the selection result is provided by the population similarity measures over the test image…” Pg. 5 Col. 1 Section A “Landmark selection results based on Population Sampling…The results of landmarks selection for roads calculated using different criteria of divergence is shown in Fig.7, where the 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the present invention to combine the histogram creation and comparison as taught by Ijiri with the presentation of similarity on a user interface as taught by Jin because this would allow a user to visually see the differences between two images, thus improving the user’s experience (Jin Fig.9 and Pg. 7 Col. 2). 

With respect to Claim 2, the combination of Ijiri and Jin teaches presenting for each feature in the set of feature the low-dimensional comparison of the training data feature response histogram with the test feature response histogram on the user interface, and displaying via the user interface identification of particular one or more features in the set of features for which the comparison creates the largest difference (Ijiri, Pg. 608 Figure 4. Alternatively Jin Fig. 9). 
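
For illustration only, a per-feature comparison that identifies the feature with the largest difference can be sketched in Python as follows. The helper names `jensen_shannon_divergence` and `rank_features`, the feature names, and the histogram values are hypothetical; the sketch uses the standard base-2 Jensen-Shannon Divergence and is not asserted to be the specific computation of Ijiri or Jin.

```python
import math


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon Divergence between two histograms of equal length.

    Raw counts are accepted; both inputs are first normalized to sum to one.
    """
    sp, sq = sum(p), sum(q)
    if sp <= 0 or sq <= 0:
        raise ValueError("histograms must have positive total mass")
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def rank_features(train_hists, test_hists):
    """Return (feature, divergence) pairs sorted from largest to smallest difference."""
    scores = {
        name: jensen_shannon_divergence(train_hists[name], test_hists[name])
        for name in train_hists
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical per-feature histograms for a training set and a test set.
train = {"feature_a": [10, 10, 10, 10], "feature_b": [40, 0, 0, 0]}
test = {"feature_a": [9, 11, 10, 10], "feature_b": [0, 0, 0, 40]}
ranking = rank_features(train, test)
largest_feature, largest_value = ranking[0]
```

In this sketch `largest_feature` is `"feature_b"`, whose training and test histograms do not overlap at all, so its divergence reaches the base-2 maximum of 1.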


With respect to Claim 3, the combination of Ijiri and Jin teaches wherein the low-dimensional comparison includes a pairwise dimensional comparison which generates a numerical distance between predetermined features in the set of features in the training data and in the test data (Ijiri, Pg. 608 Figure 4. Alternatively Jin Fig. 9; Ijiri Pg. 608 Col. 1 describes the use of a Jensen-Shannon kernel which uses the Jensen-Shannon Divergence. This divergence is used to compare the color histogram from the training image to the color histogram of the test image. The examiner notes that the Jensen-Shannon Divergence, or any statistical comparison technique, will generate a numerical distance (e.g. a value).
Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k (a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise.).
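
For illustration only, the standard textbook definition of the Jensen-Shannon Divergence between two normalized histograms P and Q is given below. This is the general definition and is not reproduced from Ijiri's Equation 15; the symbols P, Q, M, p_i, q_i, and m_i are notation chosen for this sketch.

```latex
\mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
\qquad M = \tfrac{1}{2}\,(P + Q),

\mathrm{KL}(P \,\|\, M) = \sum_{i} p_i \log_2 \frac{p_i}{m_i},
\qquad m_i = \tfrac{1}{2}\,(p_i + q_i),

0 \;\le\; \mathrm{JSD}(P \,\|\, Q) \;\le\; 1 \quad \text{(base-2 logarithm)}.
```

The definition takes exactly two distributions as input, is symmetric in P and Q, and returns a single number, which is zero when the two histograms are identical and one when they share no occupied bin.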

With respect to Claim 4, the combination of Ijiri and Jin teaches wherein each feature of the set of feature for classifying the training and the test data has numerical feature response values that are normalized to a feature value range from zero to one, before performing the low-dimensional comparison (Ijiri Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The examiner notes that these two passages continue to say that the values are normalized. A person of ordinary skill in the art would know that if values are normalized a certain value range is defined, usually [0, 1]. The examiner further notes that because Ijiri compares the color histograms, it logically follows that the normalization of the histograms must have been performed “before…the low-dimensional comparison.”). 
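
For illustration only, one common way of normalizing a histogram so that every bin value lies between zero and one is to divide each count by the total count, as sketched in Python below. The function name `normalize_histogram` and the sample counts are hypothetical, and sum-to-one normalization is an assumption of this sketch; the passages quoted from Ijiri do not state which normalization is used.

```python
def normalize_histogram(counts):
    """Scale raw bin counts so they sum to one; each value then lies in [0, 1]."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an empty histogram")
    return [c / total for c in counts]


# Hypothetical raw counts for a four-bin histogram.
normalized = normalize_histogram([2, 1, 2, 1])
```

Because the normalization is applied to the histogram itself, it is necessarily completed before any comparison that takes the normalized histogram as input.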

With respect to Claim 5, the combination of Ijiri and Jin teaches wherein the low-dimensional comparison of the each training data feature response histogram with the respective each test data feature response histogram includes at least a pairwise dimensional comparison of a pair of the features selected from the set of features (Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k (a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise.). 

With respect to Claim 8, the combination of Ijiri and Jin teaches wherein the training data comprises an image having at least one of objects or concepts represented by the image (Ijiri teaches where the image in the training set is a picture of a person wearing colored clothing; here the meta-data is the person, clothing color, etc. This is shown at least in Figure 1, Figure 2, and Figure 6; alternatively, the Jin reference has a training image of a landscape and the objects or concepts and corresponding meta-data are the individual landmarks (i.e. roads, rivers, buildings, etc.). This can be seen in at least Figs. 4-11). 

With respect to Claim 9, the combination of Ijiri and Jin teaches wherein the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms includes generating a numerical distances between predetermined features in the set of features in the training data and the test data (Ijiri, Pg. 608 Figure 4. Alternatively Jin Fig. 9; Ijiri Pg. 608 Col. 1 describes the use of a Jensen-Shannon kernel which uses the Jensen-Shannon Divergence. This divergence is used to compare the color histogram from the training image to the color histogram of the test image. The examiner notes that the Jensen-Shannon Divergence, or any statistical comparison technique, will generate a numerical distance (e.g. a value).
Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k (a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise.).
	 


With respect to Claim 10, the combination of Ijiri and Jin also teaches wherein the low-dimensional comparison of the training data feature response histogram with the test data feature response histogram is at least a pairwise dimension comparison of a pair of the features selected from the set of features and wherein the pairwise dimensional comparison provides a predetermined feature relationship between predetermined pairs of features in the set of features of the training data and the test data (Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k (a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise. The examiner notes that in general, Ijiri aims to Re-Identify humans based on a distance from color histograms. For example 


With respect to Claim 11, Ijiri teaches a system comprising: at least one memory; and at least one processor of a machine learning system communicatively coupled to the at least one memory, the at least one processor responsive to instructions stored in memory, being configured to perform a method comprising: 
collecting training data having meta-data information used for training the machine learning system, resulting in a collection of training data (Pg. 605 Col. 1 "The training datasets from many surveillance cameras under different conditions and corresponding subject labels are assumed available for metric learning…the optimal distance metric is learned with a training dataset in the proposed work."). 
Ijiri also teaches collecting test data lacking meta-data information, resulting in a collection of test data (Pg. 605 Col. 2 "In the re-identification process, color histograms of test images, T…are obtained, where K is the number of images to be matched. Histograms M and T are matched using the learned distance metric." The examiner notes that the purpose of the Ijiri paper is to re-identify a person based on clothing color, skin color, etc. The first image (i.e. training image) is used as comparison to the second image (i.e. test image). Therefore the first image will have meta-data relating to clothing color, name of the person, etc. Therefore, the second image which is 
Ijiri further teaches training at least one classifier of the machine learning system with the collection of training data, the at least one classifier including a set of features for classifying the training data (Pg. 605 Col. 2 “To learn the optimal metric, color histograms X... are firstly computed from the training dataset…” Pg. 606 Col. 2 describes the use of training Large Margin Component Analysis and X, the training dataset, is described as being used. The examiner notes that Large Margin Component Analysis (LMCA) teaches “at least one classifier of the machine learning system.” Further, the color histograms, the colors themselves, and/or any or all of the metrics used as input into the classifier teach “a set of features.”). 
Ijiri further teaches extracting feature response values for each feature in the set of features over the entire collection of training for each of the at least one classifier, from analysis of the entire collection of training data by each of the at least one classifier resulting in a training data extraction (Pg. 605 Col. 2 “For each person, color histograms mc are extracted as models…” Further note the description of what a clothing color histogram is made up of. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The extraction of histograms from input images teaches “extraction feature values…” The examiner notes that these two passages continue to say that the values are normalized. A person of ordinary skill in 
Ijiri further teaches quantizing the feature response values into individual feature value bins in a set of feature value bins in a training data feature response histogram for each feature in the set of features associated with the training data in which each feature value bin has a feature value range, and aggregating all quantized feature response values in each feature value bin in the set of feature value bins in the training data feature response histogram for the each feature in the set of features associated with the training data from the analysis of all of the collection training data and wherein a value in the each feature value bin represents a number of times the respective feature has attained a value in that each feature value bin’s feature value range from analysis of the entire collection of training data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The examiner notes that this limitation, under BRI, is creating a histogram for the training data set.). 
Ijiri further teaches extracting feature response values for the each feature in the set of features over the entire collection of test data for each of the at least classifier, from analysis of the entire collection of test data by the each of the at least one classifier resulting in a test data extraction, quantizing the feature response values in to individual feature value bins in a set of feature value bins in a test data feature response histogram associated with the each feature in the set of features associated with the test data in which each feature value bin has a feature value range, and aggregating all quantized feature responses value in each feature value bin in the set of feature value bins in the test data feature response histogram associated with the each feature in the set of features associated with the test data from the analysis of all of the collection of test data and wherein a value in the each feature value bin represents a number of time the respective each feature attained a value in that each feature value bin’s feature value range from analysis of the entire collection of test data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…"). 
Ijiri further teaches performing a low-dimensional comparison of the training data feature response histograms with the test data feature response histograms for the same respective features in the training data and in the test data using a Jensen-Shannon Divergence technique (Pg. 608 Col. 1 describes the use of a Jensen-Shannon kernel which uses the Jensen-Shannon divergence criterion which is a statistical comparison technique. This divergence is used to compare the color histogram from the training image to the color histogram of the test image.). 
Ijiri does not explicitly disclose presenting in a user interface communicatively coupled with the at least one computing device an indication of a similarity between the collection of training data to the collection of test data for each feature of the same respective features in the training data and in the test data, based on the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms using the Jensen-Shannon Divergence technique. 

Jin, however, does teach presenting in a user interface communicatively coupled with the at least one computing device an indication of a similarity between the collection of training data to the collection of test data for each feature of the same respective features in the training data and in the test data, based on the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms using the Jensen-Shannon Divergence technique (Jin Pg. 3 Col. 2 “A probabilistic divergence measure is used to measure the population similarity between the landmark H1_training and the one Hi_testing on the test image for each patch in the test image. In this work…Jensen-Shannon Divergence are considered.” Pg. 4 Col. 1 “Finally population similarity between [the pixels] is given by [Equation 10]. Hence, the selection result is provided by the population similarity measures over the test image…” Pg. 5 Col. 1 Section A “Landmark selection results based on Population Sampling…The results of landmarks selection for roads calculated using different criteria of divergence is shown in Fig.7, where the [Jensen-Shannon Divergence] JSD shows the best contrast…” Note at least Figure 7 and Figure 9. Both show the test image when compared to the known features and images of, for example “roads.” Note especially the contrast which clearly highlights the 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the present invention to combine the histogram creation and comparison as taught by Ijiri with the presentation of similarity on a user interface as taught by Jin because this would allow a user to visually see the differences between two images, thus improving the user’s experience (Jin Fig.9 and Pg. 7 Col. 2). 

With respect to Claim 12, the combination of Ijiri and Jin teaches the user interface for presenting for each feature in the set of features the low-dimensional comparison of the training data feature response histogram with the test data feature response histogram on the user interface, and displaying via the user interface identification of particular one or more features in the set of features for which the comparison creates the largest difference (Ijiri, Pg. 608 Figure 4. Alternatively Jin Fig. 9). 

With respect to Claim 13, the combination of Ijiri and Jin teaches wherein the training data comprises an image having at least one of objects or concepts represented by the image (Ijiri teaches where the image in the training set is a picture of a person wearing colored clothing; here the meta-data is the person, clothing color, etc. This is shown at least in Figure 1, Figure 2, and Figure 6; alternatively, the Jin 

With respect to Claim 15, the combination of Ijiri and Jin teaches wherein the low-dimensional comparison includes a pairwise dimensional comparison which generates a numerical distance between predetermined features in the set of features in the training data and in the test data (Ijiri, Pg. 608 Figure 4. Alternatively Jin Fig. 9; Ijiri Pg. 608 Col. 1 describes the use of a Jensen-Shannon kernel which uses the Jensen-Shannon Divergence. This divergence is used to compare the color histogram from the training image to the color histogram of the test image. The examiner notes that the Jensen-Shannon Divergence, or any statistical comparison technique, will generate a numerical distance (e.g. a value).
Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k (a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise.).

	With respect to Claim 16, the combination of Ijiri and Jin teaches wherein the user interface being further for presenting by displaying at least one of: the differences compared between features of the training data and the corresponding features of the test data; and identification of at least one feature that created the largest difference between the features of the training data and corresponding features of the test data (Jin Pg. 3 Col. 2 “A probabilistic divergence measure is used to measure the population similarity between the landmark H1_training and the one Hi_testing on the test image for each patch in the test image. In this work…Jensen-Shannon Divergence are considered.” Pg. 4 Col. 1 “Finally population similarity between [the pixels] is given by [Equation 10]. Hence, the selection result is provided by the population similarity measures over the test image…” Pg. 5 Col. 1 Section A “Landmark selection results based on Population Sampling…The results of landmarks selection for roads calculated using different criteria of divergence is shown in Fig.7, where the [Jensen-Shannon Divergence] JSD shows the best contrast…” Note at least Figure 7 and Figure 9. Both show the test image when compared to the known features and images of, for example “roads.” Note especially the contrast which clearly highlights the roads. This indication of a landmark by contrast teaches “indication of similarity”. The examiner further notes that as shown in Figure 7 and Figure 9 this indication is presented on a screen (user interface). Thus, Jin teaches the claim language as required.). 
With respect to Claim 17, the combination of Ijiri and Jin teaches wherein the set of features for classifying the training data and the set of features for classifying the test data each have the same multiple features and each of the multiple features have numerical feature response values that are normalized to a feature value range from zero to one, before performing the low-dimensional comparison (Ijiri Pg. 607 Section 2.3 “In this section kernel function that are suitable for matching two distributions $a, b \in \mathbb{R}^{D}_{[0,1]}$ are investigated, such as normalized histograms or probability distributions…” Note that these kernel functions, which include the Jensen-Shannon kernel, therefore operate on normalized histograms, as the claim language requires.). 

With respect to Claim 18, the combination of Ijiri and Jin teaches wherein the low-dimensional comparison of the each training data feature response histogram with the respective each test data feature response histogram includes at least a pairwise dimensional comparison of a part of the feature selected from the set of features (Ijiri Pg. 608 Equation 15. Note that the kernel function is equal to k(a, b) and thus is at least a pairwise comparison. The examiner further notes that any “comparison” is at least pairwise.).

	With respect to Claim 19, the combination of Ijiri and Jin teaches wherein the Jensen-Shannon Divergence technique, performed by the method, providing a result in a range between 0 and 1 where 0 signifies zero differences and 1 signifies a maximal difference or alternatively where 0 signifies the maximal difference and 1 signifies zero differences in the comparison (Ijiri Pg. 607 Section 2.3 “In this section kernel function that are suitable for matching two distributions $a, b \in \mathbb{R}^{D}_{[0,1]}$ are investigated, such as normalized histograms or probability distributions…” Pg. 608 Col. 1 describes the use of the Jensen-Shannon kernel, which uses the Jensen-Shannon divergence. Pg. 608 Col. 2 Section 2.4 describes how matching is performed; note Eq. 17. Eq. 17 shows that the similarity function takes the max from the color histograms as shown in Eq. 16. The values used within the max function of Eq. 17 come from the Jensen-Shannon Divergence metric. Since the histograms are normalized to 

With respect to Claim 20, Ijiri teaches A non-transitory computer-readable medium having stored therein instructions which, when executed by at least one processor, cause a machine learning system to perform a method comprising: collecting by the at least one processor of the machine learning system, training data having meta-data information used for training the machine learning system, resulting in a collection of training data (Pg. 605 Col. 1 "The training datasets from many surveillance cameras under different conditions and corresponding subject labels are assumed available for metric learning…the optimal distance metric is learned with a training dataset in the proposed work."). 
Ijiri also teaches collecting by the at least one processor, test data lacking meta-data information, resulting in a collection of test data (Pg. 605 Col. 2 "In the re-identification process, color histograms of test images, T…are obtained, where K is the number of images to be matched. Histograms M and T are matched using the learned distance metric." The examiner notes that the purpose of the Ijiri paper is to re-identify a person based on clothing color, skin color, etc. The first image (i.e. training image) is used as a comparison to the second image (i.e. test image). Therefore the first image will have meta-data relating to clothing color, name of the person, etc. Therefore, 
Ijiri further teaches training at least one classifier of the machine learning system with the collection of training data, the at least one classifier including a set of features for classifying the training data (Pg. 605 Col. 2 “To learn the optimal metric, color histograms X... are firstly computed from the training dataset…” Pg. 606 Col. 2 describes training Large Margin Component Analysis, with X, the training dataset, described as being used. The examiner notes that Large Margin Component Analysis (LMCA) teaches “at least one classifier of the machine learning system.” Further, the color histograms, the colors themselves, and/or any or all of the metrics used as input into the classifier teach “a set of features.”). 
Ijiri further teaches extracting feature response values for each feature in the set of features over the entire collection of training data for each of the at least one classifier, from analysis of the entire collection of training data by each of the at least one classifier resulting in a training data extraction (Pg. 605 Col. 2 “For each person, color histograms mc are extracted as models…” Further note the description of what a clothing color histogram is made up of. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The extraction of histograms from input images teaches “extracting feature response values…” The examiner notes that these two 
Ijiri further teaches quantizing the feature response values into individual feature value bins in a set of feature value bins in a training data feature response histogram for each feature in the set of features associated with the training data in which each feature value bin has a feature value range, and aggregating all quantized feature response values in each feature value bin in the set of feature value bins in the training data feature response histogram for the each feature in the set of features associated with the training data from the analysis of all of the collection training data and wherein a value in the each feature value bin represents a number of times the respective feature has attained a value in that each feature value bin’s feature value range from analysis of the entire collection of training data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The examiner notes that this limitation, under BRI, is creating a histogram for the training data set.). 
Ijiri further teaches extracting feature response values for the each feature in the set of features over the entire collection of test data for each of the at least one classifier, from analysis of the entire collection of test data by the each of the at least one classifier resulting in a test data extraction, quantizing the feature response values into individual feature value bins in a set of feature value bins in a test data feature response histogram associated with the each feature in the set of features associated with the test data in which each feature value bin has a feature value range, and aggregating all quantized feature response values in each feature value bin in the set of feature value bins in the test data feature response histogram associated with the each feature in the set of features associated with the test data from the analysis of all of the collection of test data and wherein a value in the each feature value bin represents a number of times the respective each feature attained a value in that each feature value bin’s feature value range from analysis of the entire collection of test data by the each of the at least one classifier (Pg. 605 Figure 2. Note Compute histograms. Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…"). 
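For illustration only, the quantizing and aggregating described above can be sketched as follows, assuming each pixel is an (h, s, v) tuple with every channel scaled to the range 0 to 1. The bin counts of five, five, and three follow the quoted passage from Pg. 609; the function name joint_hsv_histogram, the bin indexing scheme, and the sample pixels are hypothetical and are not taken from Ijiri.

```python
def joint_hsv_histogram(pixels, bins_h=5, bins_s=5, bins_v=3):
    """Count how many pixels fall into each joint (H, S, V) bin.

    Each pixel is an (h, s, v) tuple; every channel is assumed to lie in [0, 1].
    """
    hist = [0] * (bins_h * bins_s * bins_v)
    for h, s, v in pixels:
        # min(...) keeps a channel value of exactly 1.0 inside the last bin.
        ih = min(int(h * bins_h), bins_h - 1)
        i_s = min(int(s * bins_s), bins_s - 1)
        iv = min(int(v * bins_v), bins_v - 1)
        # Flatten the three bin indices into a single position.
        hist[(ih * bins_s + i_s) * bins_v + iv] += 1
    return hist


# Hypothetical pixels: two dark, low-saturation pixels and one bright one.
pixels = [(0.05, 0.10, 0.20), (0.07, 0.15, 0.30), (0.95, 1.00, 1.00)]
hist = joint_hsv_histogram(pixels)
print(len(hist), hist[0], hist[-1])
```

Each bin value in this sketch is the number of times a pixel landed in that bin's value range, which mirrors the counting recited in the limitation.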
Ijiri further teaches performing a low-dimensional comparison of the training data feature response histograms with the test data feature response histograms for the same respective features in the training data and in the test data using a Jensen-Shannon Divergence technique (Pg. 608 Col. 1 describes the use of a Jensen-Shannon kernel which uses the Jensen-Shannon divergence criterion which is a statistical comparison technique. This divergence is used to compare the color histogram from the training image to the color histogram of the test image.). 
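For illustration only, a histogram comparison using the Jensen-Shannon divergence can be sketched as follows, assuming both histograms are normalized to sum to one and that base-2 logarithms are used, in which case the result lies between 0 (identical histograms) and 1 (histograms with no overlap). The function names, feature names, and sample histograms are hypothetical and are not taken from Ijiri or Jin.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD of two normalized histograms; lies in [0, 1] with log base 2."""
    # m is the midpoint distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical per-feature histograms for training and test data.
train_hists = {"hue": [0.2, 0.5, 0.3], "saturation": [1.0, 0.0, 0.0]}
test_hists = {"hue": [0.2, 0.5, 0.3], "saturation": [0.0, 0.0, 1.0]}

# Compare the same feature in both collections, one feature at a time.
divergences = {
    name: jensen_shannon_divergence(train_hists[name], test_hists[name])
    for name in train_hists
}
most_different = max(divergences, key=divergences.get)
print(divergences, most_different)
```

In this sketch the feature with the largest divergence is the one whose training and test histograms differ most.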

Jin, however, does teach presenting in a user interface communicatively coupled with the at least one computing device an indication of a similarity between the collection of training data to the collection of test data for each feature of the same respective features in the training data and in the test data, based on the low-dimensional comparison of the training data feature response histograms with the test data feature response histograms using the Jensen-Shannon Divergence technique (Jin Pg. 3 Col. 2 “A probabilistic divergence measure is used to measure the population similarity between the landmark H1_training and the one Hi_testing on the test image for each patch in the test image. In this work…Jensen-Shannon Divergence are considered.” Pg. 4 Col. 1 “Finally population similarity between [the pixels] is given by [Equation 10]. Hence, the selection result is provided by the population similarity measures over the test image…” Pg. 5 Col. 1 Section A “Landmark selection results based on Population Sampling…The results of landmarks selection for roads calculated using different criteria of divergence is shown in Fig.7, where the [Jensen-Shannon Divergence] JSD shows the best contrast…” Note at least Figure 7 and Figure 9. Both show the test image when compared to the known features and images of, for example “roads.” Note especially the contrast which clearly highlights the 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the present invention to combine the histogram creation and comparison as taught by Ijiri modified with the presentation of similarity on a user interface as taught by Jin because this would allow a user to visually see the differences between two images and thus improve the user’s experience (Jin Fig.9 and Pg. 7 Col. 2). 

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Ijiri et al.  ("Human Re-Identification Through Distance Metric Learning Based on Jensen-Shannon Kernel", NPL 2012) in view of Jin et al. ("Landmark selection for scene matching with knowledge of color histogram", NPL 2014) and further in view of Shankar Reddy et al. (“Probabilistic Detection Methods for acoustic surveillance using Audio Histograms”, NPL 2014; Hereinafter “Shankar”). 

With respect to Claim 14, the combination of Ijiri and Jin teaches all of the limitations of Claim 11 as described above. 
The combination of Ijiri and Jin, however, does not appear to explicitly disclose wherein the training data comprises audio having feature represented by the audio and further including corresponding meta-data representing the features. 
Shankar, however, does teach wherein the training data comprises audio having feature represented by the audio and further including corresponding meta-data representing the features (Title; Figure 3. Pg. 1982 Section 3.2 “The distribution of audio data can be characterized by histograms. In general, audio histograms are computed by splitting the data points (feature vectors) into equal-sized bins and counting number of data points in each bin.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the present invention to combine the histograms and method as taught by the combination of Ijiri and Jin modified with the audio input as taught by Shankar because this would allow the measurement of different audio sequences by use of a histogram (Shankar Pg. 1982). 
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Ijiri et al.  ("Human Re-Identification Through Distance Metric Learning Based on Jensen-Shannon Kernel", NPL 2012) in view of Jin et al. ("Landmark selection for scene matching with knowledge of color histogram", NPL 2014) and further in view of Howbert (“Machine Learning: Logistic Regression”, UW lecture slides, NPL 2014).
With respect to Claim 21, the combination of Ijiri and Jin teaches all of the limitations of Claim 1 as described above. 
The combination of Ijiri and Jin also teaches normalizing the quantized and aggregated feature response values in each feature value bin in the set of feature value bins in the training data feature response histogram (Ijiri Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P 
The combination of Ijiri and Jin also teaches normalizing the quantized and aggregated feature response values in each feature value bin in the set of feature value bins in the test data feature response histogram for each feature in the set of features for the each of the at least one classifier (Ijiri Pg. 605 Col. 2 "Following this scheme, in this paper, the human region is segmented vertically into P pieces, and for each sub-region p HSV joint histograms... are computed, where bh, bs, bv are the number of bins in the H,S, and V color channels..." Pg. 609 Col. 2 "We used five bins for HS channels and three bins for V channel for quantization…" The examiner notes that these two passages continue to say that the values are normalized. A person of ordinary skill in the art would know that if values are normalized, a certain value range is defined, usually [0, 1].). 
The combination of Ijiri and Jin, however, does not appear to teach: 
Wherein the normalizing comprises mapping the feature response values in the training data feature response histogram into a logistic response curve by approximating the feature response values in the training data feature response histogram to fit the logistic response curve. 
Wherein the normalizing comprises mapping the feature response values in the test data feature response histogram into the logistic response curve by approximating the feature response values in the test data response histogram to fit the logistic response curve. 
Howbert, however, does teach wherein the normalizing comprises mapping the feature response values in the training data feature response histogram into a logistic response curve by approximating the normalized feature response values in the training data feature response histogram to fit the logistic response curve (Note Slide 5, which shows a logistic function. Next see Slide 8, which shows (3rd bullet point) that the value is mapped to a range of 0 to 1 using the logistic function. This teaches the claim language). 
Howbert also teaches wherein the normalizing comprises mapping the feature response values in the test data feature response histogram into the logistic response curve by approximating the feature response values in the test data response histogram to fit the logistic response curve (Note Slide 5, which shows a logistic function. Next see Slide 8, which shows (3rd bullet point) that the value is mapped to a range of 0 to 1 using the logistic function. This teaches the claim language). 
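For illustration only, mapping a value to the range of 0 to 1 with a logistic function can be sketched as follows, assuming the standard form 1 / (1 + e^(-x)). The slides themselves are not reproduced here; the function name logistic and the sample inputs are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function 1 / (1 + e^(-x)); output lies between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, this algebraically equivalent form avoids overflow in exp().
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical raw values: a negative value, zero, and a positive value.
raw_values = [-4.0, 0.0, 4.0]
mapped = [logistic(v) for v in raw_values]
print(mapped)
```

Under this assumption an input of zero maps to 0.5, and increasingly negative or positive inputs approach 0 and 1 respectively.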
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the normalizing as taught by the combination of Ijiri and Jin with the normalization by mapping to a logistic response curve as taught by Howbert because this would give good predictive accuracy (Howbert Slide 7).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FEN TAMULONIS whose telephone number is (571)272-0934.  The examiner can normally be reached on 7:30AM-5:30PM MON-FRI EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/F.C.T./Examiner, Art Unit 2126   




                                                                                                                          
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116