DETAILED ACTION
This action is in response to the claims filed 03/12/2019. Claims 1-14 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 14 is objected to because of the following informality: an apparent typographical error in the equation. The “floating zero” pointed out below should be removed.

[Image: media_image1.png (equation of claim 14 with the floating zero pointed out)]

Appropriate correction is required.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3-4 and 8-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.


Claim 3 recites the limitation “the samples”. There is insufficient antecedent basis for this limitation in the claim. Examiner suggests replacing “the samples” with “samples”.

Claim 8 recites the limitation “the loss weight”. There is insufficient antecedent basis for this limitation in the claim. Examiner suggests replacing “the loss weight” with “a loss weight”.

Claim 9 recites the limitations “the mean of the batch”, “the ith feature vector”, and “the class center”. There is insufficient antecedent basis for these limitations in the claim.

Claim 11 recites the limitations “the feature vector” and “the derivative”. There is insufficient antecedent basis for these limitations in the claim. Examiner suggests replacing “the feature vector” with “each feature vector” and replacing “the derivative” with “a derivative”.

Claim 13 recites the limitations “the feature vector” and “the derivative”. There is insufficient antecedent basis for these limitations in the claim. Examiner suggests replacing “the feature vector” with “each feature vector” and replacing “the derivative” with “a derivative”.

Dependent claims 4, 9, 10, 12, and 14 are rejected by virtue of dependency.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-14 are rejected under 35 U.S.C. 101 because:

Regarding Claim 1
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. Each of the following limitations:
providing a primary loss function;
augmenting the primary loss function with a secondary loss function
the secondary loss function comprising a planetary loss portion that minimizes intra-class variation of the feature vectors and a sun loss portion that maximizes the inter-class variation of the feature vectors
minimizing the augmented loss function for each feature vector in the batch
back propagating the results of the minimization of the augmented loss function into the deep neural network, such that the network learns an increased discrimination of the feature vectors.


as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)). The above limitations, in the context of this claim, encompass the following: “providing a primary loss function”, “augmenting the primary loss function…”, “the secondary loss function comprising a planetary loss portion…”, “minimizing the augmented loss function…”, and “back propagating the results of the minimization of the augmented loss function into the deep neural network…” (mathematical function). Providing and augmenting a mathematical function is simply prose for performing a mathematical calculation (MPEP 2106.04(a)(2)(C)). Both minimization and back propagation are mathematical algorithms. While the limitation “such that the network learns an increased discrimination of the feature vectors” generally suggests learning in a neural network, this is simply the intended use of the mathematical algorithm. As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not present additional elements that integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 2
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the secondary loss function maintains a center for each class of feature vectors and computes a mean of the batch of feature vectors.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitation, in the context of this claim, further defines the abstract idea of “augmenting the primary loss function with a secondary loss function”, the limitation being: “wherein the secondary loss function maintains a center for each class of feature vectors and computes a mean of the batch of feature vectors.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
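For illustration only, the calculation recited in claim 2 (a center for each class and a mean of the batch) can be sketched as follows. The function name, the toy data, and the choice to compute each center as the per-batch class mean are assumptions made for this sketch; the claim does not state how the center is maintained.

```python
import numpy as np

def class_centers_and_batch_mean(features, labels, num_classes):
    """Return one center per class (mean of that class's features) and the batch mean."""
    centers = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():  # leave the center at zero if the class is absent from the batch
            centers[c] = features[mask].mean(axis=0)
    batch_mean = features.mean(axis=0)
    return centers, batch_mean

# Hypothetical batch of four 2-D feature vectors in two classes.
features = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
labels = np.array([0, 0, 1, 1])
centers, batch_mean = class_centers_and_batch_mean(features, labels, num_classes=2)
```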

Regarding Claim 3
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the planetary loss portion of the secondary loss function minimizes intra-class variation by minimizing the cosine distance of the samples to their corresponding class center. 
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitation, in the context of this claim, further defines the abstract idea of “augmenting the primary loss function with a secondary loss function”, the limitation being: “wherein the planetary loss portion of the secondary loss function minimizes intra-class variation by minimizing the cosine distance of the samples to their corresponding class center.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 4
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the sun loss portion of the secondary loss function maximizes the inter-class variation of the feature vectors by maximizing the cosine distance of each feature vector away from the mean of the batch.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “augmenting the primary loss function with a secondary loss function” and further defines the abstract idea, the above limitation including: “wherein the sun loss portion of the secondary loss function maximizes the inter-class variation of the feature vectors by maximizing the cosine distance of each feature vector away from the mean of the batch.” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
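For illustration only, the cosine distance referred to in claims 3 and 4 can be sketched as below. The claims do not define the term; this sketch assumes the common definition of one minus the cosine similarity, and the function name, the `eps` guard, and the toy vectors are hypothetical.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity; eps guards against division by zero."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

x = np.array([1.0, 0.0])           # a feature vector
center = np.array([2.0, 0.0])      # class center pointing the same way as x
batch_mean = np.array([0.0, 3.0])  # batch mean orthogonal to x

d_center = cosine_distance(x, center)    # close to 0: x is aligned with its center
d_mean = cosine_distance(x, batch_mean)  # close to 1: x is far from the batch mean
```

Under this assumed definition, claim 3 drives the first quantity down and claim 4 drives the second quantity up.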

Regarding Claim 5
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the secondary loss function includes a loss weight enforcing a trade-off between the primary loss function and the secondary loss function
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “augmenting the primary loss function with a secondary loss function” and further defines the abstract idea, the above limitation including: “wherein the secondary loss function includes a loss weight enforcing a trade-off between the primary loss function and the secondary loss function” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 6
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the primary loss function is Softmax.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “augmenting the primary loss function with a secondary loss function” and further defines the abstract idea, the above limitation including: “wherein the primary loss function is Softmax.” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
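For illustration only, a softmax loss of the kind named in claim 6 can be sketched as the negative log of the softmax probability assigned to the true class. This form is an assumption; the claim says only that the primary loss function is Softmax, and the function name and inputs below are hypothetical.

```python
import numpy as np

def softmax_loss(logits, label):
    """Negative log-probability of the true class under a softmax over the logits."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

# With three equal logits every class has probability 1/3, so the loss is log(3).
loss_uniform = softmax_loss(np.array([0.0, 0.0, 0.0]), label=1)
```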

Regarding Claim 7
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the secondary loss function is minimized using stochastic gradient descent.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “minimizing the augmented loss function…” and further defines the abstract idea, the above limitation including: “wherein the secondary loss function is minimized using stochastic gradient descent.” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
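For illustration only, the stochastic gradient descent recited in claim 7 can be sketched on a toy one-dimensional objective. The objective, learning rate, step count, and all names below are assumptions chosen for the sketch and are not taken from the claims or the specification.

```python
import numpy as np

def sgd_step(param, grad, lr):
    """One gradient descent update: move against the gradient."""
    return param - lr * grad

rng = np.random.default_rng(0)
data = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical samples
w = 0.0
for _ in range(200):
    x = rng.choice(data)    # draw one sample at random (the "stochastic" part)
    grad = 2.0 * (w - x)    # gradient of the per-sample loss (w - x)^2
    w = sgd_step(w, grad, lr=0.05)
# w ends near the sample mean, with some noise from the random draws
```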

Regarding Claim 8
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the augmented loss function is given by: L_c = L_primary + λL_P + L_S where: λ is the loss weight; L_primary is the primary loss function; L_P is the planet loss portion of the secondary loss function; and L_S is the sun loss portion of the secondary loss function.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitation, in the context of this claim, further defines the abstract idea of “augmenting the primary loss function with a secondary loss function”, the limitation being: “wherein the augmented loss function is given by: L_c = L_primary + λL_P + L_S where: λ is the loss weight; L_primary is the primary loss function; L_P is the planet loss portion of the secondary loss function; and L_S is the sun loss portion of the secondary loss function.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
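For illustration only, the arithmetic of the augmented loss in claim 8 can be sketched as below. The sketch assumes that the λ-weighted term in the recited equation is the planet loss portion and that the sun loss portion is added unweighted; the function name and the numeric values are hypothetical.

```python
def augmented_loss(primary, planet, sun, loss_weight):
    """Primary loss plus the weighted planet loss plus the sun loss (assumed reading)."""
    return primary + loss_weight * planet + sun

# Hypothetical values: 1.2 + 0.1 * 0.5 + 0.3
total = augmented_loss(primary=1.2, planet=0.5, sun=0.3, loss_weight=0.1)
```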

Regarding Claim 9
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:

[Equation image: media_image2.png]
where: m is the batch size; xi is the ith feature vector in the batch; Pyi is the class center for the class in which feature vector xi is classified; s is the mean of the batch; and Beta is a margin for the sun loss portion.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitation, in the context of this claim, further defines the abstract idea of “augmenting the primary loss function with a secondary loss function”, the limitation being: “…where: m is the batch size; xi is the ith feature vector in the batch; Pyi is the class center for the class in which feature vector xi is classified; s is the mean of the batch; and Beta is a margin for the sun loss portion.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 10
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:

[Equation image: media_image3.png]

as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitation, in the context of this claim, further defines the abstract idea of “augmenting the primary loss function with a secondary loss function”, the limitation being the recited equation [Equation image: media_image4.png] (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 11
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein each feature vector is adjusted based on a planet loss gradient function representing the derivative of the planet loss portion of the secondary loss function with respect to the feature vector.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)), the above limitations in the context of this claim encompass the following: “wherein each feature vector is adjusted based on a planet loss gradient function representing the derivative of the planet loss portion of the secondary loss function with respect to the feature vector.” (Evaluation performed in the human mind with the aid of pen and paper). The step of adjusting the vector based on a mathematical function is an analytical step that can be performed by pen and paper. As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not present additional elements that integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 12
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the planet loss gradient function is given by 
[Equation image: media_image5.png]

as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “augmenting the primary loss function with a secondary loss function” and further defines the abstract idea, the above limitation including: “wherein the planet loss gradient function is given by…” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 13
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein each feature vector is further adjusted based on a sun loss gradient function representing the derivative of the sun loss portion of the secondary loss function with respect to the feature vector.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)), the above limitations in the context of this claim encompass the following: “wherein each feature vector is further adjusted based on a sun loss gradient function representing the derivative of the sun loss portion of the secondary loss function with respect to the feature vector.” (Evaluation performed in the human mind with the aid of pen and paper). The step of adjusting the vector based on a mathematical function is an analytical step that can be performed by pen and paper. As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not present additional elements that integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 14
Step 1 Analysis: The claim is directed to a method, which is a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method, in a deep neural network, for discriminating a batch of feature vectors. The following limitation:
wherein the sun loss gradient function is given by: 
[Equation image: media_image6.png]

as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “augmenting the primary loss function with a secondary loss function” and further defines the abstract idea, the above limitation including: “wherein the sun loss gradient function is given by: …” (Mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 5-7, 11, and 13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wen et al., “A Discriminative Feature Learning Approach for Deep Face Recognition” (hereinafter Wen).

Regarding claim 1
Wen teaches, A method, in a deep neural network, for discriminating a batch of feature vectors comprising: (Abstract “With the joint supervision of softmax loss and center loss, we can train a robust CNNs”) a. providing a primary loss function; b. augmenting the primary loss function with a secondary loss function (pg 3 ¶01 “The CNNs are trained under the joint supervision of the softmax loss and center loss” The softmax loss corresponds to the primary loss function provided by the CNN. The joint loss function comprises a primary loss function, the softmax loss, and a secondary loss function, the center loss.) the secondary loss function comprising a planetary loss portion that minimizes intra-class variation of the feature vectors and a sun loss portion that maximizes the inter-class variation of the feature vectors; (pg 3 ¶01 “With the joint supervision, not only the inter-class features differences are enlarged, but also the intra-class features variations [of the feature vectors] are reduced” pg 5 Section 3.2 “minimizing the intra-class variations while keeping the features of different classes separable is the key. To this end, we propose the center loss function” The joint supervision refers to the combined softmax loss and center loss functions. The center loss function, which corresponds to and includes both the planetary loss and the sun loss, minimizes intra-class variations and keeps features ‘separable’, i.e., maximizes the inter-class variations.) c. minimizing the augmented loss function for each feature vector in the batch; and (pg 5 Section 3.2 ¶03 “instead of updating the centers with respect to the entire training set, we perform the update based on mini-batch.” The loss is computed for each mini-batch; as previously stated, the loss function is minimized via the training.) d. back propagating the results of the minimization of the augmented loss function into the deep neural network, such that the network learns an increased discrimination of the feature vectors. (pg 6 Algorithm 1 “In Algorithm 1, we summarize the learning details in the CNNs with joint supervision.” Algorithm 1 describes backpropagation, which minimizes the Lc loss, or center loss, of the deep neural network. This increases discrimination because the loss function is designed, as previously cited, to maximize inter-class variation while minimizing intra-class variation.)

Regarding claim 5
Wen teaches claim 1
Further Wen teaches, wherein the secondary loss function includes a loss weight enforcing a trade-off between the primary loss function and the secondary loss function. ( pg 6 “We adopt the joint supervision of softmax loss and center loss … The formulation is given in Equation 5…. 
    [Equation image: media_image7.png]
…Clearly, the CNNs supervised by center loss are trainable and can be optimized by standard SGD. A scalar λ is used for balancing the two loss functions.” )
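For illustration only, the trade-off described in the passage quoted above can be sketched in a few lines of Python. This is a minimal sketch and is not taken from Wen or from the application: the function names are hypothetical, and the center loss is assumed to be half the squared Euclidean distance between a feature vector and its class center. The scalar `lam` plays the role of the balancing weight λ; setting it to 0 leaves only the softmax loss.

```python
import math

def softmax_loss(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Assumed form: half the squared Euclidean distance to the class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(logits, label, feature, centers, lam):
    """Primary (softmax) loss plus lam times the secondary (center) loss."""
    return softmax_loss(logits, label) + lam * center_loss(feature, centers[label])
```

With `lam = 0.0` the joint loss equals the softmax loss alone, which matches the quoted remark that a scalar balances the two loss functions.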

Regarding claim 6
Wen teaches claim 1
Further Wen teaches, wherein the primary loss function is Softmax. ( pg 6 “We adopt the joint supervision of softmax loss and center loss … The formulation is given in Equation 5…. 
    [Equation image: media_image7.png]
…Clearly, the CNNs supervised by center loss are trainable and can be optimized by standard SGD. A scalar λ is used for balancing the two loss functions. The conventional softmax loss can be considered as a special case of this joint supervision, if λ is set to 0.” The primary loss Ls is the softmax loss.)

Regarding claim 7
Wen teaches claim 1
Further Wen teaches, wherein the secondary loss function is minimized using stochastic gradient descent. ( pg 6 “We adopt the joint supervision of softmax loss and center loss to train the CNNs for discriminative feature learning. The formulation is given in Equation 5…. 
    [Equation image: media_image8.png]
 Clearly, the CNNs supervised by center loss are trainable and can be optimized by standard SGD” SGD is short for stochastic gradient descent. The loss is made up of a primary and secondary loss which are optimized using the SGD algorithm.)


Regarding claim 11
Wen teaches claim 1
Further Wen teaches, wherein each feature vector is adjusted based on a planet loss gradient function representing the derivative of the planet loss portion of the secondary loss function with respect to the feature vector. (pg 6 “In Algorithm 1, we summarize the learning details in the CNNs with joint supervision…. 
    [Image: media_image9.png]
” In steps 5-7 the parameters are updated based on the loss gradient. The total loss gradient includes the secondary loss function as previously described in claim 1. Thus the feature vector is adjusted based on the derivative of planet loss which is used to update parameters.)

Regarding claim 13
Wen teaches claim 1
Further Wen teaches, wherein each feature vector is adjusted based on a planet loss gradient function representing the derivative of the planet loss portion of the secondary loss function with respect to the feature vector. (pg 6 “In Algorithm 1, we summarize the learning details in the CNNs with joint supervision…
    [Image: media_image9.png]
” In steps 5-7 the parameters are updated based on the loss gradient. The total loss gradient includes the primary loss function as previously described in claim 1. Thus the feature vector is adjusted based on the derivative of sun loss which is used to update parameters.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 2 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wen, further in view of Kang et al. “Learning Deep Semantic Embeddings for Cross-Modal Retrieval” hereinafter Kang. 

Regarding claim 2
Wen teaches claim 1
Wen does not explicitly teach, wherein the secondary loss function maintains a center for each class of feature vectors and computes a mean of the batch of feature vectors.
However, Kang, when addressing issues related to maintaining an average embedding for each class in a machine learning loss function, teaches, wherein the secondary loss function maintains a center for each class of feature vectors and computes a mean of the batch of feature vectors. (Section 3.3 “the center loss is also exploited in this paper for cross-modal matching… 
    [Equation image: media_image10.png]
where c_yi ∈ R^d is the class center of the embedding x_i. Differently, unlike other weight parameters to be learned by backpropagation, the updating of the class centers c_j, j = 1, 2, . . . , C are additionally performed as follows” The secondary loss maintains the center of each class by updating the class centers for the batch. The summation in the cited equation 2 corresponds to the mean of the batch of feature vectors, because each element of the feature vector is summed over the set in the batch and the sum is then multiplied by the reciprocal of the size of the batch (1/M).)
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a moving average center for each class embedding as taught by Kang to the disclosed invention of Wen.
One of ordinary skill in the art would have been motivated to make this modification so that “For class label guided cross-modal matching, such as matching between images and texts, it is even more important to reduce the intra-class variance, so that different modalities of the same class will have small distances to enable direct matching” (Kang Section 3.3)
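For illustration only, the two quantities discussed for claim 2, a center for each class and a mean of the batch, can be computed as in the following minimal Python sketch. It is not code from Kang, Wen, or the application: the function names are hypothetical, and each class center is taken here as the plain average of that class's feature vectors in the batch, which is an assumption and does not reproduce the update rule of the cited equation.

```python
def mean_vector(vectors):
    """Element-wise mean: sum each coordinate over the batch, then scale by 1/M."""
    m = len(vectors)
    return [sum(column) / m for column in zip(*vectors)]

def class_centers(features, labels):
    """Group feature vectors by class label and average each group (assumed rule)."""
    grouped = {}
    for feature, label in zip(features, labels):
        grouped.setdefault(label, []).append(feature)
    return {label: mean_vector(group) for label, group in grouped.items()}
```

Calling `mean_vector` on the whole batch gives the batch mean irrespective of class, while `class_centers` gives one center per label.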


Claims 3 and 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wen/Kang, further in view of Liu et al. “Learning Deep Features via Congenerous Cosine Loss for Person Recognition” hereinafter Liu.

Regarding claim 3
Wen/Kang teaches claim 2
Wen/Kang does not explicitly teach, wherein the planetary loss portion of the secondary loss function minimizes intra-class variation by minimizing the cosine distance of the samples to their corresponding class center.
However, Liu, when addressing issues related to a center loss function that employs cosine similarity, teaches, wherein the planetary loss portion of the secondary loss function minimizes intra-class variation by minimizing the cosine distance of the samples to their corresponding class center. (pg 3 Section 3.2 “The intuition behind designing a COCO loss is that we directly compare and optimize the cosine distance (similarity) between two features…. We first define the cosine similarity of two features from a mini-batch B as:  … A natural intuition to a desirable loss is to increase the similarity of samples within a category and enlarge the centroid distance of samples across classes…. Incorporating the spirit of Eqn. 3 with class centroid, we have the following output of sample i to maximize:  ” “The numerator ensures sample i is close enough to its own class li” Examiner notes that the cosine similarity or distance is measured by the functions C(x,y). Because the minimization function is represented as a fraction, in order to minimize the fraction the numerator must be minimized while the denominator is maximized. Cli is the centroid of the class that matches the class of the feature vector; thus, minimizing the numerator minimizes intra-class variation.)
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a center loss that measures distance using cosine similarity as an alternative to mean squared error as taught by Liu to the disclosed invention of Wen/Kang.
One of ordinary skill in the art would have been motivated to make this modification so that one could implement “a…loss is that we directly compare and optimize the cosine distance (similarity) between two features… The cosine similarity measures how close two samples are in the feature space” in contrast regarding the MSE based loss “the positives of one class will get as far as possible from the negatives of other classes. It optimizes the inter-class distance in some sense but fails to differentiate among the negatives” (Section 3.3 and Section 3.2)

Regarding claim 4
Wen/Kang/Liu teaches claim 3
Further Liu teaches, wherein the sun loss portion of the secondary loss function maximizes the inter-class variation of the feature vectors by maximizing the cosine distance of each feature vector away from the mean of the batch. (pg 3 Section 3.2 “The intuition behind designing a COCO loss is that we directly compare and optimize the cosine distance (similarity) between two features…. We first define the cosine similarity of two features from a mini-batch B as: … A natural intuition to a desirable loss is to increase the similarity of samples within a category and enlarge the centroid distance of samples across classes…. Incorporating the spirit of Eqn. 3 with class centroid, we have the following output of sample i to maximize:  … the denominator enforces a minimal distance against samples in other classes… we propose the congenerous cosine (COCO) loss, which is to increase similarity within classes and enlarge variation across categories in a cooperative way:” Examiner notes that the cosine similarity or distance is measured by the functions C(x,y). Because the minimization function is represented as a fraction, in order to minimize the fraction the numerator must be minimized while the denominator is maximized. Ck is the centroid of the class that is the average of all features representing the inter class average, thus maximizing the denominator maximizes or enlarges inter class variation.) 
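For illustration only, the cosine measure referred to in the discussion of claims 3 and 4 can be sketched in Python as follows. This is a minimal sketch, not code from Liu or from the application: the function names are hypothetical, cosine distance is assumed to mean one minus cosine similarity, and both input vectors are assumed to be non-zero.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)  # assumes neither vector is all zeros

def cosine_distance(u, v):
    """Assumed convention: 1 - cosine similarity, ranging from 0 to 2."""
    return 1.0 - cosine_similarity(u, v)
```

Under this convention, a term that pulls a feature vector toward its class center would decrease `cosine_distance(feature, class_center)`, and a term that pushes it away from the batch mean would increase `cosine_distance(feature, batch_mean)`.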

Claim 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wen, further in view of Zhang et al. “Deep Metric Learning with Improved Triplet Loss for Face Clustering in Videos” hereinafter Zhang.

Regarding claim 8
Wen teaches claim 1
Further Wen teaches, wherein the augmented loss function is given by:  Lc = Lprimary + [Secondary loss]… LPrimary is the primary loss function; (pg 3 ¶01 “The CNNs are trained under the joint supervision of the softmax loss and center loss” softmax loss corresponds to the primary loss functions provided by the CNN, the joint loss function is comprised of a primary loss function, softmax loss, and a secondary loss function, the center loss)
	Wen does not explicitly teach, wherein the [secondary loss is given by] (lambda)Lp + Ls  where:  lambda is the loss weight;…LP is the planet loss portion of the secondary loss function; and LS is the sun loss portion of the secondary loss function.
	Zhang, when addressing issues related to an improved secondary loss function, teaches, wherein the [secondary loss is given by] (lambda)Lp + Ls  where:  lambda is the loss weight;…LP is the planet loss portion of the secondary loss function; and LS is the sun loss portion of the secondary loss function. (pg 5 “To address these issues, we propose an improved triplet loss function… We define the ImpTriplet loss as:… 
    [Equation image: media_image11.png]
” The intra-class constraints correspond to the LP function, while the inter-class constraints correspond to the sun loss. The art provides a loss function that separates the inter-class and intra-class functions by addition.)
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate separable interclass and intraclass loss functions which enforces the angle that a feature embedding is encouraged to update as taught by Zhang to the disclosed invention of Wen.
One of ordinary skill in the art would have been motivated to make this modification so that one could implement “an improved triplet loss function, which pushes the negative face away from the positive pairs simultaneously, and requires the distance of the positive pair to be less than a margin, such that the Euclidean distances correspond to a measure of semantic face similarity” (Conclusion Zhang)

Allowable Subject Matter
Claims 9, 10, 12 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.	
Specifically, none of the references of record, either alone or in combination, fairly discloses or suggests the limitations of claim 9.

	The closest prior art of record, Zhang et al. (“Deep Metric Learning with Improved Triplet Loss for Face Clustering in Videos”), teaches a secondary loss function which is the sum of two terms, where one term minimizes intra-class variation and another maximizes inter-class variation. However, Zhang’s loss function is based on the Euclidean distance between feature vectors, not between a feature vector in a batch and a class center. Further, the claimed equation measures “angular distance” rather than “Euclidean distance” as described in the Specification ¶0020. Further, Liu et al. (“Learning Deep Features via Congenerous Cosine Loss for Person Recognition”) discloses COCO loss, which uses “angular distance” to minimize intra-class variation while maximizing inter-class variation; however, this formulation maximizes inter-class variation against class-specific centroids instead of the centroid of an entire batch of samples irrespective of class. It would not have been obvious to one of ordinary skill in the art before the effective filing date to combine these references to teach at least the limitations of claim 9.


Conclusion
Prior art
Wu et al. “Deep Face Recognition with Center Invariant Loss” discusses a loss function that minimizes the sum of 3 loss functions including: a soft max loss, a center invariant loss for inter class variations, and center loss for intra class variations.
Deng et al “ArcFace: Additive Angular Margin Loss for Deep Face Recognition” incorporates angular and cosine margin to enhance the discriminative power of soft-max loss.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 4:00 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number 5712723719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

/J.R.G./Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122