DETAILED ACTION
Response to Amendment
The amendment was received on 2/15/2022. Claims 1-20 are pending.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
35 U.S.C. 112(f) is not invoked in claims 1-20.





Accordingly the following definitions are “taken” via MPEP 2111.01 III. "PLAIN MEANING" REFERS TO THE ORDINARY AND CUSTOMARY MEANING GIVEN TO THE TERM BY THOSE OF ORDINARY SKILL IN THE ART, 3rd paragraph, emphasis added:
“It is also appropriate to look to how the claim term is used in the prior art, which includes prior art patents, published applications, trade publications, and dictionaries. Any meaning of a claim term taken from the prior art must be consistent with the use of the claim term in the specification and drawings. Moreover, when the specification is clear about the scope and content of a claim term, there is no need to turn to extrinsic evidence for claim interpretation. 3M Innovative Props. Co. v. Tredegar Corp., 725 F.3d 1315, 1326-28, 107 USPQ2d 1717, 1726-27 (Fed. Cir. 2013) (holding that "continuous microtextured skin layer over substantially the entire laminate" was clearly defined in the written description, and therefore, there was no need to turn to extrinsic evidence to construe the claim).”















The claimed “engine” (as in “generating, using a trained facial classification neural engine, one or more first labels” and “generating, using a supporting engine, a second label” in claim 1) is interpreted in light of applicant’s disclosure: 
A.	“[0041] The system includes various engines, each of which is constructed, programmed, configured, or otherwise adapted, to carry out a function or set of functions. The term engine as used herein means a tangible device, component, or arrangement of components implemented using hardware, such as by an application specific integrated circuit (ASIC) or field-programmable gate array (FPGA), for example, or as a combination of hardware and software, such as by a processor-based computing platform and a set of program instructions that transform the computing platform into a special-purpose device to implement the particular functionality. An engine may also be implemented as a combination of the two, with certain functions facilitated by hardware alone, and other functions facilitated by a combination of hardware and software.”; 

B.	[0067], penultimate sentence: “The layers following the input layer may be convolution layers that produce feature maps that are filtering results of the inputs and are used by the next convolution layer.”; and
C.	fig. 7:[0098]: “FIG. 7 is a flow chart illustrating an example method 700 for fine tuning the facial classification neural engine 612, in accordance with some embodiments.”; 
and definition thereof via Dictionary.com:
A.	“a piece or collection of software that drives a later process” or 
B.	“a means by which something is achieved, accomplished, or furthered” 
is “taken” as the meaning of the claimed “engine” via MPEP 2111.01 III:
engine, noun
4	Computers. a piece or collection of software that drives a later process (used in combination, as in game engine; software engine). See also search engine.
6	a means by which something is achieved, accomplished, or furthered:
Trade is an engine of growth that creates jobs, reduces poverty, and increases economic opportunity; 

and use in the prior art:

A.	O’Toole et al. (Face Space Representations in Deep Convolutional Neural Networks), September 2018, uses “engine” as provided by a “model” (four models are shown in fig. 1, below): 
“High-end computational graphical processing units, the preferred computational engine of choice for DCNNs, are 30–50 thousand times faster than computers in the 80s.” (page 795, Box 1. Deep Convolutional Neural Networks for Face Recognition, 2nd para, 3rd S); 

“In computer vision, despite the strong limitation of image-based PCA to operate only within a single (frontal) viewpoint, this model provided the computational engine for the first generation of commercially viable face recognition systems [22].” (pages 795,796); 

and

See Key Figure (Figure 1), evolution in computational models, pages 797,798:










[Figure: media_image1.png (greyscale)]

[Figure: media_image2.png (greyscale)]

, wherein “graphical processing units” or “GPU” is used in Modasshir et al. as “an NVIDIA GTX1080” (a graphics card) (Deep Neural Networks: a Comparison on Different Computing Platforms):
	“As DNNs have intense computational requirements in the majority of applications, they utilize a cluster of computers or a cutting edge Graphical Processing Unit (GPU), often having excessive power consumption and generating a lot of heat.” (Abstract, 3rd S); 

and

“A Dell Alienware gaming³ laptop with Intel Core i7-7820HK as CPU, 32 GB DDR4 as RAM, and an NVIDIA GTX1080 as GPU was the most powerful machine tested.” (pages 385,386);

B.	Araujo et al. (US Patent App. Pub. No.: US 2020/0019699A1), filed July 10, 2018, uses “engine”:

“[0039] Moreover, it should be appreciated that the use of the term “engine,” if used herein with regard to describing embodiments and features of the invention, is not intended to be limiting of any particular implementation for accomplishing and/or performing the actions, steps, processes, etc., attributable to and/or performed by the engine. An engine may be, but is not limited to, software, hardware and/or firmware or any combination thereof that performs the specified functions including, but not limited to, any use of a general and/or specialized processor in combination with appropriate software loaded or stored in a machine-readable memory and executed by the processor. Further, any name associated with a particular engine is, unless otherwise specified, for purposes of convenience of reference and not intended to be limiting to a specific implementation. Additionally, any functionality attributed to an engine may be equally performed by multiple engines, incorporated into and/or combined with the functionality of another engine of the same or different type, or distributed across one or more engines of various configurations.”;









The claimed “identification” (in “verifying, via at least one client computing device, a correct identification” in claim 5) is interpreted in light of applicant’s disclosure via applicant’s fig. 3: 312,314: “RECOGNIZE IMAGE”: “IMAGE RECOGNIZED” and definition thereof via Dictionary.com wherein “an act or instance of identifying; the state of being identified” is “taken” as the meaning of the claimed “identification” in “verifying, via at least one client computing device, a correct identification” in claim 5 via MPEP 2111.01 III:
identification
noun
1	an act or instance of identifying; the state of being identified.















The claimed “identification” (in “upon receiving, from at least M employee client computing devices, a consistent identification of the person: verifying that the consistent identification is the correct identification” in claim 7) is interpreted in light of applicant’s disclosure (“The authentication engine receives, from at least one of the client computing device(s) 640, a selection of one of the possible identifications as the correct identification.” [00108], last S) and definition thereof via Dictionary.com wherein “something that verifies the identity of a person, animal, or thing” is “taken” as the meaning of the claimed “identification” in 
“upon receiving, from at least M employee client computing devices, a consistent identification (“something that verifies the identity of a person, animal, or thing”) of the person: verifying that the consistent identification (“something that verifies the identity of a person, animal, or thing”) is the correct identification (the meaning of this “identification” has already been “taken” above regarding claim 5 as “the state of being identified”)” 

in claim 7 via MPEP 2111.01 III:
identification
noun
2	something that identifies a person, animal, or thing:
He carries identification with him at all times.

wherein “identifies” is defined:
identify
verb (used with object), i·den·ti·fied, i·den·ti·fy·ing.
1	to recognize or establish as being a particular person or thing; verify the identity of:
to identify handwriting; to identify the bearer of a check.
2	to serve as a means of identification for:
His gruff voice quickly identified him.
3	to make, represent to be, or regard or treat as the same or identical:
They identified Jones with the progress of the company.



Response to Arguments
CLAIM OBJECTIONS
Applicant’s arguments, see remarks, page 8:
“Claims 5-8 and 13 were objected to for various informalities. Applicant respectfully asserts that the objections are unclear and lack explicit basis in the MPEP, but as best Applicant can understand the objections, the Office appeared to primarily object to the terms "correct identification" and "consistent identification," as used in Claims 5-8 and 13, as being in an incorrect verb "tense" (e.g., past or present tense). (See Office Action at pp. 2-3.) Applicant respectfully asserts that such objections are inappropriate. The terms "correct identification" and "consistent identification," as used in Claims 5-8 and 13, are nouns, which are not a part of speech that require conjugation for tense (i.e., past, present, or future tense). Accordingly, the objections to the terms "correct identification" and "consistent identification," as used in Claims 5-8 and 13, are not proper and should be withdrawn.”

, filed 2/15/2022, with respect to the claim objection of claims 5-8 and 13 have been fully considered and are persuasive. The claim objection of claims 5-8 and 13 has been withdrawn.  
Thus, claims 5, 6, 7, 8 and 13 were improperly interpreted in the Office action of 12/17/2021 by incorporating limitations from applicant’s disclosure into those claims. Accordingly, claims 5, 6, 7, 8 and 13 are now given a broader scope that is not limited to the word tense relied upon in that Office action.







CLAIM REJECTIONS – 35 USC 102
Claims 1,14, and 15

In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., “generating both” (i.e., generating two together) and “its label” (probability’s label) and “output with respect to the two input images” via applicant’s remarks, page 9:
“First, as noted above, Claim 1 recites "generating, using a trained facial classification neural engine, one or more first labels for a person depicted in the probe image and a probability for at least one of the one or more first labels" (emphasis added). That is, Claim 1 recites generating both (1) a label for the person depicted in the probe image and (2) a probability for that particular label. Contrary to the Office's assertion, Luo does not disclose generating both a label and a probability for such label. Rather, Luo appears merely to describe taking two colored images as input and ultimately outputting an "image label" of "1" if the two images are of the same person or "0" otherwise. (See Luo at § 3.1.) While Luo appears to describe the use of a multidimensional vector, in which each value of the vector represents a probability of correlation to each class to which the input data belongs, the generation of such multidimensional vector appears merely to be a sub-step in ultimately generating the final output label (i.e., the "1" or "0" previously described) for the image. (See Luo at § 3.1.3.) Applicant cannot find anywhere Luo describes generating a probability for its label (i.e., the "1" or "0") that is output with respect to the two input images. Thus, Claim 1 is patentable over Luo for at least this reason.”

 ) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
In contrast to “generating both”, claim 1, lines 6,7 states, “generating…labels… and a probability”.
In contrast to “its label”, claim 1, line 7 states, “a probability for at least one of the one or more first labels”.
In contrast to “output with respect to the two input images”, claim 1 says, “generating…for the…probe image”.
Applicant's arguments filed 2/15/2022 have been fully considered but they are not persuasive.
Applicants state on page 9:
“First, as noted above, Claim 1 recites "generating, using a trained facial classification neural engine, one or more first labels for a person depicted in the probe image and a probability for at least one of the one or more first labels" (emphasis added). That is, Claim 1 recites generating both (1) a label for the person depicted in the probe image and (2) a probability for that particular label. Contrary to the Office's assertion, Luo does not disclose generating both a label and a probability for such label. Rather, Luo appears merely to describe taking two colored images as input and ultimately outputting an "image label" of "1" if the two images are of the same person or "0" otherwise. (See Luo at § 3.1.) While Luo appears to describe the use of a multidimensional vector, in which each value of the vector represents a probability of correlation to each class to which the input data belongs, the generation of such multidimensional vector appears merely to be a sub-step in ultimately generating the final output label (i.e., the "1" or "0" previously described) for the image. (See Luo at § 3.1.3.) Applicant cannot find anywhere Luo describes generating a probability for its label (i.e., the "1" or "0") that is output with respect to the two input images. Thus, Claim 1 is patentable over Luo for at least this reason.”

	The examiner respectfully disagrees since Luo teaches generating labels, as discussed in the Office action, page 8, line 6, and generating a probability or “output…a probability”, section 3.1.3 The loss function: 1. SoftmaxWithLoss: 1st and 2nd sentences.
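For illustration only, the following minimal Python sketch shows how a softmax output of the kind referenced above can yield a class label together with a probability for that label. The function names and the example scores are assumptions introduced here; they are not taken from Luo or from applicant’s disclosure.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities in [0, 1] that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_probability(scores):
    """Return the index of the most probable class and its probability."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]

# Hypothetical scores for class 0 ("different person") and class 1 ("same person").
label, prob = label_with_probability([0.2, 1.3])
```

Under these assumed scores, the class with the larger score is flagged as the output label, and the associated softmax value is the probability for that label.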









In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., “demonstrate how… determining that the probability of a particular label is within a predefined low accuracy range” and “demonstrate how…determining that the probability of a particular label is within a predefined low accuracy range” and “determining whether any such value is ‘within a predefined low accuracy range,’ ” via applicant’s remarks, pages 9,10:
“Second, as noted above, Claim 1 recites "determining that the probability is within a predefined low accuracy range." The Office appeared to allege that Luo describes this limitation merely because it recites ". . . the proposed network is able to accurately compute the similarity of the two input images." (See Office Action at p. 8.) However, the mere statement that Luo's proposed network is able to accurately compute the similarity of two input images does not demonstrate how Luo discloses determining that the probability of a particular label is within a predefined low accuracy range. Moreover, the Office's reference to Section 3.1.3 of Luo (see id.) also fails to demonstrate how Luo discloses determining that the probability of a particular label is within a predefined low accuracy range. Section 3.1.3 of Luo appears merely to describe a multidimensional vector, in which each vector represents a value in the range of "0" and "1." As an initial matter, as explained above, the values of Luo's multidimensional vector are not equivalent to Applicant's claimed "probability." However, even if the values of Luo's multidimensional vector could be considered equivalent to Applicant's claimed "probability," nowhere does Luo appear to disclose determining whether any such value is "within a predefined low accuracy range," as called for in Applicant's Claim 1. Thus, Claim 1 is additionally patentable over Luo for at least this reason.”

) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).





In contrast to “demonstrate how… determining that the probability of a particular label is within a predefined low accuracy range”, claim 1, line 9 says, “determining that the probability is within a predefined low accuracy range” corresponding to Luo’s “probability…sum…equals to 1” (“1” is comprised by “in the range of 0 to 1”), section 3.1.3 The loss function: 1. SoftmaxWithLoss, 2nd and 3rd Ss. The determined sum (0.5 probability +0.5 probability) is within a predefined range of 0 to 1 by summing to 1, wherein probability is defined:
BRITISH DICTIONARY DEFINITIONS FOR PROBABILITY
probability
noun plural -ties
3	statistics a measure or estimate of the degree of confidence one may have in the occurrence of an event, measured on a scale from zero (impossibility) to one (certainty). It may be defined as the proportion of favourable outcomes to the total number of possibilities if these are indifferent (mathematical probability), or the proportion observed in a sample (empirical probability), or the limit of this as the sample size tends to infinity (relative frequency), or by more subjective criteria (subjective probability)
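The arithmetic relied upon above (0.5 probability + 0.5 probability summing to 1, with 1 falling in the range of 0 to 1) can be checked with a short Python sketch. The helper name `within_range` and its default bounds are assumptions chosen only to mirror the “0 to 1” scale in the quoted definition of probability.

```python
def within_range(value, low=0.0, high=1.0):
    """Return True when value lies in the closed interval [low, high]."""
    return low <= value <= high

probabilities = [0.5, 0.5]   # the two example values used in the passage above
total = sum(probabilities)   # 0.5 + 0.5 is exactly 1.0 in floating point

each_in_range = all(within_range(p) for p in probabilities)
sum_in_range = within_range(total)
```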











Applicant's arguments filed 2/15/2022 have been fully considered but they are not persuasive. Applicants state on page 10:
“Third, as noted above, Claim 1 recites "generating, using a supporting engine, a second label for the person depicted in the probe image, wherein the supporting engine operates independently of the trained facial classification neural engine." Applicant respectfully asserts that Luo does not disclose generating a second label, let alone generating a second label using a supporting engine that operates independently of a trained facial classification neural engine.” 

The examiner respectfully disagrees since Luo teaches “label is the k-th class” (or a label is used in the formation of a series, as first, second, and third), section: 3.1.3 The Loss function: 1. SoftmaxWithLoss, last para, last S, wherein “-th” is defined:
OTHER DEFINITIONS FOR TH (3 OF 6)
-th²
1	a suffix used in the formation of ordinal numbers (fourth, tenth), in some cases, added to altered stems of the cardinal (fifth; twelfth).

wherein “ordinal numbers” is defined:
ordinal number
noun
1	Also called ordinal numeral. any of the numbers that express degree, quality, or position in a series, as first, second, and third (distinguished from cardinal number).

BRITISH DICTIONARY DEFINITIONS FOR ORDINAL NUMBER
ordinal number
noun
1	a number denoting relative position in a sequence, such as first, second, third: Sometimes shortened to: ordinal

SCIENTIFIC DEFINITIONS FOR ORDINAL NUMBER
ordinal number
A number, such as 3rd, 11th, or 412th, used in counting to indicate position in a series but not quantity. Compare cardinal number.




Applicants state on page 10:
“Third, as noted above, Claim 1 recites "generating, using a supporting engine, a second label for the person depicted in the probe image, wherein the supporting engine operates independently of the trained facial classification neural engine." Applicant respectfully asserts that Luo does not disclose generating a second label, let alone generating a second label using a supporting engine that operates independently of a trained facial classification neural engine.” 

The examiner respectfully disagrees since Luo teaches:
A.	generating a second label (via said k-th label-class) using a supporting (via a layer or bed or stratum or forming a base as shown in fig. 1: “Feature maps of the convolution layer1”) engine¹ (or a means by which something is achieved, accomplished, or furthered via “extracted”, section: 4.1.1 Feature map: 1st para, 2nd S, represented as an output arrow of fig. 3: “Conv1[5*5,20,1]”) that operates independently (via “images are being independently processed”, section: 3 Our method, 7th S as shown in fig. 6(a)(b)(c) and (d): extracted features of two separate people) of a trained facial classification neural engine (or a means by which something is achieved, accomplished, or furthered via another independent image processing represented as an output arrow of fig. 3: “Conv5[5*5,20,1]”: another independent image feature extractor).
1	Dictionary.com:	
engine
noun
7	a means by which something is achieved, accomplished, or furthered:
Trade is an engine of growth that creates jobs, reduces poverty, and increases economic opportunity; 

and



B.	generating a second label (via said k-th label-class) using a supporting (via a layer or bed or stratum or forming a base as shown in fig. 1: “Feature maps of the convolution layer1”) engine² (or a piece or collection of software that drives a later process via a layer-part, fig. 3: “Conv1[5*5,20,1]”, in a specific set and arrangement of software comprised by a “caffe based environment”², section: 4 Experiments, that drives a later process) that operates independently (via “images are being independently processed”, section: 3 Our method, 7th S, or via independent convolution-operation “processes…to extract and integrate features of the input images”, section: 4.1.1 Feature map: 1st S, wherein each independent process is as shown by any one of fig. 3: “Conv”) of a trained facial classification neural engine (or a piece or collection of software that drives a later process via another layer-part, fig. 3: “Conv5[5*5,20,1]”, in said specific set and arrangement of software comprised by a “caffe based environment” that drives a later process).
2	Dictionary.com:
engine, noun
4	Computers. a piece or collection of software that drives a later process (used in combination, as in game engine; software engine). See also search engine.

environment, noun
4	Computers. the hardware or software configuration, or the mode of operation, of a computer system:
In a time-sharing environment, transactions are processed as they occur.

wherein “configuration” is defined:
configuration, noun
5	Computers.
a	the way a computer or computer system is put together; a specific set and arrangement of internal and external components, including hardware, software, and devices.

b	the way a software program or device is set up for a particular computer, computer system, or task; the specific settings for a program or device:
configuration of your email program to work with your new ISP.
Applicants state on page 10:
“The Office asserted that this feature is disclosed in Section 3.1.3 of Luo. (Office Action at p. 8.) As an initial matter, Applicant alleges that the Office did not clearly identify/explain which alleged "label(s)" in Luo the Office considered to be equivalent with Applicant's claimed "one or more first labels" and which alleged "label(s)" in Luo the Office considered to be equivalent with Applicant's claimed "second label." At any rate, Luo appears to describe taking two colored images as input and outputting a single "image label" of "1" if the two images are of the same person or "0" otherwise. (See Luo at § 3.1). Nowhere does Luo appear to disclose generating a second image label.”

	In response, the k-th class-label (label-class 0 or label-class 1 are each used in the formation of a series, as first, second, and third) corresponds to the claimed “one…first label” and the claimed “second label” under the broadest reasonable interpretation.














Applicants state on pages 10 and 11:
“In addition, the Office appeared to equate the "loss functions" described in Section 3.1.3 with Applicant's claimed supporting engine that operates independently of a trained facial classification neural engine. (Office Action at p. 8.) However, contrary to the Office's assertion, the "loss functions" described in Section 3.1.3 of Luo do not operate independently of any trained neural engine described in Luo. Indeed, in the last two paragraphs of Section 3.1.2 (see also Fig. 3), Luo describes that the input to the loss functions comes explicitly from the previous layers of its network. Accordingly, the loss functions of Luo necessarily depend on the previous layers of Luo's network. As such, contrary to the Office's allegation, the loss functions in Luo do not, and cannot, operate independently from any other part of Luo's network. Thus, Claim 1 is additionally patentable over Luo for at least these reasons.”

	The examiner respectfully disagrees since Luo discloses:
operate (via “the convolution operation”, section: 3.1.2 Network structure design: 1. The parameters setting, 4th para, as shown by any one of fig. 3: “Conv” such as “Conv1[5*5,20,1]”) independently (via said “images are being independently processed” via said individual convolution operation as shown in fig. 1: “Convolutions”, comprising a process of operating) of any (other) trained neural engine (via said means by which something is achieved, accomplished, or furthered such as an output arrow of fig. 3: “Conv5[5*5,20,1]” resulting in said “training…completed”), wherein “operation” is defined via Dictionary.com:
operation
noun
1	an act or instance, process, or manner of functioning or operating.






Applicants state on pages 10 and 11:
“In addition, the Office appeared to equate the "loss functions" described in Section 3.1.3 with Applicant's claimed supporting engine that operates independently of a trained facial classification neural engine. (Office Action at p. 8.) However, contrary to the Office's assertion, the "loss functions" described in Section 3.1.3 of Luo do not operate independently of any trained neural engine described in Luo. Indeed, in the last two paragraphs of Section 3.1.2 (see also Fig. 3), Luo describes that the input to the loss functions comes explicitly from the previous layers of its network. Accordingly, the loss functions of Luo necessarily depend on the previous layers of Luo's network. As such, contrary to the Office's allegation, the loss functions in Luo do not, and cannot, operate independently from any other part of Luo's network. Thus, Claim 1 is additionally patentable over Luo for at least these reasons.”

	The examiner respectfully disagrees since Luo teaches “electronic” neural network³ component-parts as shown in fig. 3 wherein each convolution component-part (such as fig. 3: “Conv1[5*5,20,1]” mapped to the convolution of fig. 6(b): either person image) independently, as discussed above, or separately processes images (to independently process or independently extract separate people features).
3	Dictionary.com:
BRITISH DICTIONARY DEFINITIONS FOR NEURAL NETWORK
neural network
noun
1	an interconnected system of neurons, as in the brain or other parts of the nervous system
2	Also called: neural net an analogous network of electronic components, esp one in a computer designed to mimic the operation of the human brain
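As a hedged illustration of a convolution component-part processing each image separately, the sketch below applies one stride-1 convolution to two images in two independent calls. The 2x2 kernel and 3x3 images are hypothetical and far smaller than the 5*5 kernels labeled in Luo’s fig. 3; nothing here reproduces Luo’s network.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (stride 1, no padding) and return the feature map.

    As in typical CNN layers, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

kernel = [[1, 0], [0, -1]]                      # hypothetical 2x2 filter
image_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]     # hypothetical first image
image_b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]     # hypothetical second image

# Each call reads only its own image, so neither feature map depends on the other.
features_a = conv2d_valid(image_a, kernel)
features_b = conv2d_valid(image_b, kernel)
```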







In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., “generated by a supporting engine” and “independent supporting engine” via applicant’s remarks, page 11:
“Fourth, as noted above, Claim 1 recites "further training the facial classification neural engine based on the second label." The Office appeared to allege that Luo teaches this limitation merely because Luo states that "[t]he network parameters will be refined after numerous training iterations." (Office Action at p. 8.) However, nothing about the foregoing statement in Luo demonstrates that Luo's network is further trained based on a second label generated by a supporting engine that operates independently of Luo's network. Indeed, as explained above, Luo does not disclose generating a second label using an independent supporting engine. So, it necessarily follows that Luo also cannot disclose training a neural engine based on such a second label. Thus, Claim 1 is additionally patentable over Luo for at least this reason.”

) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
	In contrast, claim 1 says, “generating, using a supporting engine, a second label…wherein the supporting engine operates independently”.
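For orientation only, the steps of claim 1 quoted in the remarks above (generating first labels and a probability, determining that the probability is within a predefined low accuracy range, generating a second label using a supporting engine, and further training based on the second label) can be sketched in Python as follows. Every name, the callables passed in, and the bounds 0.4 to 0.6 are assumptions made for this sketch; it is not a construction of the claim and does not represent applicant’s or Luo’s implementation.

```python
def process_probe(probe, primary_engine, supporting_engine, retrain,
                  low_range=(0.4, 0.6)):
    """Hypothetical flow: fall back to a supporting engine for low-confidence labels."""
    first_label, probability = primary_engine(probe)
    low, high = low_range  # assumed bounds of the "low accuracy range"
    if low <= probability <= high:
        # The supporting engine sees only the probe, not the primary engine's output.
        second_label = supporting_engine(probe)
        retrain(probe, second_label)  # further training based on the second label
        return second_label
    return first_label

training_log = []
result = process_probe(
    "probe.png",
    primary_engine=lambda img: ("person_a", 0.55),
    supporting_engine=lambda img: "person_b",
    retrain=lambda img, lbl: training_log.append((img, lbl)),
)
```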









Applicants state on page 11:
“Fourth, as noted above, Claim 1 recites "further training the facial classification neural engine based on the second label." The Office appeared to allege that Luo teaches this limitation merely because Luo states that "[t]he network parameters will be refined after numerous training iterations." (Office Action at p. 8.) However, nothing about the foregoing statement in Luo demonstrates that Luo's network is further trained based on a second label generated by a supporting engine that operates independently of Luo's network. Indeed, as explained above, Luo does not disclose generating a second label using an independent supporting engine. So, it necessarily follows that Luo also cannot disclose training a neural engine based on such a second label. Thus, Claim 1 is additionally patentable over Luo for at least this reason.”

	The examiner respectfully disagrees since Luo discloses:
	generating a second label (via said k-th label-class represented in fig. 3: “Loss1[SoftmaxWithLoss]”: “The output of the SoftmaxWithLoss function is…The class…flagged as the ultimate output”, section: 3.1.3 The loss function: 1. SoftmaxWithLoss:1st para, 1st and last Ss) using an independent (via said independent image extraction processing) supporting (via “train the net-work”, section: 3.1.1 Training set, wherein train comprises: guide or teach or assist or aid, wherein the network as shown in fig. 3 comprises said convolutional-layer independent feature extraction) engine (engine: i.e., means by which something is done and represented as any one output guide-training arrow in fig. 3 such as a max pooled output arrow or a convolved output arrow or a concatenated (combined or fused) output arrow or a fully connected output arrow or a dropped-out output arrow all which is part of said training the network).




Applicants state on page 11:
“Fourth, as noted above, Claim 1 recites "further training the facial classification neural engine based on the second label." The Office appeared to allege that Luo teaches this limitation merely because Luo states that "[t]he network parameters will be refined after numerous training iterations." (Office Action at p. 8.) However, nothing about the foregoing statement in Luo demonstrates that Luo's network is further trained based on a second label generated by a supporting engine that operates independently of Luo's network. Indeed, as explained above, Luo does not disclose generating a second label using an independent supporting engine. So, it necessarily follows that Luo also cannot disclose training a neural engine based on such a second label. Thus, Claim 1 is additionally patentable over Luo for at least this reason.”

	The examiner respectfully disagrees since Luo teaches:
	training a neural engine (via “re…training iterations”: section 3.1.3 The loss function: 2. ContrastiveLoss:, last para, last S, from said k-th class-label) based on such a second (via said k-th) label (or “From this…class”, id. penultimate S, re-train: iterate by refining the training).
Based on all of the above, there are two forms of “engine” as used in the art before the filing date:
A.	a graphical processing unit or GPU (comprised by said Luo’s Intel Graphics Intel Core i7-4790k or said Luo’s GTX960 graphics card); and
B.	a taken mimic model (comprised by a neural network as shown in Luo’s fig. 2: “Convolutional Network”, twice, and fig. 3: a series of convolution and pool boxes, twice) so that each taken model provides a respective engine, wherein “neural network” is defined via Dictionary.com:
BRITISH DICTIONARY DEFINITIONS FOR NEURAL NETWORK
neural network
noun
2	Also called: neural net an analogous network of electronic components, esp one in a computer designed to mimic the operation of the human brain

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 14-16, 19 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Luo et al. (Pedestrian tracking in surveillance video based on modified CNN).
Regarding claim 1, Luo discloses a system comprising: processing circuitry (or “machine”, pg. 24049, 1st full para, last S); and 
a memory (“8G of memory”, id.) storing instructions which, when executed by the processing circuitry, cause the processing circuitry to perform operations comprising: 
receiving, from a vision device comprising one or more cameras (via “different cameras”, section 3.1.1 Training set, 2nd S), a probe image (via fig. 4: “Probe pedestrian image”);

generating, using a trained (via “training…completed”, 3.1.3 The loss function, 1. SoftmaxWithLoss: 2nd para, last S) facial (via “half face”, 4.2.2 Pedestrian tracing, last para, last S) classification (via “classify”, 3.1.3 The loss function, 1st para, last S) neural (via “The multi-loss regularized deep neural network”, id., 1st S) engine (via driving or guiding, as shown by the parts in fig. 3, resulting in said “training…completed”), one or more first (via “k-th”, section: 3.1.3 The loss function: 1. SoftmaxWithLoss, last S) labels (via “generate…labels”, 3.1.1 Training set, penultimate S, at fig. 3:“Loss1[SoftmaxWithLoss] ”:label for classification: “Loss2[ContrastiveLoss]”:label for clustering or tracking) for a person (as shown in fig. 6(a)) depicted in the probe image and a probability (or “a probability”, 3.1.3 The loss function:1. SoftmaxWithLoss: 1st para, 2nd S) for at least one of the one or more first labels; 
determining that the probability is within a predefined low accuracy (via “accurately”, 3.1.2 Network structure design, 1st para, last S) range (via a “correlation… range of 0 to 1”, 3.1.3 The loss function: 1. SoftmaxWithLoss:1st para,2nd S);
generating, using a supporting (or resulting in “refined…training”, 3.1.3 The loss function: 2. ContrastiveLoss: last para, last S) engine (via driving or guiding, as shown by other parts of said parts in fig. 3, resulting in said “training…completed”), a second (via said k-th) label (via said “generate…labels”) for the person depicted in the probe image, wherein the supporting engine operates independently (or “independently processed”, 3 Our method, 7th S, represented by the extraction processes in fig. 3) of the trained facial classification neural engine; and

further training (via “numerous training iterations”, 3.1.3 The loss function: 2. ContrastiveLoss: last para, last S) the facial classification neural engine (or “the entire network”, id., penultimate S, as shown in figure 3 comprising said other parts of the parts) based on (via fig. 3: “Loss2[ContrastiveLoss]” being the basis for said “numerous training iterations”) the second label.  
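The sequence of operations recited in claim 1 can be pictured with the following minimal Python sketch. It is illustrative only: the function names, the retraining list, and the numeric bounds of the low accuracy range are assumptions and appear in neither the claims nor Luo.

```python
from typing import Callable, List, Tuple

# Hypothetical bounds for the "predefined low accuracy range";
# the claim language does not give numeric values.
LOW_ACCURACY_RANGE = (0.0, 0.6)


def handle_probe(
    probe: str,
    classify: Callable[[str], Tuple[str, float]],
    support: Callable[[str], str],
    retrain_set: List[Tuple[str, str]],
    low_range: Tuple[float, float] = LOW_ACCURACY_RANGE,
) -> str:
    """Return the label chosen for a probe image.

    classify stands in for the trained facial classification neural engine
    and returns (first_label, probability). support stands in for the
    independent supporting engine and returns a second label.
    """
    first_label, probability = classify(probe)
    low, high = low_range
    if low <= probability <= high:
        # Low confidence: obtain a second label from the supporting engine
        # and store it so the classifier can be further trained on it.
        second_label = support(probe)
        retrain_set.append((probe, second_label))
        return second_label
    return first_label
```

In this sketch the supporting engine is called only when the classifier's probability falls inside the low range, and each second label is stored with its probe image for a later training pass.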

Regarding claim 14, Luo discloses the system of claim 1, wherein the probe image is one of a plurality of images that track the person, the plurality of images being received from the vision device, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
determining (via fig. 4: “Result=1?”), using the trained facial classification neural engine (represented in fig. 4: “MSN”), that at least a threshold (regarding what is considered the truth via fig. 4: “Result=1?” wherein “1” represents being true) number (or frame number via fig. 4: “Pedestrian image of current frame is numbered sequentially” being true) of the plurality of images have a specified identification (via “identity”, Abstract, 6th S, represented: fig. 4: “Output match similarity”) with a probability (via said “a probability”) within a predefined high accuracy range (via said “correlation… range of 0 to 1”); and 
determining (via said fig. 4: “Result=1?”=a red bounding box) that the probe image (represented in Table 4 as the red bounding box around pedestrian 1) has the specified identification based on the at least the threshold number (represented in Table 4: frames 1-3) of the plurality of images having the specified identification (given that each frame 1-3 has the red bounding box around pedestrian 1).
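The two determining steps of claim 14 can be sketched as a count over the tracked images. This is an illustrative reading only: the function name, the default high accuracy range, and the default threshold count are assumptions and are not taken from the claims or from Luo.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def label_from_track(
    predictions: Sequence[Tuple[str, float]],
    high_range: Tuple[float, float] = (0.9, 1.0),  # assumed bounds
    threshold_count: int = 3,                      # assumed threshold number
) -> Optional[str]:
    """Return an identification for the probe image, or None.

    predictions holds one (label, probability) pair per tracked image.
    Only pairs whose probability lies in the high accuracy range count.
    """
    low, high = high_range
    confident = Counter(
        label for label, prob in predictions if low <= prob <= high
    )
    if not confident:
        return None
    label, count = confident.most_common(1)[0]
    # The probe inherits the identification only when enough tracked
    # images agree on it with high confidence.
    return label if count >= threshold_count else None
```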
Regarding claim 15, Luo discloses the system of claim 14, the operations further comprising: identifying (via said fig. 4: “Output match similarity”) the plurality of images that track the person based on timestamps (via Table 4: “pedestrian tracing…Time”) associated with the plurality of images and a physical position of the person within a space depicted in the plurality of images.  


Regarding claim 16, claim 16 is rejected the same as claim 1. Thus, the argument presented for claim 1 is equally applicable to claim 16. Accordingly, Luo discloses claim 16 of a non-transitory machine-readable medium storing instructions which, when executed by processing circuitry of one or more computing machines, cause the processing circuitry to perform operations comprising:
receiving, from a vision device comprising one or more cameras, a probe image;  
generating, using a trained facial classification neural engine, one or more first labels for a person depicted in the probe image and a probability for at least one of the one or more first labels; 
determining that the probability is within a predefined low accuracy range;
generating, using a supporting engine, a second label for the person depicted in the probe image, wherein the supporting engine operates independently of the trained facial classification neural engine; and
further training the facial classification neural engine based on the second label.  

Regarding claim 19, claim 19 is rejected the same as claim 14. Thus, argument presented in claim 14 is equally applicable to claim 19. Accordingly, Luo discloses claim 19 of the machine-readable medium of claim 16, wherein the probe image is one of a plurality of images that track the person, the plurality of images being received from the vision device, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
determining, using the trained facial classification neural engine, that at least a threshold number of the plurality of images have a specified identification with a probability within a predefined high accuracy range; and 
determining that the probe image has the specified identification based on the at least the threshold number of the plurality of images having the specified identification.  
Regarding claim 20, claim 20 is rejected the same as claim 1. Thus, the argument presented for claim 1 is equally applicable to claim 20. Accordingly, Luo discloses claim 20 of a method comprising:
receiving, from a vision device comprising one or more cameras, a probe image;
generating, using a trained facial classification neural engine, one or more first labels for a person depicted in the probe image and a probability for at least one of the one or more first labels; 
determining that the probability is within a predefined low accuracy range;
generating, using a supporting engine, a second label for the person depicted in the probe image, wherein the supporting engine operates independently of the trained facial classification neural engine; and
further training the facial classification neural engine based on the second label.  

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Regarding inquiry 4 see Suggestions.
Claims 2 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Raut et al. (Result Oriented Based Face Recognition using Neural Network with Erosion and Dilation Technique).
Regarding claim 2, Luo teaches the system of claim 1, the operations further comprising: 



using the further trained facial classification neural engine to identify (in the context of being the same during pedestrian tracking or tracing) one or more persons in visual data from the vision device; and 
based on the identified one or more persons in the visual data, controlling access to a physical location or an electronic resource.  
Thus, Luo does not teach:
based on the identified one or more persons in the visual data, controlling access to a physical location or an electronic resource.
Accordingly, Raut teaches:
  based on the identified (via fig. 1: “Face recognition”) one or more persons in the visual data, controlling access to a physical location or an electronic resource (and thus “granting…access to physical and virtual domains”, 1st pg., left col, 3rd para, 1st S).
Thus one of ordinary skill in tracking as taught by both references (“tracking of known individuals” Raut, page 1822, l. col, 2nd full para, 1st S) can modify Luo’s tracking with Raut by:
a)	tracking people in video by “extracting…face…gait”, Raut, id., 2nd S;
b)	recognizing extracted faces in video; and
c)	recognizing that the modification is predictable or looked forward to because “Real time systems for identifying humans in a scene has a lot of importance in security and surveillance applications”, Raut, id., 1st S. 


Regarding claim 17, claim 17 is rejected the same as claim 2. Thus, argument presented in claim 2 is equally applicable to claim 17. Accordingly, Luo as combined teaches the machine-readable medium of claim 16, the operations further comprising: 
using the further trained facial classification neural engine to identify one or more persons in visual data from the vision device; and 
based on the identified one or more persons in the visual data, controlling access to a physical location or an electronic resource.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Buibas et al. (US Patent 10,282,852).
Regarding claim 3, Luo teaches the system of claim 1, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
generating the second label based on an identity card or token provided by the person or based on a user identifier and password entered by the person.  
Thus Luo does not teach:
generating the second label based on an identity card or token provided by the person or based on a user identifier and password entered by the person.
Accordingly, Buibas teaches:
generating (via fig. 4:arrows pointing to “take” “put” “move”) the second label (via fig. 4:“output labels 412 and 413”:“take pizza”: “move soup can”: c.16,ll. 60-63:) based on (A) an identity card (via fig. 19:1904: “Insert Card”) or (B) token provided by the person or (C) based on a user identifier and password entered by the person.
Thus, one of ordinary skill in the art of tracking can modify Luo’s label with Buibas’ by:
a)	making Luo’s tracking L2 loss function label be as Buibas’s fig. 4: “take pizza”:“move soup can”:“take”:“put”:“move”;
b)	putting the surveillance/tracking cameras in a food store; and
c)	recognizing that the modification is predictable or looked forward to because the modification results in generating money from pizza and soup.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Nambiar et al. (Gait-based Person Re-identification: A Survey).
Regarding claim 4, Luo teaches the system of claim 1, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
generating the second label based on a combination (via eqn. (8), pg. 24048, represented in fig. 4: “MSN”) of weak authentication factors, the weak authentication factors comprising one or more of: a height, a weight and a gait.  
Thus, Luo does not teach:
generating the second label based on a combination of weak authentication factors, the weak authentication factors comprising one or more of: a height, a weight and a gait.
Accordingly, Nambiar teaches:
generating the second label (or “a unique label”, pg. 33:3, 1st para, last S, represented in fig. 1: “Descriptor Generation”) based on a combination (via “Fusion”, page 33:14, 2.3.4 Multi-modal Fusion) of weak authentication factors (or fig. 2:“Soft biometrics”, represented in fig. 1: “Feature extraction”, pg. 33:3), the weak authentication factors comprising one or more of: a height, a weight and a gait (via said fig. 2: “Body measurements…Gait”).


Thus, one of ordinary skill in the art of image extraction and tracking can modify Luo’s said eqn. (8) with Nambiar’s “unique label” by:
a)	making Luo’s fig. 4 be as Nambiar’s fig. 1 by inserting Nambiar’s fig. 1: “Descriptor Generation” right after Luo’s feature extraction of fig. 4: “MSN”;
b)	making Luo’s feature extraction extract gait;
c)	combining the extracted gait with another extracted feature;
d)	assigning Nambiar’s unique label at Luo’s fig. 4: “Result=1?”: “yes”; and
e)	recognizing that the modification is predictable or looked forward to because the combination “has important applications in tracking…when…discontinuities exist” and “to improve Re-ID results” or improve the results of identifying a person and then identifying the same person again, Nambiar, pg. 33:3, 2nd para, 5th S & pg. 33:14, 2.3.4 Multi-modal Fusion, 1st S.

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Madhuri et al. (Pose-Robust Recognition of Low-Resolution Face Images).
Regarding claim 5, Luo teaches the system of claim 1, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
verifying, via at least one client computing device, a correct identification for the person depicted in the probe image.  
Thus, Luo does not teach:
verifying, via at least one client computing device, a correct identification for the person depicted in the probe image.
Accordingly, Madhuri teaches:
verifying (via “server…verification”, abstract,11th S), via at least one client computing device (via “client server architecture”, id., 9th S, as shown in fig. 1: “FACE architecture”), a correct (via “ ‘correction’ ”, 2nd pg., l.col, 3rd S) identification (or matched or recognized, the past-tense, via fig. 1: “Matching” or recognition) for the person depicted in the probe image (or “Face images in the probe”, 1st pg., l. col, 1st bullet: Pose Normalization).





Thus, one of ordinary skill in the art of video surveillance, as taught by both references, can modify Luo’s tracking label with Madhuri’s “server…verification” by:
a)	making Luo’s tracking L2 label be as shown in Madhuri’s server of fig. 1: “Reliability Evaluation”, comprising a “Face exemplars…label” (Madhuri, penultimate pg., l. col, last para, 2nd S);
b)	making Luo’s surveillance/tracking cameras be as clients via the upper half of Madhuri’s fig. 1; and
c)	recognizing that the modification is predictable or looked forward to because the modification provides said “ ‘correction’ ” during video surveillance.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Madhuri et al. (Pose-Robust Recognition of Low-Resolution Face Images) as applied in claim 5 above further in view of Howard et al. (US Patent App. Pub. No.: US 2004/0213437 A1).
Regarding claim 6, Luo as combined teaches the system of claim 5, wherein verifying the correct identification (or said matched, the past-tense, via said L2 tracking label as modified via the combination with a Face exemplars label) comprises: 
providing, for display at the at least one client computing device, the probe image and a plurality of possible (via said “a probability”) identifications for the person; and 
receiving, from the at least one client device, a selection (via “the same targeted pedestrians selected”, pg. 24045, 2nd S) of one of the possible identifications as the correct identification.
	Thus, Howard teaches:
providing, for display (via fig. 1:20: “DISPLAY”) at the at least one client computing device (via fig. 2:10: “WORKSTATION”), the probe image (via fig. 13:100: “Probe Image”) and a plurality of possible identifications for the person (as shown in fig. 13:700: a display of faces); and 
receiving (via fig. 2: arrows), from the at least one client device (via fig. 2:10: “WORKSTATION”), a selection (via “selecting at least one of the images”, [0147]) of one of the possible identifications as the correct (via “color correction”, [0144], represented in fig. 2:15: “IMAGE/DATA CAPTURE”) identification (as shown in fig. 13).  
Thus, one of ordinary skill in the art of surveillance can modify the combination’s client (a camera with normalization correction) with Howard’s said fig. 1:20: “DISPLAY” by:
a)	making the combination’s client be as shown in Howard’s fig. 2:15: “IMAGE/DATA CAPTURE” to be displayed via Howard’s figs.1,2:10,20: “DISPLAY”: “WORKSTATION”;
b)	making the server tracking reliability confidence L2 label be determined at Howard’s fig. 2:25: “FACIAL RECOGNITION SEARCH SYSTEM”; and
c)	recognizing that the modification is predictable or looked forward to because the modification provides another correction in the combination: color correction in addition to normalization.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Madhuri et al. (Pose-Robust Recognition of Low-Resolution Face Images) as applied in claim 5 above further in view of Messer et al. (US Patent App. Pub. No.: US 2020/0105111 A1).
Regarding claim 7, Luo as combined teaches the system of claim 5, wherein the at least one client computing device (via said L2 label as modified via the combination) comprises an administrator client computing device (via said “client server architecture”) and N (or 4 boxes in the upper half of Madhuri’s fig. 1: (1)“SPSI” ; (2)“Sample Selection”;(3) “Pose Normalization”; (4) “Illumination Normalization”) employee (via the server of said “client server architecture” comprising administration comprising an office comprising a staff comprising a group as employees) client computing devices (or any one box of said “client server architecture” as shown in Madhuri’s fig. 1), wherein N (said 4) is a positive integer greater than or equal to two, wherein verifying the correct identification (being the past-tense via being said matched or recognized) comprises: 
providing the probe image to at least a portion (or any one box of said “client server architecture”) of the N employee client computing devices; 
upon receiving (via Madhuri: fig. 1:arrows), from at least M (via said 4) employee client computing devices (said or any one box), a consistent identification (via said “server… verification”) of the person: verifying (via said “server…verification”) that the consistent identification is the correct identification (being the past-tense or matched via Madhuri’s fig. 1: “Matching” that is server-verified being consistent with the match), wherein M is a positive integer between half of N and N; and 
upon failing to receive, from the at least M employee client computing devices, the consistent identification of the person: providing the probe image to the administrator client computing device for verifying the correct identification via the administrator client computing device (wherein server is defined via Dictionary.com:
CULTURAL DEFINITIONS FOR SERVER
server
Computer or software that performs administration or coordination functions within a network.

wherein “administration” is defined:
administration
noun
1	the management of any office, business, or organization; direction.

wherein “office” is defined:
office
noun
4	the staff or designated part of a staff at a commercial or industrial organization:
The whole office was at his wedding.

wherein “staff” is defined:
staff
noun, plural staffs for 1-5, 9; staves  [steyvz] or staffs for 6-8, 10, 11.
1	a group of persons, as employees, charged with carrying out the work of an establishment or executing some undertaking.).  
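The M-of-N verification recited in claim 7 can be sketched as follows. The sketch is illustrative only: the function name is hypothetical, and reading “between half of N and N” as inclusive of both ends is an assumption, since the claim does not say whether the bounds are inclusive.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_identification(
    responses: Sequence[str], n: int, m: int
) -> Optional[str]:
    """Return the identification agreed on by at least m employee devices.

    responses holds the identifications received from employee client
    devices (at most n of them). None means no consistent identification
    was received, so the probe image would go to the administrator device.
    """
    if n < 2:
        raise ValueError("N must be at least two")
    if not (n / 2 <= m <= n):  # inclusive bounds are an assumption
        raise ValueError("M must lie between half of N and N")
    if not responses:
        return None
    label, count = Counter(responses).most_common(1)[0]
    return label if count >= m else None
```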

Thus, Luo as combined does not teach:
upon failing to receive, from the at least M employee client computing devices, the consistent identification of the person: providing the probe image to the administrator client computing device for verifying the correct identification via the administrator client computing device.  


Accordingly, Messer teaches:
	upon failing (resulting in fig. 6:602: “Facial Recognition event with poor confidence”) to receive (and thus resulting in fig. 6:603: “Prompt issued to user”), from the at least M employee client computing devices (via fig. 1:1A,1B), the consistent  identification (and thus instead resulting in fig. 6:607: “Confirmation received from operator”) of the person (represented as a face via fig. 6:602: “Facial Recognition event with poor confidence”): providing the probe image to the administrator (via fig. 1:21,22: “Server A”: “Server B”) client (via fig. 1:4: “Customer Viewing Equipment”) computing device (as shown in fig. 1:all via said upon failing, “This prompt can include the… probe image”, [0301]) for verifying (via said fig. 6:607: “Confirmation received from operator”) the correct identification (in the past-tense via said fig. 6:602: “Facial Recognition event with poor confidence” “appears to have correctly identified the subject”, [0302], penultimate S) via the administrator client computing device.  
Thus, one of ordinary skill in matching and surveillance can modify the combination’s said M=N=4 boxes with Messer’s said fig. 6:603: “Prompt issued to user” by:
a)	making said four boxes be as Messer’s fig. 1:1A,1B;
b)	installing the program of Messer’s fig. 6:601-607 into the server half of Madhuri’s fig. 1: “Matching”; and
c)	recognizing that the modification is predictable or looked forward to because the modification is used such that the “surveillance system may serve to issue a prompt to user to attempt to acquire a better image”, Messer [0297].

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Madhuri et al. (Pose-Robust Recognition of Low-Resolution Face Images) as applied in claim 5 above further in view of Messer et al. (US Patent App. Pub. No.: US 2020/0105111 A1) as applied in claim 7 above further in view of NIKNAM et al. (EP 2 978 249 A1).
Regarding claim 8, Luo as combined teaches the system of claim 7, wherein the N employee client computing devices are selected based on a corporate department or an office geographic location of at least one of the plurality of possible identifications. Thus, Luo as combined does not teach claim 8 as a whole. Accordingly, Niknam teaches claim 8 of:
The N employee client computing devices are selected (resulting in “selecting one or more client devices”, abstract, ll. 9,10) based on a corporate department or an office (via a “server system”, id., l.8, comprising said office staff employees) geographic location (or “predetermined” “geographical location”, id, ll. 10,11) of (Markush limitation follows: at least one) at least one of the plurality of possible identifications (via “possibly… recognizing each other’s face”, [0009], penultimate S and [0010], 4th S, as shown in fig. 5(b):510: happy face).





Thus, one of ordinary skill in the art of clients and servers can modify the combination’s memory machine as modified via the combination of Madhuri as modified via Messer with Niknam’s said “selecting one or more client devices” by:
a)	making the combination’s memory machine be as Niknam’s fig. 1:110: a server and 101-105:clients;
b)	performing face recognition or identification or authentication over the internet; and
c)	recognizing that the modification is predictable or looked forward to because the modification results in a “minimum of overhead”, Niknam [0010] 1st S.

Claims 9-11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Bhatt et al. (Improving Cross-Resolution Face Matching Using Ensemble-Based Co-Transfer Learning).
Regarding claim 9, Luo teaches the system of claim 1, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
providing the probe image to a training dataset for a semi-supervised learning facial classification engine; 
training the semi-supervised learning facial classification engine using the training dataset; 
generating, using the semi-supervised learning facial classification engine, the second label for the person depicted in the probe image and a probability value for the second label; and 
adjusting the trained facial classification neural engine based on the trained semi-supervised learning facial classification engine.  
Thus, Luo does not teach claim 9 as a whole.

Accordingly, Bhatt teaches:
providing (via figs. 4,5:arrows) the probe image (via figs. 4,5: “Unlabeled probe instances from TD”: “Training data in target domain”) to a training dataset (or fig. 4: “Knowledge learnt from SD”) for a semi-supervised learning facial classification engine (so as to use “transfer learning for face recognition as a semi-supervised approach”, pg. 5658, left col, 1st full para, last S); 
training (via figs. 4,5: “Co-training”) the semi-supervised learning facial classification engine using the training dataset; 
generating, using the semi-supervised learning facial classification engine, the second label (via fig. 5: “Pseudo labels provided by E”) for the person (as shown in figs. 1,2) depicted in the probe image and a probability value (or a “confidently predicting…distance”, pg. 5659, l. col, penultimate S) for the second label; and 
adjusting (via “Updating…to…adjust”, pg. 5662, r. col, 1st bullet, represented in fig. 5 as weight “w”) the trained facial classification neural engine (as shown in fig. 5) based on the trained semi-supervised learning facial classification engine (as shown in fig. 5).

Thus, one of ordinary skill in the art of classifiers and tracking, as taught by both references: “track the activities”, Bhatt, pg. 5654, r. col, 5th full S, can modify Luo’s generation of the L2 loss function tracking label with Bhatt’s teaching of said figs. 4,5:arrows by:
a)	making Luo’s MSN be as shown in Bhatt’s fig. 5: “Classifier” or fig. 6: “Feature extraction”: “SVM classifiers” or figs. 6(a)(b): ensemble “E”;
b)	matching faces by generating the matching label via Luo’s L1 softmax loss classification function;
c)	tracking activities by generating Luo’s contrastive L2 loss function clustering and tracking label; and
d)	recognizing that the modification is predictable or looked forward to because the modification is “efficiently matching low resolution probes with high resolution gallery”, Bhatt, right column 1st bullet.

Regarding claim 10, Luo as combined teaches the system of claim 9, wherein providing the probe image to the training dataset for the semi-supervised learning facial classification engine is in response to determining (via said MSN as modified via the combination of said “confidently predicting…distance”, pg. 5659, left col, penultimate S) that a quality (or fig. 5: “View” being a feature, represented in Bhatt’s fig. 6: “Feature extraction”, wherein “feature” comprises a distinguishing quality) of the probe image exceeds a quality threshold (or “the genuine threshold”, id., wherein “feature” is defined via Dictionary.com:
feature
noun
1	a prominent or conspicuous part or characteristic:
Tall buildings were a new feature on the skyline.

wherein “characteristic” is defined:
characteristic
noun
a distinguishing feature or quality:
Generosity is his chief characteristic.).  

Regarding claim 11, Luo as combined teaches the system of claim 10, wherein the quality of the probe image is computed using a quality measuring neural engine (via Bhatt’s Algorithm 1: “Process: Train classifiers” in page 5660 and represented in figs. 5,6(b): “Co-training”, comprising said MSN).

Regarding claim 18, claim 18 is rejected the same as claim 9. Thus, argument presented in claim 9 is equally applicable to claim 18. Accordingly, Luo as combined teaches claim 18 of the machine-readable medium of claim 16, wherein generating, using the supporting engine, the second label for the person depicted in the probe image comprises: 
providing the probe image to a training dataset for a semi-supervised learning facial classification engine; 
training the semi-supervised learning facial classification engine using the training dataset; 
generating, using the semi-supervised learning facial classification engine, the second label for the person depicted in the probe image and a probability value for the second label; and
adjusting the trained facial classification neural engine based on the trained semi-supervised learning facial classification engine.  

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Bhatt et al. (Improving Cross-Resolution Face Matching Using Ensemble-Based Co-Transfer Learning) as applied in claim 10 above further in view of Ahonen et al. (Recognition of Blurred Faces Using Local Phase Quantization).
Regarding claim 12, Luo as combined teaches the system of claim 10, wherein the quality of the probe image comprises a blurriness of the probe image.  
However, Luo as combined does not teach “the quality of the probe image comprises a blurriness”. Ahonen, as cited by Bhatt, teaches “Blur…quality of…imaging”, 1st pg., l.col, last S.
Thus, one of ordinary skill in the art of features can modify the combination’s feature extraction with Ahonen’s “Blur…quality of…imaging” by:
a)	making the feature extraction of the combination also include Bhatt’s LPQ as shown in Bhatt’s fig. 6: “LPQ/SIFT”; and
b)	recognizing that the modification is predictable or looked forward to because LPQ “is very robust not only to blur but also to other challenges such as lighting and facial expression variations present in real-world images”, Ahonen, last pg., r.col, 1st full para, 4th S.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (Pedestrian tracking in surveillance video based on modified CNN) in view of Bhatt et al. (Improving Cross-Resolution Face Matching Using Ensemble-Based Co-Transfer Learning) as applied in claim 9 above further in view of Alba-Castro et al. (Audiovisual biometric verification).
Regarding claim 13, Luo as combined teaches the system of claim 9, wherein generating, using the supporting engine, the second label for the person depicted in the probe image further comprises: 
determining that the probability value (via said “confidently predicting…distance”, pg. 5659, left col, penultimate S) for the second label is below a probability threshold (via said “the genuine threshold”, id., in the rejection of claim 10); and 
in response to the probability value for the second label being below the probability threshold: verifying, via at least one client computing device, a correct identification for the person depicted in the probe image.
However, Luo as combined does not teach:
in response to the probability value for the second label being below the probability threshold: verifying, via at least one client computing device, a correct identification for the person depicted in the probe image.
Accordingly, Alba-Castro teaches:
in response to the probability value for the second label being below the probability threshold (via pg. 190: fig. 7: “Threshold” and “Thresholding”): verifying (expressing the act or result of verify via “face…verification”, pg. 180: abstract, penultimate S), via at least one client computing device (via “client-server architectures”, pg. 197, 3rd full para, 1st S), a correct identification (or “true identity”, pg. 190, last S) for the person (as shown in pg. 182: fig. 1) depicted in the probe image (via fig. 6: “Probe image”).
Thus, one of ordinary skill in the art of face matching can modify the combination’s MSN as modified via Bhatt with Alba-Castro’s teaching of said fig. 7: “Threshold” and “Thresholding” by:
a)	making said combination’s “the genuine threshold” be as shown in Alba-Castro’s said fig. 7: “Threshold” and “Thresholding”;
b)	making Luo’s 8G memory machine be as Alba-Castro’s said “client-server architecture” by installing “The algorithms…in a web” (Alba-Castro, pg. 181, 1st full para, 2nd S); and 
c)	recognizing that the modification is predictable or looked forward to because “using a global threshold show very promising results if compared to the best results of all the tests, obtained using accuracy-based node selection with user-specific thresholds. In general, a global threshold is preferred to user specific thresholds because the system will be less database-dependent and performance should not decrease too much on actual running-time.”, Alba-Castro, pg. 199, 2nd para, 3rd S.
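For illustration only, the thresholding step discussed for claim 13 could be sketched as follows, assuming a single global threshold: when the probability value for the second label falls below the threshold, the identification is referred to a client device for verification. The names resolve_identity and verify_with_client are hypothetical and are not taken from Luo, Bhatt, or Alba-Castro.

```python
from typing import Callable


def resolve_identity(
    label: str,
    probability: float,
    threshold: float,
    verify_with_client: Callable[[str], str],
) -> str:
    """Keep the predicted label unless its probability is below the threshold.

    verify_with_client is a hypothetical callback standing in for the
    verification performed via a client computing device.
    """
    if probability < threshold:
        # Low-confidence prediction: ask the client for the correct identification.
        return verify_with_client(label)
    # Probability meets the global threshold: accept the predicted label.
    return label
```

Under these assumptions, a call such as resolve_identity("person_a", 0.4, 0.5, ask_operator) would defer to the hypothetical ask_operator callback, while a probability of 0.9 would return the predicted label unchanged.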
Suggestions
Applicant’s disclosure states [0035]:
“The authentication engine generates the second label by determining, using the trained facial classification neural engine, that at least a threshold number of the plurality of images (e.g., at least five images or at least 60% of the images) have a specified identification with a probability within a predefined high accuracy range (e.g., at least 90%).”

	This appears to be a clear difference under 35 USC 103 as applied above.
Note that these suggestions are not provided with respect to overcoming 35 USC 101, 112, 102 and/or 103. These suggestions are mainly provided to seek out advantages in the disclosure regardless of 35 USC 101, 112, 102 and/or 103.
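For illustration only, the rule quoted from paragraph [0035] above could be sketched as follows. The function names, the (label, probability) input format, and the default values are assumptions made for this sketch; the defaults mirror the examples in the quotation (five images, 60% of the images, at least 90%) and are not taken from applicant's implementation.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def count_for_percent(n_images: int, percent: int = 60) -> int:
    """Smallest image count that is at least `percent` percent of n_images."""
    return (n_images * percent + 99) // 100


def consensus_label(
    predictions: Sequence[Tuple[str, float]],
    min_count: int = 5,
    min_probability: float = 0.90,
) -> Optional[str]:
    """Return a label only if enough images agree on it with high probability.

    predictions holds one (label, probability) pair per image, as might be
    produced by a trained classifier (assumed input format).
    """
    # Count, per label, the images whose probability is in the high accuracy range.
    confident = Counter(
        label for label, probability in predictions if probability >= min_probability
    )
    if not confident:
        return None
    label, count = confident.most_common(1)[0]
    # The label qualifies only if at least min_count images support it.
    return label if count >= min_count else None
```

To apply the percentage form of the threshold rather than a fixed count, min_count can be set to count_for_percent(len(predictions)).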
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DENNIS ROSARIO whose telephone number is (571)272-7397. The examiner can normally be reached Monday-Friday, 9AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at (571)272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DENNIS ROSARIO/Examiner, Art Unit 2667  

/MATTHEW C BELLA/Supervisory Patent Examiner, Art Unit 2667