DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 4/18/2022.
Applicant’s remarks made in amendment filed 4/18/2022.
Amendments to specification filed 4/18/2022.
Claims 21, 24, 26, 29-34, 36, 37, and 39 are amended. Claims 21-40 are pending.
Response to Arguments
Applicant presents several arguments. Each is addressed.
Applicant argues that “The ‘first model’ and the ‘second model’ recited in claim 37 are not generic placeholders for the term of ‘means.’ Furthermore, each of these terms would be understood by persons of skill in the art to have a sufficiently definite meaning and structure.” (Remarks, page 13, paragraph 6.) However, as explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f): the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder, i.e., “model,” for performing the claimed function; the term “means” or “step” or the generic placeholder “model” is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and the term “means” or “step” or the generic placeholder “model” is not modified by sufficient structure, material, or acts for performing the claimed function. In the instant case, “model” is a generic placeholder and is used in claim 37. It is modified in lines 4, 6, and 10 of claim 37 by the phrase “configured to.” And, it is not modified by sufficient structure, material, or acts for performing the claimed function. Therefore, it is proper that claim 37 be interpreted under 35 U.S.C. § 112(f).
Applicant argues that “In view of the amendments herein, Applicant respectfully submits that the claims are not directed toward a mental process, and thus should be found patent-eligible at Prong One.” (Remarks, page 14, paragraph 2.) Examiner agrees. The rejections under 35 U.S.C. § 101 are withdrawn.
Applicant argues that “Independent claim 21 is amended to recite, in part, ‘training a third model, different from the first and second models, based on (i) the additional data items, (ii) the reconstructions of the additional data items, and (iii) given reference feedback.’ Itou does not disclose at least this feature of amended claim 21.” (Remarks, page 16, paragraph 2.) However, Itou teaches this. (Itou, FIG. 2, shows additional data items and the reconstructions of additional data items being provided to a third model. (Itou, P[0048], “Furthermore, after the adjustment processing of parameters, parameters of the first identifier 14 and the second identifier 16 are adjusted to reduce learning losses of the first identifier 14 and the second identifier 16 [which can be represented by equations (9) and (10)].” And, P[0050], “The second identifier 16 adjusts parameters to minimize the learning loss [see Equation (10), where the elements] correspond to a first dimension and a second dimension of a two-dimension vector y output [given reference feedback] by the second identifier 16.”) Also, P[0016], “FIG. 1 is a flowchart which shows an outline of processing of a learning device and an abnormality detection device of an embodiment.” In other words, a learning device is training a detection device.) See detailed rejection. Therefore, the rejection under 35 U.S.C. § 102 is proper and maintained.
Applicant argues that “As described above, independent claims 21, 34, and 37 are patentable over Itou. Zhou and Medel were cited for allegedly disclosing various features recited in dependent claims. Without commenting on or conceding that Zhou or Medel disclose the features for which they were cited, Zhou and Medel nevertheless fail to cure the deficiencies of Itou with respect to independent claims 21, 34, and 37. Thus claims 27-33 and 40 are patentable at least by reason of their dependency.” (Remarks, page 17, paragraph 4.) However, independent claims 21, 34, and 37 remain rejected; therefore, dependent claims 27-33 and 40 remain rejected as well. See detailed rejection.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a first model configured to generate,” “a second model configured to generate,” and “wherein the first model is configured to obtain … and update” in claim 37.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 21-40 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-2 and 4-13 of U.S. Patent No. 10,121,104 in view of Itou et al. (US 2018/0082150, herein Itou).
Instant Appl. No. 16/011,136
U.S. Patent No. 10,121,104
(Claim 21 – amended)
A method of facilitating anomaly detection via a multi-model architecture, the method being implemented by one or more processors executing computer program instructions that, when executed, perform the method, the method comprising: 


obtaining data items that corresponds to a concept; 

providing the data items to a first model to cause the first model to generate hidden representations of the data items from the data items; 

providing the hidden representations of the data items to a second model, different from the first model, to cause the second model to generate reconstructions of the data items from the hidden representations of the data items; 






updating one or more representation-generation-related configurations of the first model based on the data items and the reconstructions of the data items;



(added from Claim 24)
obtaining additional data items that corresponds to the concept; 

generating hidden representation of the additional data items using the first model;


generating hidden representations of the additional data items from the additional data items using the first model; 













training a third model, different from the first and second models, based on

(i) the additional data items, 

(ii) the reconstructions of the additional data items, and

(iii)  given reference feedback, the third model when trained configured 

to generate an indication that each additional data item of the additional data items and the reconstruction corresponding to the additional data item are similar; and 






assessing differences between the given data item of the additional items 

and the reconstruction corresponding to the additional data item using the third model.








(Claim 1)
A method of facilitating anomaly detection via a multi-neural-network architecture, the method being implemented by a computer system that comprises one or more processors executing computer program instructions that, when executed, perform the method, the method comprising:

obtaining data items that correspond to a concept;

providing the data items to a first neural network to cause the first neural network to generate hidden representations of the data items from the data items;

providing the hidden representations of the data items to a second neural network to cause the second neural network to generate reconstructions of the data items from the hidden representations of the data items;

providing the reconstructions of the data items as reference feedback to the first neural network to cause the first neural network to assess the reconstructions of the data items against the data items, the first neural network 

updating one or more representation-generation-related configurations of the first neural network based on the first neural network's assessment of the reconstructions of the data items; and
…

(Claim 4)
obtaining additional data items that correspond to the concept;

providing the additional data items to the first neural network to cause the first neural network to 

generate hidden representations of the additional data items from the additional data items;

providing the hidden representations of the additional data items to the second neural network to cause the second neural network to generate reconstructions of the additional data items from the hidden representations of the additional data items;

providing the additional data items, the reconstructions of the additional data items, and given reference feedback to a third neural network to cause 

the third neural network to be trained based on 

the additional data items, 

the reconstructions of the additional data items, and

the given reference feedback 


to generate an indication that each additional data item of the additional data items and the reconstruction corresponding to the additional data item are similar; and

providing the first data item and the reconstruction of the first data item to the third neural network to cause the third neural network to 

assess the differences between the first data item 

and the reconstruction of the first data item, 
the third neural network generating an indication that the first data item 

and the reconstruction of the first data item are not similar based on the differences between the first data item and the reconstruction of the first data item,

(Claim 23 – previously presented)
The method of claim 21, further comprising:

Filing Date: June 18, 2018
subsequent to providing the reconstructions of the data items, performing the following operations: 

providing a given data item to the first model to cause the first model to generate a hidden representation of the given data item from the given data item; 

providing the hidden representation of the given data item to the second model to cause the second model to generate a reconstruction of the given data item from the hidden representation of the given data item; and 


detecting an anomaly in the given data item based on differences between the given data item and the reconstruction of the given data item.


(Claim 1 – continued)


subsequent to providing the reconstructions of the data items, performing the following operations:

providing a first data item to the first neural network to cause the first neural network to generate a hidden representation of the first data item from the first data item;

providing the hidden representation of the first data item to the second neural network to cause the second neural network to generate a reconstruction of the first data item from the hidden representation of the first data item; and


detecting an anomaly in the first data item based on differences between the first data item and the reconstruction of the first data item.
(Claim 22 – previously presented)
The method of claim 21, further comprising: subsequent to providing the reconstructions of the data items, performing the following operations: 

providing a given data item to the first model to cause the first model to generate a hidden representation 
of the given data item from the given data item; and 

providing the hidden representation of the given data item to the second model to cause the second model to generate a reconstruction of the given data item from the hidden representation of the given data item, 


wherein no anomaly is detected in the given data item based on differences between the given data item and the reconstruction of the given data item.  

(Claim 2)
The method of claim 1, further comprising:
subsequent to providing the reconstructions of the data items, performing the following operations:

providing a second data item to the first neural network to cause the first neural network to generate a hidden representation of the second data item from the second data item; and

providing the hidden representation of the second data item to the second neural network to cause the second neural network to generate a reconstruction of the second data item from the hidden representation of the second data item,


wherein no anomaly is detected in the second data item based on differences between the second data item and the reconstruction of the second data item.

(Claim 24 – amended)
The method of claim 23, 

(parts of claim 24 are deleted and moved to claim 21)













wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.  

(Claim 4)
The method of claim 1, further comprising:
subsequent to providing the reconstructions of the data items, performing the following operations:

(parts below are moved up to compare against amended claim 21 of the claimed invention.)

obtaining additional data items that correspond to the concept;

providing the additional data items to the first neural network to cause the first neural network to generate hidden representations of the additional data items from the additional data items;

providing the hidden representations of the additional data items to the second neural network to cause the second neural network to generate reconstructions of the additional data items from the hidden representations of the additional data items;

providing the additional data items, the reconstructions of the additional data items, and given reference feedback to a third neural network to cause the third neural network to be trained based on the additional data items, the reconstructions of the additional data items, and the given reference feedback to generate an indication that each additional data item of the additional data items and the reconstruction corresponding to the additional data item are similar; and

providing the first data item and the reconstruction of the first data item to the third neural network to cause the third neural network to assess the differences between the first data item and the reconstruction of the first data item, the third neural network generating an indication that the first data item and the reconstruction of the first data item are not similar based on the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the indication generated by the third neural network.

(Claim 25 – previously presented)

The method of claim 23, wherein the first model is configured to generate additional hidden representations of the data items from the data items subsequent to the updating of the first model, the method further comprising: 

providing the additional hidden representations of the data items to the second model to cause the second model to generate additional reconstructions of the data items from the additional hidden representations of the data items; and 


providing the additional reconstructions of the data items as reference feedback to the first model to cause the first model to assess the additional reconstructions of the data items against the data items, the first model further 

updating one or more representation-generation-related configurations of the first model based on the first model's assessment of the additional reconstructions of the data items.  

(Claim 5)

The method of claim 1, wherein the first neural network is configured to generate additional hidden representations of the data items from the data items subsequent to the updating of the first neural network, the method further comprising:

providing the additional hidden representations of the data items to the second neural network to cause the second neural network to generate additional reconstructions of the data items from the additional hidden representations of the data items; and

providing the additional reconstructions of the data items as reference feedback to the first neural network to cause the first neural network to assess the additional reconstructions of the data items against the data items, the first neural network further 

updating one or more representation-generation-related configurations of the first neural network based on the first neural network's assessment of the additional reconstructions of the data items.

(Claim 26 – amended)
The method of claim 25, further comprising: 




providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item, 


wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.  

(Claim 6)
The method of claim 5, further comprising:

providing the data items, the additional reconstructions of the data items, and given reference feedback to a third neural network to cause the third neural network to be trained based on the data items, the additional reconstructions of the data items, and the given reference feedback to generate an indication that each data item of the data items and the additional reconstruction corresponding to the data item are similar; and

providing the first data item and the reconstruction of the first data item to the third neural network to cause the third neural network to assess the differences between the first data item and the reconstruction of the first data item, the third neural network generating an indication that the first data item and the reconstruction of the first data item are not similar based on the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the indication generated by the third neural network.

(Claim 27 – previously presented)
The method of claim 26, 
wherein the third model generates one or more indications of which portions of the given data item and the reconstruction of the given data item are not similar, and 

wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more indications generated by the third model.  

(Claim 7)
The method of claim 6,
wherein the third neural network generates one or more indications of which portions of the first data item and the reconstruction of the first data item are not similar, and

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the one or more indications generated by the third neural network.

(Claim 28 – previously presented)
The method of claim 27, 
wherein the third model generates one or more additional indications of which portions of the given data item and the reconstruction of the given data item are similar, and 


wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more indications and the one or more additional indications generated by the third model.  

(Claim 8)
The method of claim 7,
wherein the third neural network generates one or more additional indications of which portions of the first data item and the reconstruction of the first data item are similar, and

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the one or more indications and the one or more additional indications generated by the third neural network.

(Claim 29 – amended)
The method of claim 25, further comprising: 

determining pairs such that each of the pairs comprises one of the data items and the additional reconstruction of another one of the data items; 

providing the pairs to [[a]] the third model to cause the third model to, with respect to each of the pairs, generate an indication of whether the corresponding data item and additional reconstruction of the pair are similar; 

providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the corresponding data item 

and additional reconstruction of each of the pairs are not similar, the third model updating one or more configurations of the third model based on the 
third model's assessment of the generated indications; and 

providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, 

the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item, 

wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.  

(Claim 9)
The method of claim 5, further comprising:

determining pairs such that each of the pairs comprises one of the data items and the additional reconstruction of another one of the data items;

providing the pairs to a third neural network to cause the third neural network to, with respect to each of the pairs, generate an indication of whether the corresponding data item and additional reconstruction of the pair are similar;

providing given reference feedback to the third neural network to cause the third neural network to assess the generated indications against the given reference feedback, the given reference feedback indicating that the corresponding data item 

and additional reconstruction of each of the pairs are not similar, the third neural network updating one or more configurations of the third neural network based on the third neural network's assessment of the generated indications; and

providing the first data item and the reconstruction of the first data item to the third neural network to cause the third neural network to assess the differences between the first data item and the reconstruction of the first data item, 

the third neural network generating an indication that the first data item and the reconstruction of the first data item are not similar based on the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the indication generated by the third neural network.

(Claim 30 – amended)
The method of claim 21, further comprising: 

determining subsets of data items such that each of the data item subsets comprise at least two data items of the data items; 

providing the data item subsets to [[a]] the third model to cause the third model to, with respect to each of the data item subsets, generate an indication of whether the two data items of the data item subset are similar; 


providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the two data items of each of the data item subsets are not similar, the 
third model updating one or more configurations of the third model based on the third model's assessment of the generated indications; and 

providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item, 


wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.  

(Claim 10)
The method of claim 1, further comprising:

determining subsets of data items such that each of the data item subsets comprise at least two data items of the data items;

providing the data item subsets to a third neural network to cause the third neural network to, with respect to each of the data item subsets, generate an indication of whether the two data items of the data item subset are similar;


providing given reference feedback to the third neural network to cause the third neural network to assess the generated indications against the given reference feedback, the given reference feedback indicating that the two data items of each of the data item subsets are not similar, the third neural network updating one or more configurations of the third neural network based on the third neural network's assessment of the generated indications; and

providing the first data item and the reconstruction of the first data item to the third neural network to cause the third neural network to assess the differences between the first data item and the reconstruction of the first data item, the third neural network generating an indication that the first data item and the reconstruction of the first data item are not similar based on the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the indication generated by the third neural network.

(Claim 31 – amended)
The method of claim [[21]] 23, further comprising: 

deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item, 

wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item.  

(Claim 11)
The method of claim 1, further comprising:

deemphasizing one or more of the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the one or more deemphasized differences and one or more other ones of the differences between the first data item and the reconstruction of the first data item.

(Claim 32 – amended)
The method of claim [[21]] 23, further comprising: 

emphasizing one or more of the differences between the given data item and the reconstruction of the given data item, 

wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more emphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item.  

(Claim 12)
The method of claim 1, further comprising:

emphasizing one or more of the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the one or more emphasized differences and one or more other ones of the differences between the first data item and the reconstruction of the first data item.

(Claim 33 – amended)
The method of claim [[21]] 23, further comprising: 

deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item; and 

emphasizing one or more other ones of the differences between the given data item and the reconstruction of the given data item, 

wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and the one or more emphasized differences.  

(Claim 13)
The method of claim 1, further comprising:


deemphasizing one or more of the differences between the first data item and the reconstruction of the first data item; and


emphasizing one or more other ones of the differences between the first data item and the reconstruction of the first data item,

wherein detecting the anomaly comprises detecting the anomaly in the first data item based on the one or more deemphasized differences and the one or more emphasized differences.

Though the claims in the instant application disclose first, second, and third models, the claim language in the instant application does not explicitly disclose first, second, and third “neural network” models. However, Itou is also directed toward anomaly detection using models that can generate hidden representations, generate reconstructions, and identify anomalies. Itou discloses, “In the learning processing by the learning device 1, a multilayered neural network (Deep Neural Network: DNN), a convolutional neural network (CNN), or a recurrent neural network (RNN) may also be adopted” (Itou, P[0057]). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the models recited in the instant application to represent neural network models, as disclosed in Itou, to yield predictable results of generating hidden representations of data, generating reconstructions of the data, and detecting anomalies using the first, second, and third models.

Claims 34-36 in the instant application are directed to a system comprising one or more processors executing computer program instructions that, when executed, cause the one or more processors to perform the method recited in claims 21, 23, and 25-26. Therefore, the double patenting rejections made to claims 21, 23, and 25-26 are applied to claims 34-36.

Claims 37-40 in the instant application are directed to a system comprising components recited in claims 21, 23, and 25-27. Therefore, the double patenting rejections made to claims 21, 23, and 25-27 are also applied to claims 37-40.
Examiner notes that the amendments to the claims consist of moving parts of claim 24 into claim 21 along with minor word differences, deleting a limitation from claim 26, and insignificant changes to the other amended claims. Therefore, the non-statutory double patenting rejection is proper and maintained.  


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 21-26 and 34-39 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Itou et al. (US 2018/0082150, herein Itou).
Regarding claim 21, 
	Itou teaches a method of facilitating anomaly detection via a multi-model architecture, the method being implemented by one or more processors executing computer program instructions that, when executed, perform the method (Itou, P[0031], “Some or all of the respective functional units of the learning device 1 and the abnormality detection device 1A may be realized by a processor executing a program (software).”),  the method comprising:
obtaining data items that correspond to a concept (Itou, PP[0016-0017] and FIG. 1, “A learning device 1 performs compression processing and decoding processing (reconstruction processing) of normal data as a preparation for performing abnormality detection of data. … The learning data D1 includes, for example, arbitrary data such as sensor data measured by various types of sensors, operation log data of various types of apparatuses, data of various numerical values, and data of various categories.”); 
providing the data items to a first model to cause the first model to generate hidden representations of the data items from the data items (Itou, P[0020], “The encoder 10 [a first model] compresses the learning data D1 and generates a compressed data and outputs the compressed data to the first identifier 14 and the decoder 12.”); 
providing the hidden representations of the data items to a second model, different from the first model, to cause the second model to generate reconstructions of the data items from the hidden representations of the data items (Itou, P[0020], “The encoder 10 [a first model] compresses the learning data D1 and generates a compressed data and outputs the compressed data to the first identifier 14 and the decoder 12 [example of a second model which is different from the first model].”); 
updating one or more representation-generation-related configurations of the first model based on the data items and the reconstructions of the data items (Itou, PP[0020-0024], “The encoder 10 compresses the learning data D1 and generates a compressed data and outputs the compressed data to the first identifier 14 and the decoder 12. … The decoder 12 generates reconstruction data by decoding the compressed data input from the encoder 10, and outputs the reconstruction data to the second identifier 16. … The second identifier 16 … outputs an identification result (a second identification result) to the encoder 10 and the decoder 12. … Each of the encoder 10 and the decoder 12 adjusts the compressing parameter and the decoding parameter on the basis of the second identification result so that a difference between the reconstruction data and the learning data D1 is reduced. That is, each of the encoder 10 and the decoder 12 adjusts the compressing parameter and the decoding parameter to bring the reconstruction data closer to the learning data D1.” Itou, P[0042], “each of the encoder 10 and the decoder 12 adjusts the compressing parameter and the decoding parameter so that a difference between the reconstruction data generated by the decoder 12 and the learning data D1 on which the compression and decoding processing are not performed is reduce on the basis of the second identification result input from the second identifier 16.”); 
obtaining additional data items that correspond to the concept (Itou, P[0055], “When it is determined that the sampling of the learning data D1 is not completed, the encoder 10 samples at least one piece of the remaining learning data D1 and performs the compression and decoding processing, and parameter adjustment processing hereon.” Itou, P[0056], “The encoder 10 performs learning processing on the learning data group again when it is determined that the number of times learning processing has been performed is less than the predetermined number of times.”);
generating hidden representations of the additional data items from the additional data items using the first model (Itou, P[0033], “First, the encoder 10 samples at least one piece of learning data D1 from a plurality of pieces of learning data D1…, compresses the learning data D1 [generates hidden representations], and outputs the compressed data to the first identifier 14 and the decoder 12 (step S101).”);
generating reconstructions of the additional data items from the hidden representations of the additional data items using the second model (Itou, P[0034], “Next, the decoder 12 [the second model] generates reconstruction data by decoding the compressed data input [the hidden representations] from the encoder 10, and outputs the reconstruction data to the second identifier 16 (step S103).”);
training a third model, different from the first and second models, based on (i) the additional data items, (ii) the reconstructions of the additional data items, and (iii) given reference feedback, the third model when trained configured to generate an indication that each additional data item of the additional data items and the reconstruction corresponding to the additional data item are similar (Itou, FIG. 2, which shows additional data items and the reconstructions of the additional data items being provided to a third model, which may be represented by the second identifier; Itou, P[0048], “Furthermore, after the adjustment processing of parameters, parameters of the first identifier 14 and the second identifier 16 are adjusted to reduce learning losses of the first identifier 14 and the second identifier 16 [which can be represented by equations (9) and (10)].” Itou, P[0050], “The second identifier 16 adjusts parameters to minimize the learning loss [see Equation (10), where the elements] correspond to a first dimension and a second dimension of a two-dimensional vector y output [given reference feedback] by the second identifier 16.”); and
assessing differences between the given data item and the reconstruction of the given data item using the third model (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value)[i.e. assesses differences], and determines that data is normal when the abnormality degree is less than the second threshold value.”).
Regarding claim 22, 
	Itou teaches the method of claim 21, further comprising: 
subsequent to providing the reconstructions of the data items (Itou, PP[0016-0018], “For example, the learning device 1 performs compression and decoding processing of learning data D1, and calculates a learning parameter P1 which adapts to the learning data D1. … An abnormality detection device 1A detects abnormality of detection data (input data) D2 using the learning parameter P1 calculated by the learning device 1, and outputs a detection result R1.” Itou, PP[0058-0059], “an operation of the abnormality detection device 1A of the embodiment will be described. … the encoder 10 compresses the detection data D2 and outputs the compressed data to the first identifier 14 and the decoder 12.”), performing the following operations:
providing a given data item to the first model to cause the first model to generate a hidden representation of the given data item from the given data item (Itou, P[0059], “First, the encoder 10 compresses the detection data D2 and outputs the compressed data to the first identifier 14 and the decoder 12 (step S201).”); and 
providing the hidden representation of the given data item to the second model to cause the second model to generate a reconstruction of the given data item from the hidden representation of the given data item (Itou, P[0062], “Next, the decoder 12 generates reconstruction data by decoding compression data input from the encoder 10, and outputs the reconstruction data to the second identifier 16 (step S205).”), 
wherein no anomaly is detected in the given data item based on differences between the given data item and the reconstruction of the given data item (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value), and determines that data is normal when the abnormality degree is less than the second threshold value.”).
Regarding claim 23, 
	Itou teaches the method of claim 21, further comprising: 
subsequent to providing the reconstructions of the data items (Itou, PP[0016-0018], “For example, the learning device 1 performs compression and decoding processing of learning data D1, and calculates a learning parameter P1 which adapts to the learning data D1. … An abnormality detection device 1A detects abnormality of detection data (input data) D2 using the learning parameter P1 calculated by the learning device 1, and outputs a detection result R1.” Itou, PP[0058-0059], “an operation of the abnormality detection device 1A of the embodiment will be described. … the encoder 10 compresses the detection data D2 and outputs the compressed data to the first identifier 14 and the decoder 12.”), performing the following operations: 
providing a given data item to the first model to cause the first model to generate a hidden representation of the given data item from the given data item (Itou, P[0059], “First, the encoder 10 compresses the detection data D2 and outputs the compressed data to the first identifier 14 and the decoder 12 (step S201).”); 
providing the hidden representation of the given data item to the second model to cause the second model to generate a reconstruction of the given data item from the hidden representation of the given data item (Itou, P[0062], “Next, the decoder 12 generates reconstruction data by decoding compression data input from the encoder 10, and outputs the reconstruction data to the second identifier 16 (step S205).”); and 
detecting an anomaly in the given data item based on differences between the given data item and the reconstruction of the given data item (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value), and determines that data is normal when the abnormality degree is less than the second threshold value.”).
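For illustration only, the threshold comparison quoted above from Itou, P[0063], can be sketched as follows. The toy encode and decode functions, the mean-squared abnormality degree, and the threshold value are assumptions made for this sketch; they are not taken from Itou or from the instant claims.

```python
from typing import List, Sequence


def encode(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a first model: average adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def decode(h: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a second model: repeat each value twice."""
    out: List[float] = []
    for v in h:
        out.extend([v, v])
    return out


def abnormality_degree(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Assumed measure: mean squared difference between item and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def is_abnormal(x: Sequence[float], threshold: float) -> bool:
    """Abnormal when the degree is equal to or greater than the threshold."""
    x_hat = decode(encode(x))
    return abnormality_degree(x, x_hat) >= threshold


smooth = [1.0, 1.0, 2.0, 2.0]   # reconstructs exactly under the toy models
spiky = [1.0, 9.0, 2.0, 2.0]    # the spike is lost in compression
print(is_abnormal(smooth, threshold=0.5))
print(is_abnormal(spiky, threshold=0.5))
```

In this sketch an item that the toy models reconstruct well falls below the threshold and is treated as normal, while an item with a large reconstruction difference meets the threshold and is treated as abnormal.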
Regarding claim 24, 
	Itou teaches the method of claim 23, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value), and determines that data is normal when the abnormality degree is less than the second threshold value.”).
Regarding claim 25, 
	Itou teaches the method of claim 23, 
	wherein the first model is configured to generate additional hidden representations of the data items from the data items subsequent to the updating of the first model (Itou, P[0055], “When it is determined that the sampling of the learning data D1 is not completed, the encoder 10 samples at least one piece of the remaining learning data D1 and performs the compression and decoding processing, and parameter adjustment processing hereon.” Itou, P[0056], “The encoder 10 performs learning processing on the learning data group again when it is determined that the number of times learning processing has been performed is less than the predetermined number of times.”), the method further comprising: 
providing the additional hidden representations of the data items to the second model to cause the second model to generate additional reconstructions of the data items from the additional hidden representations of the data items (Itou, P[0062], “Next, the decoder 12 generates reconstruction data by decoding compression data input from the encoder 10, and outputs the reconstruction data to the second identifier 16 (step S205).”); and 
providing the additional reconstructions of the data items as reference feedback to the first model to cause the first model to assess the additional reconstructions of the data items against the data items, the first model further updating one or more representation-generation-related configurations of the first model based on the first model's assessment of the additional reconstructions of the data items (Itou, P[0042], “each of the encoder 10 and the decoder 12 adjusts the compressing parameter and the decoding parameter so that a difference between the reconstruction data generated by the decoder 12 and the learning data D1 on which the compression and decoding processing are not performed is reduce on the basis of the second identification result input from the second identifier 16.”).
Regarding claim 26, 
	Itou teaches the method of claim 25, further comprising: 
providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value), and determines that data is normal when the abnormality degree is less than the second threshold value.”), 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model (Itou, P[0063], “Next, the second identifier 16 calculates an abnormality degree indicating a degree of the difference between the reconstruction data input from the decoder 12 and then the detection data D2, and performs abnormality detection of data on the basis of this abnormality degree (step S207). The second identifier 16 determines that data is abnormal (a second abnormality) when the abnormality degree is equal to or greater than a predetermined threshold value (a second threshold value), and determines that data is normal when the abnormality degree is less than the second threshold value.”).
Claims 34-36 are directed to a system comprising one or more processors executing computer program instructions that, when executed, cause the one or more processors to perform the method recited in method claims 21, 23, and 25-26. Therefore, the rejections made to method claims 21, 23, and 25-26 are applied to system claims 34-36.
Claims 37-39 are directed to a system comprising elements recited in method claims 21, 23, and 25-26. Therefore, the rejections made to method claims 21, 23, and 25-26 are applied to system claims 37-39.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 27-28, 31-33, and 40 are rejected under 35 U.S.C. 103 as being unpatentable over Itou in view of Zhou et al. (Anomaly Detection with Robust Deep Autoencoders, herein Zhou).
Regarding claim 27, 
	Itou teaches the method of claim 26.
Thus far, Itou does not explicitly teach wherein the third model generates one or more indications of which portions of the given data item and the reconstruction of the given data item are not similar, and 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more indications generated by the third model. 
Zhou teaches the third model generates one or more indications of which portions of the given data item and the reconstruction of the given data item are not similar (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA [Robust Deep Autoencoder] emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives.” See also Zhou, p. 672, Figure 4.), and 
Zhou teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more indications generated by the third model (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives.” See also Zhou, p. 672, Figure 4.).
	Both Itou and Zhou are directed to anomaly detection using deep autoencoders. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the third model in Itou to include generating one or more indications of which portions of the given data item and the reconstruction of the given data item are not similar, as disclosed in Zhou. Doing so provides the advantage of being able to visually observe anomalies in given data, as shown in Figures 1 and 4 of Zhou. Further, doing so allows for the ability to adjust parameters of the model to trade-off false positives and false negatives (Zhou, p. 670, Section 5.3).
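For illustration only, the relationship quoted from Zhou between λ and the sparsity of S can be sketched with element-wise soft-thresholding, which is the standard proximal operator of an L1 penalty. The quoted passage does not state that Zhou uses this exact operator, and the residual values and λ settings below are hypothetical.

```python
from typing import List, Sequence


def soft_threshold(values: Sequence[float], lam: float) -> List[float]:
    """Shrink each value toward zero by lam; values within lam of zero become 0."""
    out: List[float] = []
    for v in values:
        mag = abs(v) - lam
        if mag <= 0:
            out.append(0.0)
        else:
            out.append(mag if v > 0 else -mag)
    return out


def count_flagged(values: Sequence[float], lam: float) -> int:
    """Number of nonzero entries left in the sparse part S for a given lam."""
    return sum(1 for v in soft_threshold(values, lam) if v != 0.0)


# Hypothetical differences between a data item and its reconstruction.
residuals = [0.1, -0.4, 2.0, -3.0]

for lam in (0.05, 0.5, 5.0):
    print(lam, count_flagged(residuals, lam))
```

Under these assumptions a small λ leaves many entries of S nonzero (many items flagged, so more false positives), while a larger λ zeroes more entries (fewer items flagged, so more false negatives), which is the trade-off the quoted passage describes.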
Regarding claim 28, 
	The combination of Itou and Zhou teaches the method of claim 27, wherein
the third model generates one or more additional indications of which portions of the given data item and the reconstruction of the given data item are similar (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA [Robust Deep Autoencoder] emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives.” See also Zhou, p. 672, Figure 4, which shows at least four instances of generating indications of which portions of the given data item and the reconstruction of the given data item are similar.), and 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more indications and the one or more additional indications generated by the third model (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives.” See also Zhou, p. 672, Figure 4.).
Regarding claim 31, 
	The combination of Itou and Zhou teaches the method of claim 23, further comprising:
Thus far, the combination of Itou and Zhou does not explicitly teach deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item.
Zhou teaches deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased [deemphasizing one or more of the differences] to trade-off false-positives for false negatives. Accordingly, the optimal λ should balance both false-positive and false-negative rates. Thus, we use the F1-score to select the optimal λ.” See also Zhou, p. 672, Figure 4.), 
Zhou teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 672, Figure 4, “This figure shows how the sparsity of S changes with different λ values.” Figure 4 shows instances of detected anomalies, including false positives and false negatives.).
	Both Itou and Zhou are directed to anomaly detection using deep autoencoders. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Itou to include deemphasizing one or more differences between the given data item and the reconstruction of the given data item, as disclosed in Zhou, to yield predictable results of adjusting parameters for the model. Doing so provides the advantage of adjusting parameters of the model to trade-off false positives and false negatives (Zhou, p. 670, Section 5.3).
Regarding claim 32, 
The combination of Itou and Zhou teaches the method of claim 23, further comprising: 
Thus far, the combination of Itou and Zhou does not explicitly teach emphasizing one or more of the differences between the given data item and the reconstruction of the given data item, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more emphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item. 
	Zhou teaches emphasizing one or more of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error [emphasizing one or more of the differences] by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives. Accordingly, the optimal λ should balance both false-positive and false-negative rates. Thus, we use the F1-score to select the optimal λ.” See also Zhou, p. 672, Figure 4.), 
Zhou teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more emphasized differences and one or more other ones of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 672, Figure 4, “This figure shows how the sparsity of S changes with different λ values.” Figure 4 shows instances of detected anomalies, including false positives and false negatives.).
	Both Zhou and the combination of Itou and Zhou are directed to anomaly detection using deep autoencoders. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Itou to include emphasizing one or more differences between the given data item and the reconstruction of the given data item, as disclosed in Zhou, to yield predictable results of adjusting parameters for the model. Doing so provides the advantage of adjusting parameters of the model to trade off false positives and false negatives (Zhou, p. 670, Section 5.3).
Regarding claim 33,
	The combination of Itou and Zhou teaches the method of claim 23, further comprising: 
Thus far, the combination of Itou and Zhou does not explicitly teach deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item; and 
emphasizing one or more other ones of the differences between the given data item and the reconstruction of the given data item, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and the one or more emphasized differences.
	Zhou teaches deemphasizing one or more of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error by marking many images as anomalous and giving rise to many false-positives. λ then can be increased [deemphasizing one or more of the differences] to trade-off false-positives for false negatives. Accordingly, the optimal λ should balance both false-positive and false-negative rates. Thus, we use the F1-score to select the optimal λ.” See also Zhou, p. 672, Figure 4.); and 
Zhou teaches emphasizing one or more other ones of the differences between the given data item and the reconstruction of the given data item (Zhou, p. 670, Section 5.3, “As shown in Figure 4 our experiment proceeds as follows. λ is used to control the sparsity of S. In particular, a small λ places a small penalty on S, and the RDA emphasizes minimizing the reconstruction error [emphasizing one or more of the differences] by marking many images as anomalous and giving rise to many false-positives. λ then can be increased to trade-off false-positives for false negatives. Accordingly, the optimal λ should balance both false-positive and false-negative rates. Thus, we use the F1-score to select the optimal λ.” See also Zhou, p. 672, Figure 4.), 
Zhou teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the one or more deemphasized differences and the one or more emphasized differences (Zhou, p. 672, Figure 4, “This figure shows how the sparsity of S changes with different λ values.” Figure 4 shows instances of detected anomalies, including false positives and false negatives.).
	Both Zhou and the combination of Itou and Zhou are directed to anomaly detection using deep autoencoders. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Itou to include deemphasizing and emphasizing one or more differences between the given data item and the reconstruction of the given data item, as disclosed in Zhou, to yield predictable results of adjusting parameters for the model. Doing so provides the advantage of adjusting parameters of the model to trade off false positives and false negatives (Zhou, p. 670, Section 5.3).
Claim 40 is directed to a system comprising elements recited in claim 27. Therefore, the rejection made to claim 27 is applied to claim 40.
Claims 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Itou in view of Medel et al. (Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks, herein Medel).
Regarding claim 29, 
	Itou teaches the method of claim 25, further comprising:
Thus far, Itou does not explicitly teach determining pairs such that each of the pairs comprises one of the data items and the additional reconstruction of another one of the data items; 
providing the pairs to a third model to cause the third model to, with respect to each of the pairs, generate an indication of whether the corresponding data item and additional reconstruction of the pair are similar; 
providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the corresponding data item and additional reconstruction of each of the pairs are not similar, the third model updating one or more configurations of the third model based on the third model's assessment of the generated indications; and 
providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.
	Medel teaches determining pairs such that each of the pairs comprises one of the data items and the additional reconstruction of another one of the data items (Medel, p. 3, Section 3, “When using a simple encoder-decoder model, the target values of the model determine the model application. When the target output is the input [one of the data items], the model creates a reconstruction of the input video sequence. When the target output is subsequent frames [the additional reconstruction of another one of the data items], the model learns to predict the future of the video sequence. The simple encoder-decoder model is improved by combining the reconstruction and prediction models into a composite model, as seen in Fig. 3. Both the current and future video sequences are target outputs.”); 
Medel teaches providing the pairs to a third model to cause the third model to, with respect to each of the pairs, generate an indication of whether the corresponding data item and additional reconstruction of the pair are similar (Medel, p. 5, Section 4.3, “The learned models successfully reconstruct the past and predict the future. An example of reconstruction and prediction of both a regular and anomalous video sequence is depicted in Figure 5.”); 
Medel teaches providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the corresponding data item and additional reconstruction of each of the pairs are not similar, the third model updating one or more configurations of the third model based on the third model's assessment of the generated indications (Medel, p. 5, Section 4.2, “The cost function of eq. (6) was optimized with RMSProp. … We used a mini-batch of five video sequences and trained the models for up to 25,000 iterations.”); and 
Medel teaches providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item (Medel, p. 4, Section 3.2, “The quantitative evaluation algorithm considered here is based on a regularity score that is computed from the error values. The regularity score normalizes the error of the reconstruction between zero and one with respect to the other reconstructions from the same video, as different videos may have different notions of abnormality. … Video sequences containing normal events have a higher regularity score since they are similar to the data used to train the model, while sequences containing abnormal events have a lower regularity score.”), 
Medel teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model (Medel, p. 4, Section 3.2, “Distinct local minima or scores below a certain threshold from a time series of regularity scores can therefore be used to locate abnormal events.”).
	Both Itou and Medel are directed to anomaly detection using encoder-decoder neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method in Itou to include determining pairs and detecting anomalies, as disclosed in Medel. Doing so allows for the use of anomaly detection in the application of predicting the evolution of a video sequence from a small number of input frames (Medel, p. 1, Abstract).
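For illustration only, the regularity score described in the cited Medel passage (reconstruction error normalized between zero and one relative to the other reconstructions from the same video, with low scores indicating abnormal events) can be sketched as follows. This is a minimal sketch under stated assumptions: the min-max normalization, the function names, the sample errors, and the threshold value of 0.5 are hypothetical and are not taken from Medel's equations.

```python
def regularity_scores(errors):
    """Map per-frame reconstruction errors from one video to scores in [0, 1].

    Assumed form: score = 1 - (e - min) / (max - min), so the frame with
    the lowest error scores 1.0 and the frame with the highest scores 0.0.
    """
    lo, hi = min(errors), max(errors)
    span = hi - lo
    if span == 0:
        # All frames reconstruct equally well: treat every frame as regular.
        return [1.0 for _ in errors]
    return [1.0 - (e - lo) / span for e in errors]


def flag_abnormal(scores, threshold=0.5):
    """Return indices of frames whose regularity score is below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]


# Made-up per-frame reconstruction errors, for illustration only.
errors = [1.0, 2.0, 5.0, 1.0]
scores = regularity_scores(errors)
abnormal_frames = flag_abnormal(scores)
```

Normalizing within a single video, rather than across videos, reflects the quoted observation that different videos may have different notions of abnormality.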
Regarding claim 30, 
	The combination of Itou and Medel teaches the method of claim 21, further comprising:
Thus far, the combination of Itou and Medel does not explicitly teach determining subsets of data items such that each of the data item subsets comprise at least two data items of the data items; 
providing the data item subsets to the third model to cause the third model to, with respect to each of the data item subsets, generate an indication of whether the two data items of the data item subset are similar; 
providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the two data items of each of the data item subsets are not similar, the third model updating one or more configurations of the third model based on the third model's assessment of the generated indications; and 
providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item, 
wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model.
	Medel teaches determining subsets of data items such that each of the data item subsets comprise at least two data items of the data items (Medel, p. 3, Section 3, “When using a simple encoder-decoder model, the target values of the model determine the model application. When the target output is the input [one of the data items], the model creates a reconstruction of the input video sequence. When the target output is subsequent frames [the additional reconstruction of another one of the data items], the model learns to predict the future of the video sequence. The simple encoder-decoder model is improved by combining the reconstruction and prediction models into a composite model, as seen in Fig. 3. Both the current and future video sequences are target outputs.”); 
Medel teaches providing the data item subsets to a third model to cause the third model to, with respect to each of the data item subsets, generate an indication of whether the two data items of the data item subset are similar (Medel, p. 5, Section 4.3, “The learned models successfully reconstruct the past and predict the future. An example of reconstruction and prediction of both a regular and anomalous video sequence is depicted in Figure 5.”); 
Medel teaches providing given reference feedback to the third model to cause the third model to assess the generated indications against the given reference feedback, the given reference feedback indicating that the two data items of each of the data item subsets are not similar, the third model updating one or more configurations of the third model based on the third model's assessment of the generated indications (Medel, p. 5, Section 4.2, “The cost function of eq. (6) was optimized with RMSProp. … We used a mini-batch of five video sequences and trained the models for up to 25,000 iterations.”); and 
Medel teaches providing the given data item and the reconstruction of the given data item to the third model to cause the third model to assess the differences between the given data item and the reconstruction of the given data item, the third model generating an indication that the given data item and the reconstruction of the given data item are not similar based on the differences between the given data item and the reconstruction of the given data item (Medel, p. 4, Section 3.2, “The quantitative evaluation algorithm considered here is based on a regularity score that is computed from the error values. The regularity score normalizes the error of the reconstruction between zero and one with respect to the other reconstructions from the same video, as different videos may have different notions of abnormality. … Video sequences containing normal events have a higher regularity score since they are similar to the data used to train the model, while sequences containing abnormal events have a lower regularity score.”), 
Medel teaches wherein detecting the anomaly comprises detecting the anomaly in the given data item based on the indication generated by the third model (Medel, p. 4, Section 3.2, “Distinct local minima or scores below a certain threshold from a time series of regularity scores can therefore be used to locate abnormal events.”).
	Both Medel and the combination of Itou and Medel are directed to anomaly detection using encoder-decoder neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method in Itou to include determining pairs and detecting anomalies, as disclosed in Medel. Doing so allows for the use of anomaly detection in the application of predicting the evolution of a video sequence from a small number of input frames (Medel, p. 1, Abstract).
Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/B.I.R./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124