Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
DETAILED ACTION
This Office Action is in response to the Applicants’ communication filed on 08/12/2022. By virtue of this communication, claims 1-4 and 7-23 are currently pending in the instant application.
	
Response to Arguments
Applicant's arguments filed on 08/12/2022 have been fully considered but they are not persuasive. 
Applicant argues in part stating “The cited language from Hu indicates that the teacher network q "is constructed" using two inputs: the student network p and structured logic rules. The cited language from Hu provides no evidence from which it can be concluded that Hu discloses "updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network." …Further, when considered in totality, Hu does not appear to provide evidence from which it can be concluded that Hu discloses "updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network."”
The Examining division respectfully disagrees. Hu shows a general framework capable of enhancing various types of neural networks, including RNNs, with declarative first-order logic rules. Hu develops an iterative (i.e. updating) distillation method that transfers the structure information of logic rules into the weights of neural networks (see abstract). The “label generator” is equated to the neural network as a whole, as it is in the instant specification (see background of instant spec.). Further, the original network is parameterized by weights (theta), and “[s]tandard neural network training has been to iteratively update (theta) to produce the correct labels of training instances” (i.e. again, iteratively updating weights based on the original weights). In other words, when Hu is applied to an RNN whose original network is parameterized using initial weight (theta) values that are then iteratively updated, each update is based on the current weights (i.e. the weights to be updated) and the original weights (i.e. the initial parameters that initialized the weight settings) (see section 3.2). Moreover, in RNNs the output of the current step becomes the input of the next step, and so on. This means that, at every stage, the model considers both the current input and all of the previous outputs, which are based on previous and original weights. Therefore, the rejection is maintained at this time.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 13 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 13 and 20 recite the limitation “…the weighted average…” There is insufficient antecedent basis for this limitation in the claims. Claims 13 and 20 should depend on claims 12 and 19, respectively (just as the similar claim 8 depends on claim 7), since those parent claims introduce “a weighted average.” Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
 	A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

	Claims 1-4, 10-11, 13, 15-18, 20 and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over D1 ZHITING HU ET AL: "Harnessing Deep Neural Networks with Logic Rules", PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (VOLUME 1: LONG PAPERS), 2016, pages 2410-2420, XP055433576, Stroudsburg, PA, USA, DOI: 10.18653/v1/P16-1228, hereinafter Hu, in view of Wierzynski (US 2017/0024641 A1).


 	Regarding Claim 1, Hu teaches “A computer-implemented method for training a neural network system ("a framework capable of enhancing general types of neural networks"; "[o]ur framework enables a neural network to learn simultaneously from labeled instances as well as logic rules"; "a natural 'side-product' of the integration is the support for semi-supervised learning where unlabeled data is used to better absorb the logical knowledge," page 1, right-hand column, lines 10-11, 27-29, and 33-36) comprising an original neural network (Figure 1 - student p_θ(y|x), page 2) and a label generator (Figure 1 - teacher q(y|x), page 2), 
the method comprises: obtaining a number of training cases comprising input data and wherein at least one training case is labeled (Figure 1 - unlabeled data / labeled data, page 2); and 
training the neural network system by a sequence of training steps where at each training step at least one of the following operations is performed: training the original network by processing a subset of the labeled training cases with labels ("Emulating the q outputs serves to transfer this knowledge into p_θ. The new objective is then formulated as a balancing between imitating the soft predictions of q and predicting the true hard labels: Eq.(2)," page 3, right-hand column, lines 14-19); 
updating weights of the label generator based on its current weights and weights of the original network in response to an outcome of the training of the original network ("we now proceed to construct the teacher network q(y|x) at each iteration from p_θ(y|x)," page 4, left-hand column, lines 10-18; "q is constructed by projecting p_θ into a subspace constrained by the rules," page 3, right-hand column, lines 9-10; "Our goal is to find the optimal q that fits the rules while at the same time staying close to p_θ," page 4, left-hand column, lines 20-21); and 
each of the operations gets performed at least once during training of the neural network system. 
However, Hu does not explicitly disclose the limitation “generating a label with the label generator for a subset of the training cases and training the original network with the generated label.”
 	In the same field of endeavor, Wierzynski discloses a method of transfer learning that includes receiving second data and generating, via a first network, second labels for the second data. In one configuration, the first network has been previously trained on first labels for first data. Additionally, the second labels are generated for training a second network (see abstract, fig. 6 and par. 0080-0082). 
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to generate labels and train an original network using the labels, as taught by Wierzynski, in the system of Hu, in order to simplify training of the neural network models (see Wierzynski par. 0008-0009).
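For illustration only, and without characterizing the scope of any claim, the teacher construction quoted from Hu in the claim 1 mapping above ("q is constructed by projecting p_θ into a subspace constrained by the rules") can be sketched as follows. The sketch assumes a closed-form projection of the form q(y|x) ∝ p_θ(y|x) · exp(−C · λ · (1 − r(x, y))) for a single rule; that form, and the names project_teacher, rule_values, C and rule_weight, are assumptions made for this example and should be checked against Hu itself.

```python
import math


def project_teacher(p_student, rule_values, C=1.0, rule_weight=1.0):
    """Build a teacher distribution q from a student distribution p.

    Assumed form (single rule): q(y) is proportional to
    p(y) * exp(-C * rule_weight * (1 - r(y))),
    where r(y) in [0, 1] is how well label y satisfies the rule.
    """
    if len(p_student) != len(rule_values):
        raise ValueError("p_student and rule_values must have equal length")
    unnormalized = [
        p * math.exp(-C * rule_weight * (1.0 - r))
        for p, r in zip(p_student, rule_values)
    ]
    total = sum(unnormalized)
    # Normalize so that q is a valid probability distribution.
    return [u / total for u in unnormalized]


# Hypothetical two-class case: the rule is fully satisfied by class 0
# and fully violated by class 1, so probability mass shifts to class 0.
student = [0.5, 0.5]
rule = [1.0, 0.0]
teacher = project_teacher(student, rule, C=1.0)
```

Under this assumed form, q is recomputed from the student's output distribution and the rule values at each iteration; no separate set of teacher weights is maintained in the sketch.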

	Claims 3 and 17 are rejected for the same reasons set forth above because the claims have similar limitations or have been addressed. 

	Regarding Claim 2, Hu teaches the limitations "The computer-implemented method of claim 1, wherein training the original network comprises minimizing a combination of a classification cost between a predicted label by the original network and the original label and a consistency cost between the predicted label by the original network and the generated label by the label generator" ("The new objective is then formulated as a balancing between imitating the soft predictions of q and predicting the true hard labels: Eq.(2), where ℓ denotes the loss function selected according to specific applications (e.g., the cross entropy loss for classification); s_n^(t) is the soft prediction vector of q on x_n at iteration t; and π is the imitation parameter calibrating the relative importance of the two objectives," page 3, right-hand column, lines 15-25).
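As a minimal sketch of the balancing described in the passage quoted above, the objective can be written per training instance as (1 − π) · ℓ(y, p) + π · ℓ(s, p), with ℓ taken to be cross entropy, y the true hard label, s the soft prediction of q, and p the prediction of the original network. The function names, and the placement of the (1 − π) and π coefficients, are assumptions made for illustration; Eq.(2) of Hu governs.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    """Cross entropy between a target distribution and a prediction."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def distillation_objective(hard_label, soft_label, predicted, pi):
    """Balance a classification cost against a consistency cost.

    hard_label: one-hot true label (classification term).
    soft_label: soft prediction produced by the teacher q (consistency term).
    pi: imitation parameter in [0, 1] weighting the two terms.
    """
    classification_cost = cross_entropy(hard_label, predicted)
    consistency_cost = cross_entropy(soft_label, predicted)
    return (1.0 - pi) * classification_cost + pi * consistency_cost


# Hypothetical values for a two-class instance.
loss = distillation_objective(
    hard_label=[1.0, 0.0],
    soft_label=[0.6, 0.4],
    predicted=[0.8, 0.2],
    pi=0.5,
)
```

With pi set to 0 the sketch reduces to ordinary supervised training on the hard labels, and with pi set to 1 the original network only imitates the soft predictions of q.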

	Claims 4 and 18 are rejected for the same reasons set forth above because the claims have similar limitations or have been addressed. 

 	Regarding Claim 10, Hu teaches the limitations "The computer-implemented method of claim 1, wherein the weights of the label generator are initialized to match the weights of the original network” (see section 3.2, where again the neural network training iteratively updates the weights, and where at the first instance (i.e. before the first iteration) the weights have not been updated and match the original network).

	Claims 15 and 22 are rejected for the same reasons set forth above because the claims have similar limitations or have been addressed. 

 	Regarding Claim 11, Hu teaches the limitations "The computer implemented method of claim 1, wherein generating a label with the label generator comprises mutating at least some outputs of intermediate layers of the label generator” (see section 3.2, where Hu shows that standard neural network training has been to iteratively update (theta) to produce correct labels of training instances (i.e. updating weights of intermediate layers is equated to mutating outputs of intermediate layers), and that in each iteration q is constructed by projecting p_θ into a subspace constrained by the rules, and thus has desirable properties).

	Claims 16 and 23 are rejected for the same reasons set forth above because the claims have similar limitations or have been addressed. 


Allowable Subject Matter
Claims 7-9, 12-14, and 19-21 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims (also considering the dependency correction above in the 112 rejection for claims 13 and 20).
 	The following is an examiner’s statement of reasons for allowance: A search was conducted with regard to the subject matter defined in claims 7 and 9, namely “The computer-implemented method of claim 1, wherein updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network comprises determining each of the weights of the label generator as a weighted average of a corresponding one of the current weights of the label generator and a corresponding one of the weights of the original network” and “The computer-implemented method of claim 1, wherein the updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network comprises updating the weights of the label generator based only on the current values of the weights of the label generator and corresponding ones of the weights of the original network.”
 	Examiner has found prior art in the same field of endeavor in Hu and Wierzynski (see rejection to claim 1 above). 
 
The prior art does not teach “wherein updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network comprises determining each of the weights of the label generator as a weighted average of a corresponding one of the current weights of the label generator and a corresponding one of the weights of the original network” or “wherein the updating the weights of the label generator based on its current weights and the weights of the original network in response to an outcome of the training of the original network comprises updating the weights of the label generator based only on the current values of the weights of the label generator and corresponding ones of the weights of the original network.” 
Furthermore, claims 7-9, 12, 14, 19 and 21 would not have been obvious to a person of ordinary skill in the art in view of the prior art of record and therefore involve an inventive step. When all of the limitations are considered in combination, none of the prior art discloses the features as claimed.
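For illustration only, and without limiting or construing the claims, the weighted-average limitation quoted above can be read as a per-weight update in which each label generator weight is replaced by a weighted average of its own current value and the corresponding weight of the original network. The following is a minimal sketch of one such reading; the function name update_label_generator and the coefficient alpha are hypothetical and are not taken from the claims or the prior art of record.

```python
def update_label_generator(generator_weights, original_weights, alpha=0.99):
    """Per-weight weighted average of two weight lists.

    Each new label generator weight depends only on its current value and
    the corresponding weight of the original network. alpha is an assumed
    averaging coefficient in [0, 1]; the claims do not fix any value.
    """
    if len(generator_weights) != len(original_weights):
        raise ValueError("weight lists must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        alpha * g + (1.0 - alpha) * w
        for g, w in zip(generator_weights, original_weights)
    ]


# Hypothetical weights after one training step of the original network.
generator = [1.0, 0.0]
original = [0.0, 1.0]
generator = update_label_generator(generator, original, alpha=0.75)
```

In this reading the update uses no gradients, rules, or output distributions, only the two sets of weights.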
 	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID BILODEAU whose telephone number is (571)270-3192.  The examiner can normally be reached Monday-Thursday 8:00am-6:00pm Pacific Standard Time.  
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wesley Kim can be reached at (571) 272-7867.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/David Bilodeau/
Primary Examiner, Art Unit 2648