DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 6/11/2021 has been entered.
 
This action is responsive to the original application filed on 6/28/2017 and the Remarks and Amendments filed on 5/10/2021.  Acknowledgement is made with regard to priority claimed to Japanese Application No. JP2016131030 filed on 6/30/2016 and Japanese Application No. JP2017118841 filed on 6/16/2017.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-8, and 11-18 are rejected under 35 U.S.C. § 103 as being obvious over Lillicrap et al. (US 20160162781 A1, hereinafter “Lillicrap”) in view of Bilenko et al. (US 20120158623 A1, hereinafter “Bilenko”) and Yosinski et al. (Yosinski et al., “Understanding Neural Networks Through Deep Visualization”, Jun. 22, 2015, Deep Learning Workshop, 31st International Conference on Machine Learning, pp. 1-12, hereinafter “Yosinski”).

Regarding claim 1, Lillicrap discloses perform a first calculation to obtain an output value of a first neural network for input data in correspondence with each category of a plurality of categories; ([0009]; “receiving a generated output at the output layer”, suggesting obtaining an output value of a first neural network; and [0007]; “a method of training a neural network having at least an input layer, a hidden layer and an output layer”, further suggesting that the output value is one received from a neural network; and [0058]; “It will be apparent that a system, such as a computer, which has a neural network trained in this manner may have many applications. An example is shown in FIG. 10, in which a 784-1000-10 network with nodes having a sigmoidal response function was trained to categorise handwritten digits”, suggesting an output is for input data in correspondence with each category in a plurality of categories such as categories of handwritten digits; and Figure 2, Element 23; the element discloses performing a first calculation to obtain an output value of a first neural network, where the output value of the first neural network is obtained through the first iteration of the training)
perform a second calculation to obtain an output value of a second neural network for the input data in correspondence with each category, the second neural network being generated by changing a designated unit in the first neural network; ([0011]-[0012]; “(d) for at least one pair of the layers, generating a change matrix, the change matrix being the product of a fixed random feedback weight matrix and the error vector, and [0012] (e) modifying the forward weight matrix for the at least one pair of the layers in accordance with the change matrix”, the change matrix containing values that affect the weight of the refined neural network; and [0017]; “The method may comprise iteratively performing steps (a) to (e) for a plurality of input values”, which suggests that the change matrix is applied to the first neural network resulting in a second neural network from which a second output is obtained from step (b); and Figure 2, Element 23; the figure discloses performing a second calculation to obtain an output value (23) of a second neural network for the input data, the second neural network being generated by changing a designated unit or weight (contained in the weight matrix) of the first neural network.  Note that the second neural network results after modifying the weight matrix in step 26 in the first iteration.  Upon the training being determined to be not complete, the output of the second neural network is then obtained at the second pass of step 23.  Again, the changing of the designated unit in the first neural network is the changing or modification of the weight matrix at step 26 in the first iteration of training)
perform a third calculation to obtain, for each category, change information representing a change between the output value obtained by the first calculation and the output value obtained by the second calculation; and ([0011]; “for at least one pair of the layers, generating a change matrix, the change matrix being the product of a fixed random feedback weight matrix and the error vector”, which discloses that the change matrix comprises change information between the output values of the two neural networks, where each neural network is produced by a respective iteration of the training procedure.  Note that the change matrix, under a broadest reasonable interpretation of the claim language, includes change information that at least comprises and represents a change between the output values of the first calculation and the second calculation in that an error vector is computed, and this error vector is included in the change matrix; and [0046]; “at step 25 a change matrix calculated from the product of the error vector and the random weight matrix”; and Figure 2, Element 25; the figure discloses, under a broadest reasonable interpretation of the claim language, performing a third calculation to obtain change information in the form of a change matrix that represents at least a change between the output value obtained by the first calculation and the output value obtained by the second calculation.  Note that in the first iteration of the training procedure, the change matrix 25 contains at least change information from the output of the first calculation, and upon a second iteration, the change matrix contains change information representing a change between the output value of the first calculation and the output value obtained by the second calculation (which, upon the second iteration, is the received output at 23)).
Lillicrap fails to explicitly disclose [a]n information processing apparatus comprising: one or more processors connected to one or more memories storing a program, the one or more processors being configured to: output, to a display device, an image of a training environment related to evaluation data; and output information representing contribution of the designated unit to the display device based on the change information obtained by the third calculation, and an indication of a related object.
Bilenko discloses [a]n information processing apparatus comprising: one or more processors connected to one or more memories storing a program, the one or more processors being configured to: ([0019]; “Furthermore, the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques”; and [0076])
output information representing contribution of the designated unit to the display device based on the change information obtained by the third calculation ([0029]; “The visualization-based system for debugging accuracy 100 may include a visualization tool 116 and various components of a machine learning system. As shown, the components of the machine learning system may include training data 102, a feature generation process 104, featurized training dataset 106, a training process 108, a trained model 110, a testing process 112, and results 114”, suggesting the display device that represents output information representing contribution of the designated unit; and [0033]; “The training process 108 may be based on executing a machine learning algorithm, whereby parameters of the model may be tuned for accuracy”, suggesting that the visualization tool uses change information in the form of parameters that are tuned for accuracy; and [0035]; “The visualization tool 116 may use the training data 102, the featurized training dataset 106, the trained model 110, and the results 114 to generate visualizations”, further suggesting the output unit in the form of a visualization tool).
Lillicrap and Bilenko are analogous art because both are concerned with machine learning applications.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the output unit of Bilenko with the calculation units of Lillicrap to yield the predictable result of an output unit configured to output information representing contribution of the designated unit to a display device based on the change information obtained by the third calculation unit. The motivation for doing so would be to provide an interactive representation of training instances and results of a machine learning system (Bilenko; [0007]).
Yosinski discloses output, to a display device, an image of a training environment related to evaluation data (Abstract; “We introduce two such tools here. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream). We have found that looking at live activations that change in response to user input helps build valuable intuitions about how convnets work. The second tool enables visualizing features at each layer of a DNN via regularized optimization in image space.”, which discloses the use of a visualization tool that outputs to a display device an image of a training environment related to evaluation data; and Page 2, Column 1; “The first tool is software that interactively plots the activations produced on each layer of a trained DNN for user-provided images or video. Static images afford a slow, detailed investigation of a particular input, whereas video input highlights the DNNs changing responses to dynamic input. At present, the videos are processed live from a user’s computer camera, which is especially helpful because users can move different items around the field of view, occlude and combine them, and perform other manipulations to actively learn how different features in the network respond. The second tool we introduce enables better visualization of the learned features computed by individual neurons at every layer of a DNN. Seeing what features have been learned is important both to understand how current DNNs work and to fuel intuitions for how to improve them”, which further discloses the outputting of the image of the training environment; and Page 3, Figure 1 and Description; the figure discloses the outputting of the training image)
an indication of a related object (Page 3, Figure 1; the figure discloses, under a broadest reasonable interpretation of the claim language, an indication of a related object such as a cat face; and Page 7, Figure 4;  the figure, again under a BRI, discloses the indications or visualizations of related objects such as a gorilla).
Lillicrap, Bilenko, and Yosinski are analogous art because all are concerned with machine learning applications.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the training image and indication of related objects of Yosinski with the apparatus of Lillicrap and display device of Bilenko to yield the predictable result of outputting, to a display device, an image of a training environment related to evaluation data; and outputting information representing contribution of the designated unit to the display device based on the change information obtained by the third calculation, and an indication of a related object in the image. The motivation for doing so would be to visualize and interpret neural nets (Yosinski; Abstract).
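
For illustration only, the three calculations recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the two-layer network, the ReLU activation, the choice of one hidden unit's incoming weights as the "designated unit", and all function names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def forward(w_hidden: Matrix, w_out: Matrix, x: Vector) -> Vector:
    """Return one output value per category for input x (assumed ReLU hidden layer)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def change_unit(w_hidden: Matrix, unit: int, change: Callable[[float], float]) -> Matrix:
    """Build the second network's hidden weights by changing every element of one unit."""
    return [
        [change(w) for w in row] if i == unit else list(row)
        for i, row in enumerate(w_hidden)
    ]


def change_information(
    w_hidden: Matrix,
    w_out: Matrix,
    x: Vector,
    unit: int,
    change: Callable[[float], float],
) -> Vector:
    """Per-category difference between the first and second networks' outputs."""
    first = forward(w_hidden, w_out, x)                                # first calculation
    second = forward(change_unit(w_hidden, unit, change), w_out, x)   # second calculation
    return [a - b for a, b in zip(first, second)]                      # third calculation
```

Under these assumptions, a call such as change_information(w_hidden, w_out, x, 1, lambda w: w + 1.0) returns one signed value per category, which a display step could then present as the contribution of unit 1.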


Regarding claim 17, it is a method claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claim 18, it is a non-transitory computer-readable storage medium claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claim 3, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein the second neural network is a neural network generated by adding a predetermined value to each element in the designated unit of the first neural network ([0012]; “modifying the forward weight matrix for the at least one pair of the layers in accordance with the change matrix”, suggesting adding a predetermined value to each element in the designated unit (modifying the forward weight matrix for the at least one pair of the layers) of the first neural network; and [0023]; “modifying each forward weight matrix in accordance with the respective change matrix”).

Regarding claim 4, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein each of the first neural network and the second neural network is a neural network including a plurality of layers, and ([0007]; “According to a first aspect of the invention there is provided a method of training a neural network having at least an input layer, a hidden layer and an output layer, and a plurality of forward weight matrices encoding connection weights between successive pairs of layers, the method comprising the steps of”, suggesting the NNs with a plurality of layers)
the second calculation uses, as a processing result by a lower layer under a layer to which the designated unit belongs in the second neural network, a processing result by the lower layer obtained when the first calculation unit obtains the output value ([0017]-[0018]; “The method may comprise iteratively performing steps (a) to (e) for a plurality of input values.  [0018] Step (e) may comprise modifying the forward weight matrix encoding connection weights between the pair of layers comprising the input layer and the hidden layer”, which suggests the second calculation unit uses, as a processing result by a lower layer (hidden layer) under a layer to which the designated unit belongs in the second neural network, a processing result by the lower layer obtained when the first calculation unit obtains the output value (which occurs after the first iteration that renders an output value from the first calculation unit)).

Regarding claim 5, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein the third calculation unit obtains, as the change information, a difference between the output value obtained by the first calculation and the output value obtained by the second calculation ([0013]; “The change matrix may be the cross product of the fixed random feedback weight matrix and the error vector”, suggesting change information contained in the change matrix that is a difference between the output value obtained by the first calculation unit and the output value obtained by the second calculation unit; and [0046]; “At step 24 an error vector is calculated from the difference between the expected output and the received output, and at step 25 a change matrix calculated from the product of the error vector and the random weight matrix”).

Regarding claim 6, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein the third calculation obtains the change information based on the output value obtained by the first calculation and information used for the change ([0013]; “The change matrix may be the cross product of the fixed random feedback weight matrix and the error vector”, suggesting wherein the third calculation unit obtains the change information based on the output value obtained by the first calculation unit and information used for the change (cross product of the fixed random feedback weight matrix and the error vector); and [0046]; “At step 24 an error vector is calculated from the difference between the expected output and the received output, and at step 25 a change matrix calculated from the product of the error vector and the random weight matrix”).

Regarding claim 7, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein the second neural network is each of neural networks generated by sequentially changing a plurality of designated units in the first neural network ([0046]; “At step 26 the connection weights of a weight matrix in the network are modified, for example by adding the change matrix and the weight matrix . . . At step 27, the network is tested to check whether the training is complete, for example when an error value is below a suitable threshold. If not, steps 22 to 26 are repeatedly performed for a plurality of inputs and corresponding expected outputs until step 27 is passed”, suggesting the sequential changing of designated units (connection weights).  This is done sequentially until a threshold is met).
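
For illustration only, steps 24 to 26 of Lillicrap as quoted above (error vector, change matrix, weight modification) can be sketched as follows. The quoted passages state only that the change matrix is a "product" of the error vector and a fixed random feedback weight matrix and that it may be added to the weight matrix; the outer product with the layer input, the learning rate, and all function names below are assumptions made for this sketch, not statements of what Lillicrap discloses.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def error_vector(expected: Vector, received: Vector) -> Vector:
    """Step 24: difference between the expected output and the received output."""
    return [t - y for t, y in zip(expected, received)]


def change_matrix(feedback: Matrix, error: Vector, layer_input: Vector,
                  rate: float = 1.0) -> Matrix:
    """Step 25 (assumed form): project the error through the fixed feedback
    matrix, then take the outer product with the input of the layer."""
    signal = [sum(b * e for b, e in zip(row, error)) for row in feedback]
    return [[rate * s * x for x in layer_input] for s in signal]


def modify_weights(weights: Matrix, change: Matrix) -> Matrix:
    """Step 26: add the change matrix to the forward weight matrix."""
    return [[w + c for w, c in zip(w_row, c_row)]
            for w_row, c_row in zip(weights, change)]
```

In this sketch the feedback matrix is passed in and never modified, reflecting the "fixed" matrix of the quoted text; repeating the three steps until an error threshold is met would correspond to the loop through step 27.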

Regarding claim 8, the rejection of claim 1 is incorporated but Lillicrap fails to explicitly disclose wherein for each category, a predetermined number of change information in descending order and information representing the designated unit corresponding to the change information are outputted.
Bilenko discloses wherein for each category, a predetermined number of change information in descending order and information representing the designated unit corresponding to the change information are outputted ([0033]; “The training process 108 may be based on executing a machine learning algorithm, whereby parameters of the model may be tuned for accuracy”, suggesting that the visualization tool uses change information in the form of parameters that are tuned for accuracy; and Figure 5B;  the figure discloses an output unit in the form of a visualization tool that outputs, for each category, a predetermined number of change information (changed false positives or changed false negatives) in descending order and information representing the designated unit corresponding to the change information).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.
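
For illustration only, the limitation of claim 8 (for each category, a predetermined number of change information in descending order together with the corresponding unit) can be sketched as follows. The dictionary layout, the unit labels, and the function name are hypothetical and are not drawn from Bilenko or from the application.

```python
from typing import Dict, List, Tuple


def top_changes(
    changes: Dict[str, Dict[str, float]], count: int
) -> Dict[str, List[Tuple[str, float]]]:
    """For each category, keep the `count` largest change values with their units."""
    result: Dict[str, List[Tuple[str, float]]] = {}
    for category, per_unit in changes.items():
        # Sort (unit, change) pairs by change value, largest first.
        ranked = sorted(per_unit.items(), key=lambda item: item[1], reverse=True)
        result[category] = ranked[:count]
    return result
```

Here `count` plays the role of the predetermined number, and each returned pair carries both the change information and the identifier of the unit it belongs to.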


Regarding claim 11, the rejection of claim 1 is incorporated but Lillicrap fails to explicitly disclose information representing a feature of the input data corresponding to the designated unit to the display device is output.
Bilenko discloses information representing a feature of the input data corresponding to the designated unit to the display device is output ([0029]; “FIG. 1 is a block diagram of a visualization-based system for debugging accuracy 100 for machine learning systems in accordance with the claimed subject matter. The visualization-based system for debugging accuracy 100 may include a visualization tool 116 and various components of a machine learning system. As shown, the components of the machine learning system may include training data 102, a feature generation process 104, featurized training dataset 106, a training process 108, a trained model 110, a testing process 112, and results 114”, suggesting an output unit (visualization tool) that further outputs information representing a feature of the input data corresponding to the designated unit (training data) to the display device).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.

Regarding claim 12, the rejections of claims 1 and 11 are incorporated but Lillicrap fails to explicitly disclose receive selection of a unit by a user, wherein the unit selected by the user is set to the designated unit and causes the display device to display the information representing the feature of the input data corresponding to the unit.
Bilenko discloses receive selection of a unit by a user, wherein the unit selected by the user is set to the designated unit and causes the display device to display the information representing the feature of the input data corresponding to the unit ([0028]; “In some embodiments, the visualization-based system for debugging accuracy may include a single-suite tool that enables a user to analyze the various components of the machine learning system. Based on this analysis, the user may make corresponding modifications. Such embodiments may improve the practice of learning system deployment and accuracy debugging”).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.

Regarding claim 13, the rejections of claims 1, 11, and 12 are incorporated but Lillicrap fails to explicitly disclose receive selection of a category by the user, wherein the display device is caused to display the information representing the feature of the input data corresponding to the unit and the category selected by the user.
Bilenko discloses receive selection of a category by the user, wherein the display device is caused to display the information representing the feature of the input data corresponding to the unit and the category selected by the user ([0028]; “In some embodiments, the visualization-based system for debugging accuracy may include a single-suite tool that enables a user to analyze the various components of the machine learning system. Based on this analysis, the user may make corresponding modifications. Such embodiments may improve the practice of learning system deployment and accuracy debugging”; and [0036]; “Using this data, the visualization tool 116 may be used to improve the learning accuracy by enabling the user to analyze each of three components of the machine learning system. Issues with the training data 102, the featurized training dataset 106, and the trained model 110 may be identified, and modified accordingly”; see also Figures 5A-5G).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.

Regarding claim 14, the rejection of claim 1 is incorporated but Lillicrap fails to explicitly disclose wherein information representing an element in the input data corresponding to the designated unit is output and the display device is caused to identifiably display the element in the input data.
Bilenko discloses wherein information representing an element in the input data corresponding to the designated unit is output and the display device is caused to identifiably display the element in the input data ([0029]; “FIG. 1 is a block diagram of a visualization-based system for debugging accuracy 100 for machine learning systems in accordance with the claimed subject matter. The visualization-based system for debugging accuracy 100 may include a visualization tool 116 and various components of a machine learning system. As shown, the components of the machine learning system may include training data 102, a feature generation process 104, featurized training dataset 106, a training process 108, a trained model 110, a testing process 112, and results 114”, suggesting an output unit further outputs information representing an element in the input data (training data) corresponding to the designated unit and causes the display device to identifiably display the element in the input data).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.

Regarding claim 15, the rejections of claims 1 and 14 are incorporated but Lillicrap fails to explicitly disclose when an importance for the element displayed on the display device is input, relearn the first neural network using a learning method with importance using the importance.
Bilenko discloses when an importance for the element displayed on the display device is input, relearn the first neural network using a learning method with importance using the importance ([0030]; “The training data 102 may be transformed into the featurized training dataset 106 by the feature generation process 104. The featurized training dataset 106 may include one instance for each instance in the training data 102. For example, the featurized training dataset 106 may include one instance for each sample email in the training data 102. Herein, the instances in the featurized training dataset 106 are also referred to as training instances”, suggesting an importance for the element displayed on the display device in the form of a featurized training dataset; and [0033]; “The featurized training dataset 106 may be input to the training process 108 to produce the trained model 110. The training process 108 may be based on executing a machine learning algorithm, whereby parameters of the model may be tuned for accuracy”, suggesting relearning the first NN using a learning method with the importance or featurized training data).
The motivation to combine Lillicrap and Bilenko is the same as discussed above with respect to claim 1.

Regarding claim 16, the rejection of claim 1 is incorporated and Lillicrap further discloses wherein the unit is one of a feature map and a neuron of a neural network ([0052]; “Each trace corresponds to a single neuron in the hidden layer of a 3-layer network and shows the angle between the forward weights vector and fixed backward weights vector for that neuron”, suggesting a neuron of a neural network; and [0052]; “a simple 3-layer network with 1000 hidden neurons trained on the MNIST dataset”).

Claim 2 is rejected under 35 U.S.C. § 103 as being obvious over Lillicrap in view of Bilenko and Yosinski and further in view of Umeda (US 20170206450 A1, hereinafter “Umeda”).

Regarding claim 2, the rejection of claim 1 is incorporated but Lillicrap fails to explicitly disclose wherein the second neural network is a neural network generated by changing all elements in the designated unit of the first neural network to 0.
Umeda discloses wherein the second neural network is a neural network generated by changing all elements in the designated unit of the first neural network to 0 ([0012]; “calculating a first output error between a label and an output in a case where dropout in which values are replaced with 0 is executed for a last layer of a first channel among plural channels in a parallel neural network”, suggesting changing all elements of a designated unit (values of a last layer of a first channel of a NN) of a first neural network to 0; and [0057]; “Dropout is processing to replace values of target nodes with 0 during feedforward, and is used for solving a problem of over learning in a DN”).
Lillicrap, Bilenko, Yosinski, and Umeda are analogous art because all are concerned with machine learning applications.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in machine learning to combine the changing of elements of a neural network to 0 as taught by Umeda with the apparatus of Lillicrap, Bilenko, and Yosinski to yield the predictable result of wherein the second neural network is a neural network generated by changing all elements in the designated unit of the first neural network to 0. The motivation for doing so would be to improve precision of classification by a parallel neural network (Umeda; [0011]).
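
For illustration only, the operation quoted from Umeda at [0057] (replacing values of target nodes with 0 during feedforward) can be sketched as follows. The function name and the use of an index set to select target nodes are assumptions made for this sketch, not details taken from Umeda.

```python
from typing import List, Set


def replace_with_zero(values: List[float], targets: Set[int]) -> List[float]:
    """Return a copy of a layer's node values with the target nodes set to 0."""
    return [0.0 if i in targets else v for i, v in enumerate(values)]
```

The input list is left unmodified, so the original values remain available for comparison with the zeroed ones.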

Allowable Subject Matter

Claims 9 and 10 are allowed.

Response to Arguments

Applicant’s arguments and amendments, filed on 5/10/2021, with respect to the 35 USC § 103 rejection of claims 1-8 and 11-18 have been considered but are moot because the arguments do not apply to any of the references being used in the current rejection to reject independent claims 1, 17, and 18.  Lillicrap, Bilenko, and Yosinski are now being used to render claims 1, 17, and 18 obvious under 35 USC § 103.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403.  The examiner can normally be reached on Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2125