DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-28 are presented for examination.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.  However, Examiner notes that the certified copy filed of German application number DE.202018104373.0 has not been accompanied by a translation.  Therefore, priority has not been perfected, and until such a translation is furnished, Examiner will be entitled to apply art from any point up to the actual US filing date of May 24, 2019.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on May 24, 2019 and August 20, 2019 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Drawings
The drawings are objected to because (a) unlabeled boxes 31-33, 41-45, and 51-53 should be provided with descriptive labels; and (b) reference character 13 appears in the drawings but not the specification.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities: 
On page 1:
On line 2, “device, which” should be “device which”.
The sentence ending on line 11 should end with a period.
On page 2, line 25, “one each of the layers” should be “each of the layers”.
On page 4, line 6, “respectively previous layer” should be “previous layer”.
On page 6:
The paragraph spanning lines 7-9 is unintelligible.
On line 13, “in particular” is unnecessary.
On line 26, “such as, for example” should be merely “such as”.
On page 9, line 23, “counter is incremented” should be merely “counter incremented”.
Appropriate correction is required.
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: “Device for Performing Streaming Rollout of Recurrent Neural Networks.”


Claim Objections
Claims 1-28 are objected to as being so verbose and full of superfluous language and ambiguities as to impede the reader’s understanding of what is being claimed.  Examples of this language include, but are not limited to:
It is not entirely clear in claim 1 what is “being controlled as a function of the predefinable rollout.”  The output?  The calculation?  The input or the function?  For purposes of examination, it will be assumed that the calculating is what is controlled as a function of the predefinable rollout.
The language “wherein the stored commands are designed in such a way that the method carried out by the computer, when those commands are executed on the computer, runs in such a way that” is repeatedly used.  This language could be replaced simply with the word “wherein”.
The term “in particular” is frequently placed in seemingly random locations in the claims.  For example, in claim 3, “step-wise, in particular, in succession” could be simply “step-wise and in succession” without loss of meaning; similarly, “of the rollout, in particular, in each case at a predefinable point in time” could be merely “of the rollout at a predefinable moment in time”.
Claim 1 recites “each connection,” but since “connections” are not previously recited, it is not clear to what the “connections” are referring.  To the extent that Applicant is referring to the connections between layers, Examiner recommends amending the language to read “assigning to each of the layers or each of a plurality of connections between the layers a control variable”.
Claim 3 recites “the calculation of the machine learning system”.  However, the only calculation recited by the machine learning system in claim 1 is the calculation of the output variable.  If this was what was intended, Examiner recommends amending the claims to read “when controlling the calculation of the output variable of the machine learning system”.
Claim 17 recites the limitation “those connections that connect a first layer with a second layer and the first layer and the second layer are also directly connected with the aid of at least two connections”.  The only other place in the claim set where such language occurs is in claim 4, on which claim 17 is not dependent.  For purposes of examination, it will be assumed that claim 17 is dependent on claim 4 rather than on claim 14.
Claim 24 recites that “the classification takes place image element-wise, is, in particular, segmented.”  Applicant presumably means either that “the classification takes place image element-wise and is segmented” or that “the classification that takes place image element-wise is segmented”.  For purposes of examination, the former interpretation will be adopted.
Claim 25 vacillates between the simple present (“the input layer is a detected sensor variable”) and the gerund form (“the stored commands being designed in such a way”).
Examiner recommends rereading the entire claim set to identify any other potential ambiguities or superfluous language.  To the extent not specifically objected to above, the dependent claims are each objected to for dependency on objected-to claim 1. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-3, 11, and 15-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The last clause of claim 2 is nonsensical. First, it is unclear whether the claimed “step one” is the first step of the sequence that is executed “step by step”, the first of the “steps” of the method performed by the device of claim 1 (viz., assigning the predefinable rollout to the machine learning system), or one of the steps involved in ascertainment of the output variable according to the sequence.  Second, it is unclear whether “its” refers back to the “layers” (in which case it is grammatically incorrect and should be “their”), the machine learning system, or something else.  Moreover, the claim states that “the sequence characteriz[es] a sequence of ascertaining the intermediate variables or layers,” but the sequence referred to in claim 1 is also one in which “the layers each ascertain an intermediate variable”, leading to confusion about how many sequences are required by claim 2.  For purposes of examination, Examiner will assume that all recitations of a “sequence” refer to the same sequence.  The claim as a whole may mean that one layer determines its output per step.  For purposes of examination, this interpretation will be adopted.
In claim 3, the language “one each of the layers ascertains step-wise, in particular, in succession, the intermediate variable according to the sequence of the rollout, in particular, in each case at a predefinable point in time of a sequence of points in time, those layers that ascertain their intermediate variables regardless of the sequence each ascertaining their intermediate variables, in each case at each step, in particular at the respective predefinable points in time” is ungrammatical, confusing, and may be interpreted in several different ways, rendering the claim indefinite.  Examiner interprets it to mean roughly that both the hidden layer outputs that are calculated sequentially and those that are calculated non-sequentially are calculated at one of a sequence of timesteps, but it could also plausibly mean that the hidden layer outputs of the sequentially calculated layers are calculated “in succession” and that the hidden layer outputs of the non-sequentially calculated layers are calculated at one of a sequence of timesteps but not necessarily in succession.  Other interpretations not imagined by Examiner are also possible.  For purposes of examination, the first of these interpretations will be adopted.
Claim 11 recites the limitation "the step-wise ascertainment".  There is insufficient antecedent basis for this limitation in the claim.  In addition, the claim recites that “when the machine learning system is provided an input variable for the first time, it is checked after each step” whether the layers that determine their intermediate variable regardless of the sequence are each provided a previous layer’s intermediate variable.  It is unclear from the language whether this checking occurs “after each step” (of which there may be multiple) or only “when the machine learning system is provided an input variable for the first time” (which only occurs once).  Examiner interprets the language as meaning that the checking occurs at each step, including the first time an input variable has been provided.
Claims 15 and 16 recite the limitation "the subsequent time step".  There is insufficient antecedent basis for this limitation in the claims.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim 1 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Fischer et al., “The Streaming Rollout of Deep Networks – Towards Fully Model-Parallel Execution,” in arXiv preprint arXiv:1806.04965v1 (2018) (“Fischer”).
Regarding claim 1, Fischer discloses “[a] device configured to operate a machine learning system, the machine learning system including a plurality of layers, which are connected with the aid of connections (Fischer Fig. 1(a) shows a recurrent neural network (RNN) containing nodes [layers] connected by edges [connections]), the device comprising: 
a machine-readable memory element, on which commands are stored which, when executed by a computer, ensure that the computer carries out a method (Fischer p. 1, second paragraph of sec. 1 discloses that recurrent neural networks (RNNs) leverage temporal context and that RNNs may be rolled out through time disentangling the recursive dependencies and transforming the recurrent network into a feed-forward network [note that RNNs run on a computer and are stored on machine-readable memory elements]) that includes the following steps: 
assigning to the machine learning system a predefinable rollout, which characterizes a sequence, according to which the layers each ascertain an intermediate variable (Fischer sec. 1, first three paragraphs and Fig. 1 disclose that an RNN [machine learning system] may be rolled out through time by transforming the recurrent network into a feed-forward network; the rollout may be sequential, in which time steps are bridged only if necessary, or streaming, in which nodes of a time step are computationally disentangled and computed in parallel [i.e., the rollout is predefined by one of these two modalities]; paragraph spanning pp. 4-5 discloses that states of the rollout window encode which nodes have been computed so far and update steps determine the next state [set of intermediate variables] based on the previous state), 
when assigning the rollout, assigning to each connection or each layer a control variable, which characterizes whether the intermediate variable of the respective subsequent connected layers is ascertained according to the sequence or regardless of the sequence (Fischer Fig. 2 and accompanying caption and p. 4, definition of “rollout pattern and window” disclose that the networks may have different rollout patterns, each of which maps a set of edges [connections] to a bit [control variable] that determines whether an edge causes information to stream through time [i.e., ascertain the intermediate variable regardless of the sequence] or has sequential dependencies upon nodes inside a frame [i.e., determines the intermediate variable according to the sequence]), and 
calculating an output variable of the machine learning system as a function of an input variable of the machine learning system being controlled as a function of the predefinable rollout (Fischer p. 6, last paragraph before sec. 4 discloses that the inference of rollout windows starts with an initial state [input variable] and that successive applications of the update step updates all nodes until the fully updated state [output variable] is reached).”1

Claims 1-17, 19-20, and 22-28 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Carreira et al., “Massively Parallel Video Networks,” in arXiv preprint arXiv:1806.03863 (2018) (“Carreira”).  Note that, as regards claim 1, this rejection is an alternative to the one given above.
Regarding claim 1, Carreira discloses “[a] device configured to operate a machine learning system, the machine learning system including a plurality of layers, which are connected with the aid of connections (directed graph may be obtained by unrolling a video model [machine learning system] with n layers over time where layers of the network are represented by nodes and the activations transferred between layers are represented by edges [connections] of the graph – Carreira, sec. 3, section entitled “Depth-parallel networks”), the device comprising: 
a machine-readable memory element, on which commands are stored which, when executed by a computer, ensure that the computer carries out a method (Carreira appendix C discloses that certain pipelined models are memory-intensive [suggesting the existence of a memory element containing instructions to implement the network]) that includes the following steps: 
assigning to the machine learning system a predefinable rollout, which characterizes a sequence, according to which the layers each ascertain an intermediate variable (directed graph may be obtained by unrolling a video model with n layers over time where layers of the network are represented by nodes and the activations transferred between layers are represented by edges of the graph; the model may be a depth-sequential model in which the input to each layer [intermediate variable] is the output of the previous layer at the same time step, or a depth-parallel model in which every layer immediately starts processing the next input available without waiting for the whole network to finish computation for the current frame [so the depth-sequential and the depth-parallel rollouts are each “predefinable rollouts”] – Carreira, sec. 3, section entitled “Depth-parallel networks”), 
when assigning the rollout, assigning to each connection or each layer a control variable, which characterizes whether the intermediate variable of the respective subsequent connected layers is ascertained according to the sequence or regardless of the sequence2 (in basic depth-sequential video models, the input to each layer [intermediate variable] is the output of the previous layer at the same time step, and the network outputs a prediction only after all the layers have processed in sequence the current frame [i.e., the intermediate variable is ascertained according to the sequence]; in another design, every layer in the network processes its input, passes the activations to the next layer, and immediately starts processing the next input variable, without waiting for the whole network to finish computation for the current frame [i.e., the intermediate variable is ascertained regardless of the sequence]; this is achieved by substituting in the unrolled graph the vertical edges by diagonal ones, so the input to each layer is still the output from the previous layer, but from the previous time step – Carreira, sec. 3, section entitled “Depth-parallel networks”; compare also Fig. 1(a) (showing a depth-sequential model) to Fig. 1(b) (showing a depth-parallel model); note also that the claimed “control variable” of each layer/connection is the determiner of whether that layer will be processed in a depth-sequential or depth-parallel fashion, see Fig. 3 (disclosing hybrid models in which k consecutive layers are processed sequentially [according to the sequence] and the subnetworks so created are connected in parallel [regardless of the sequence])), and 
calculating an output variable of the machine learning system as a function of an input variable of the machine learning system being controlled as a function of the predefinable rollout (Carreira sec. 3, section entitled “Depth-parallel networks,” discloses that in basic depth-sequential video models, the network outputs [calculates] a prediction [output variable] only after all the layers have processed in sequence the current frame; on the other hand, in depth-parallel networks, every layer processes its input [input variable], passes the activations to the next layer, and immediately starts processing the next input available, without waiting for the whole network to finish computation for the current frame [so the temporal relationship between input and output is dependent upon whether the rollout of the network is depth-sequential or depth-parallel]).”
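
For illustration only, and forming no part of the grounds of rejection, the difference between ascertaining an intermediate variable "according to the sequence" and "regardless of the sequence," as Examiner reads Carreira Fig. 1(a)-(b), may be sketched in Python as follows. The function name rollout, the per-layer Boolean list streaming (standing in for the claimed control variable), and the toy layers are hypothetical; they appear in neither the application nor Carreira.

```python
from typing import Callable, List, Sequence

Layer = Callable[[float], float]


def rollout(layers: Sequence[Layer],
            streaming: Sequence[bool],
            inputs: Sequence[float],
            initial: float = 0.0) -> List[float]:
    """Run a chain of layers over a sequence of inputs.

    streaming[i] is a hypothetical control variable for the connection
    feeding layer i (i >= 1):
      False -> layer i reads the previous layer's value from the SAME
               time step (ascertained according to the sequence);
      True  -> layer i reads the previous layer's value from the
               PREVIOUS time step (ascertained regardless of the sequence).
    streaming[0] is ignored: the first layer always reads the current input.
    """
    n = len(layers)
    prev = [initial] * n          # intermediate variables of the previous step
    outputs = []
    for x in inputs:
        cur = [initial] * n       # intermediate variables of the current step
        for i, layer in enumerate(layers):
            if i == 0:
                source = x
            elif streaming[i]:
                source = prev[i - 1]   # diagonal edge in the unrolled graph
            else:
                source = cur[i - 1]    # vertical edge in the unrolled graph
            cur[i] = layer(source)
        outputs.append(cur[-1])
        prev = cur
    return outputs


# Toy network: three layers that each add 1 to their input.
toy_layers = [lambda v: v + 1] * 3
frames = [10, 20, 30, 40]

sequential = rollout(toy_layers, [False, False, False], frames)
parallel = rollout(toy_layers, [False, True, True], frames)
print(sequential)  # [13, 23, 33, 43]
print(parallel)    # [1.0, 2.0, 13, 23]
```

In this toy example the fully streaming rollout produces the same values as the sequential rollout, delayed by two time steps (one per streamed connection), which is consistent with the temporal relationship between input and output noted above.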

Regarding claim 2, Carreira discloses that “when assigning the rollout, each connection and each layer is assigned a control variable, the sequence characterizing a sequence of ascertaining the intermediate variables or layers (Carreira sec. 3, subsection entitled “Depth-parallel networks” and Fig. 1 and accompanying caption disclose that in basic depth-sequential video models, the input to each layer is the output of the previous layer at the same time step, and the network outputs a prediction only after all the layers have processed in sequence the current frame [i.e., every node and layer contains a variable that characterizes that the output of each node is passed sequentially to the next node at the same time step]; in a depth-parallel model, every layer in the network processes its input, passes the activations to the next layer, and immediately starts processing the next input available, without waiting for the whole network to finish computation for the current frame; this is achieved by substituting in the unrolled graph the vertical edges by diagonal ones, so the input to each layer is still the output from the previous layer at the previous time step [i.e., each layer and connection has a variable that characterizes that the output/intermediate variable of each layer will be calculated as a function of the output of the previous layer at the previous time step]), and the sequence [is] executed step by step, per step one of the layers ascertaining its output variable according to the sequence (Carreira sec. 3, subsection entitled “Depth-parallel networks” and Fig. 1 and accompanying caption disclose that in depth-sequential video models, the input to each layer is the output of the previous layer at the same time step, and the network outputs a prediction only after all the layers have processed in sequence the current frame; in a depth-parallel network, the input to each layer is the output from the previous layer, but from the previous time step [i.e., one layer computes its output per time step and passes its output to the next layer at the next time step]).”

Regarding claim 3, Carreira discloses that “when controlling the calculation of the machine learning system, … each of the layers ascertains step-wise [and] in succession[] the intermediate variable according to the sequence of the rollout … in each case at a predefinable point in time of a sequence of points in time (Carreira Fig. 1(a) and accompanying text and sec. 3, subsection entitled “Depth-parallel networks,” show that in a basic image model [i.e., model that determines intermediate variables according to the sequence], the second layer determines its output/intermediate variable for time 0 from the input at time 0, the third layer determines its output/intermediate variable for time 0 from the output of the second layer at time 0 [predefinable point in time], etc. in succession, and that the same is true at times 1, 2, 3, and 4 [sequence of points in time]), those layers that ascertain their intermediate variables regardless of the sequence each ascertaining their intermediate variables, in each case at each step, … at the respective predefinable points in time (Carreira Fig. 1(b) and accompanying text and sec. 3, subsection entitled “Depth-parallel networks,” show that in a depth-parallelization model [i.e., model that determines intermediate variables regardless of the sequence], the second layer determines its output/intermediate variable for time -2 from the input at time 0, the third layer determines its output/intermediate variable for time -1 from the output of the second layer at time -2 [predefinable point in time], etc. in succession, and that the intermediate variables corresponding to inputs at times 1, 2, 3, and 4 are determined similarly).”3  

Regarding claim 4, Carreira discloses that “the machine learning system includes at least one skip connection, which connects a first layer to a second layer (Carreira Fig. 1 and accompanying text disclose that it is possible to train the network to anticipate the correct output to reduce latency, and that this task can be made easier if the model has skip-connections; Fig. 1(d) shows a skip connection from a first layer to a fourth layer [second layer])[,] and the first layer and the second layer are also directly connected with the aid of at least two connections (Carreira Fig. 1(d) shows that the first layer is connected to the fourth layer [the “second layer” of the claim] via connections between the first and second layers, second and third layers, and third and fourth layers; note that, though the connections are between layers at different time steps, they are still connections between consecutive layers).”

Regarding claim 5, Carreira discloses that “the machine learning system includes at least one recurrent connection (Carreira sec. 3.1 discloses that an internal state may be created using any spatial recurrent module such as convolutional versions of vanilla RNNs or LSTMs [which contain recurrent connections]; see also sec. 4 (disclosing that some classification models are trained recurrently)).”  

Regarding claim 6, Carreira discloses that “those layers that ascertain their intermediate variables regardless of the sequence, ascertain their intermediate variables as a function of a … chronologically preceding intermediate variable … of a chronologically preceding calculation step, of the previous layer (Carreira Fig. 1(b) shows that, for a depth-parallel network, the input for each layer of the unrolled network is the output of the previous layer at an earlier timestep; for instance, the input to the third layer of the network at time y – 2 is the output of the second layer at time y – 3 [i.e., the output/intermediate variable was calculated at the previous layer in a chronologically preceding timestep]), [and] those layers that ascertain their intermediate variable according to the sequence ascertain[] their intermediate variable as a function of a … chronologically instantaneous intermediate variable … of an instantaneous calculation step, of the preceding layer (Carreira Fig. 1(a) discloses that, for a depth-sequential model, the input of a layer is the output [intermediate variable] of the previous layer at the same [instantaneous] time step).”

Regarding claim 7, Carreira discloses that “the machine learning system does not include a closed path4 (see Carreira Figs. 1, 3, and 4-5 and note that none of the models depicted contains a closed loop in which the output of a node is fed back into the node, nor do they contain any direct connection between the input node and the output node).”  

Regarding claim 8, Carreira discloses that “the intermediate variables of those layers, that ascertain their intermediate variable regardless of the sequence, are each ascertained in parallel (Carreira sec. 3, section entitled “Depth-parallel networks” discloses that in depth-parallel networks that substitute in the unrolled graph the vertical edges by diagonal ones, thereby making the input to each layer the output from the previous layer from the previous time step, it is possible, given enough computing cores, to process all layers at one time step in parallel).” 

Regarding claim 9, Carreira discloses that “the ascertainment in parallel of the intermediate values is carried out on processor cores connected in parallel (Carreira sec. 3, section entitled “Depth-parallel networks” discloses that in depth-parallel networks that substitute in the unrolled graph the vertical edges by diagonal ones, thereby making the input to each layer the output from the previous layer from the previous time step, it is possible, given enough computing cores, to process all layers at one time step in parallel; see also sec. 5.6, first two paragraphs (disclosing that in models implemented using TensorFlow, TensorFlow can use parallelism to run multiple operations in parallel and parallelize a single operation)).”  
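
For illustration only, and forming no part of the grounds of rejection: under the depth-parallel reading applied to claims 8 and 9, each layer's input at a given time step is the previous layer's output from the previous time step, so no layer waits on another within a step and all layers may be evaluated concurrently. A minimal sketch follows, assuming a hypothetical helper streaming_step and using Python's standard ThreadPoolExecutor merely as a stand-in for parallel processor cores (Python threads do not guarantee true multi-core execution for CPU-bound work). None of these names comes from the application or from Carreira.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Layer = Callable[[float], float]


def streaming_step(layers: Sequence[Layer],
                   prev: Sequence[float],
                   x: float) -> List[float]:
    """Compute one time step of a fully streaming (depth-parallel) rollout.

    Layer 0 reads the current input x; layer i (i >= 1) reads prev[i - 1],
    the previous layer's intermediate variable from the previous time step.
    Since every source value is already known when the step begins, the
    layers are independent of one another and are submitted to a pool.
    """
    sources = [x] + list(prev[:-1])
    with ThreadPoolExecutor(max_workers=len(layers)) as pool:
        # Executor.map preserves input order, so result i belongs to layer i.
        return list(pool.map(lambda layer, value: layer(value), layers, sources))


# Toy layers: add 1, double, subtract 3.
toy_layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

state = streaming_step(toy_layers, [0, 0, 0], 5)
print(state)   # [6, 0, -3]
state = streaming_step(toy_layers, state, 7)
print(state)   # [8, 12, -3]
```

A depth-sequential rollout could not be dispatched this way, because within a single time step each layer would have to wait for the layer before it.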

Regarding claim 10, Carreira discloses that “the intermediate variables of those layers, that ascertain their intermediate variable regardless of the sequence, are ascertained asynchronously (Carreira Fig. 1(b) shows, for instance, that input I0 is input to the first layer of the network at time y – 3, and the output of that layer is sent to the second layer at time y – 2, whose output is sent to the third layer at time y – 1, etc. [i.e., the outputs of the hidden layers/intermediate variables are calculated asynchronously]).”  

Regarding claim 11, Carreira discloses that “when the machine learning system is provided an input variable for the first time, it is checked after each step during the step-wise ascertainment according to the sequence of the output variable of the machine learning system, whether those layers that ascertain their intermediate variable regardless of the sequence are each provided an already ascertained intermediate variable of a previous layer (Carreira sec. 3, subsection entitled “Depth-parallel networks” and Fig. 1 and accompanying caption disclose that in a depth-parallel network, every layer in the network processes its input, passes the activations to the next layer, and immediately starts processing the next input available without waiting for the whole network to finish computation for the current frame; this is accomplished by substituting in the unrolled graph the vertical edges by diagonal ones, so the input to each layer is still the output from the previous layer from the previous time step [i.e., each layer that calculates its hidden layer output non-sequentially does so by first checking whether a previous layer’s output has been provided to it as input, and if so, calculating its own output based thereon; this is true at every timestep at which the layers receive the previous layer’s output from a previous timestep, including the timesteps immediately following the receipt of the initial input]).” 
 
Regarding claim 12, Carreira discloses that “a plurality of the control variables of the predefinable rollout characterize that respective intermediate variables are ascertained regardless of the sequence (Carreira Fig. 3 and sec. 3, subsection entitled “Levels of parallelism” disclose a semi-parallel model with three subnetworks of two layers, the layers of each subnetwork being traversed in sequence, and each subnetwork running in parallel with each other [i.e., the two sets of connections between subnetworks each characterize that the first intermediate variable of the second subnetwork and the first intermediate variable of the third network, respectively, are ascertained regardless of the sequence]; see also Fig. 1(b) (depicting a fully parallel network in which every intermediate variable is determined regardless of the sequence)).”  

Regarding claim 13, Carreira discloses that “during the calculation of the machine learning system, the machine learning system is provided a sequence of input variables … of an input layer of the machine learning system, in direct succession, in each case, at a time step of a sequence of time steps (Carreira Fig. 1 (a)-(d) discloses that at each timestep y0-y4 (y-3 to y1 in the case of Fig. 1(b)), one of the inputs I0-I4 of the input sequence is input to the system (e.g., in Fig. 1(a), I0 is input at time y0, and in direct succession, input I1 is input at time y1, etc.)), each layer ascertaining at each time step as a function of an input variable, the respective intermediate variable, which in each case is assigned to one of the input variables (Carreira Fig. 1(a)-(d) and accompanying caption and sec. 3, subsection entitled “Depth-parallel networks,” disclose that all of the video models send their input to a subsequent layer for calculation of an intermediate variable associated with that input variable, with the main difference among them being to which time step the output of the previous layer belongs).”

Regarding claim 14, Carreira discloses that “the machine learning system is assigned a plurality of different rollouts, in each case the calculation of the machine learning system being controlled as a function of the assigned rollouts (Carreira sec. 3, subsection entitled “Levels of parallelism” discloses that there exist models between the two extremes of fully sequential and fully parallel; these semi-parallel models are produced by grouping together contiguous layers into sequential blocks of k layers called parallel subnetworks, each of which runs independently of the others [i.e., the calculation of each parallel subnetwork is a function of the number of layers per subnetwork the rollout assigns]; Fig. 3 shows one example in which k = 2 and another in which k = 3 [so there is a plurality of different rollouts]), the controlled calculations of the machine learning system being compared with at least one predefinable comparison criterion, the predefinable rollout being selected as a function of the comparison of the rollouts (Carreira sec. 3, subsection entitled “Levels of parallelism” discloses that the existence of the semi-parallel models makes it possible to trade off accuracy and efficiency [comparison criteria; i.e., in selecting the appropriate level of parallelism in the rollout, the model that gives the best tradeoff between accuracy and efficiency for the specific task to be performed is selected]).”  

Regarding claim 15, Carreira discloses that “in the case of one of the rollouts, all connections and layers are each assigned the same control variable, so that each of the output variables is ascertained regardless of the sequence … in the subsequent time step (Carreira Fig. 1 and accompanying text and sec. 3, subsection entitled “Depth-parallel networks,” disclose that in a depth-parallel network, every layer of the network processes its input, passes the activations to the next layer, and immediately starts processing the next input variable without waiting for the whole network to finish computation for the current frame; this is achieved by making the input to each layer the output from the previous layer from the previous time step [i.e., every node in the network is configured to pass its output to the next layer at a subsequent time step, or each output variable is ascertained regardless of the sequence; the “control variable” here is the depth-parallel architecture, which applies to every layer and every connection because each and every node sends its output to a subsequent layer at a subsequent time step]).”  

Regarding claim 16, Carreira discloses that “in the case of one of the rollouts, all connections or layers are each assigned the same control variable, so that each of the output variables is ascertained regardless of the sequence … in the subsequent time step (Carreira Fig. 1 and accompanying text and sec. 3, subsection entitled “Depth-parallel networks,” disclose that in a depth-parallel network, every layer of the network processes its input, passes the activations to the next layer, and immediately starts processing the next input variable without waiting for the whole network to finish computation for the current frame; this is achieved by making the input to each layer the output from the previous layer from the previous time step [i.e., every node in the network is configured to pass its output to the next layer at a subsequent time step, or each output variable is ascertained regardless of the sequence; the “control variable” here is the depth-parallel architecture, which applies to every layer and every connection because each and every node sends its output to a subsequent layer at a subsequent time step]; note that “or” is being construed as an inclusive or).”
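
For illustration only, the distinction relied upon for claims 15 and 16 between depth-sequential and depth-parallel calculation can be sketched as follows. This sketch is not taken from Carreira or from the instant application; the function names, the toy layers, and the use of None to mark a time step at which no output is yet available are all hypothetical assumptions.

```python
from typing import Callable, List, Optional

Layer = Callable[[float], float]


def depth_sequential(layers: List[Layer], inputs: List[float]) -> List[float]:
    """Each input passes through every layer within the same time step."""
    outputs = []
    for x in inputs:
        h = x
        for layer in layers:
            h = layer(h)  # layer i consumes layer i-1's output from the SAME step
        outputs.append(h)
    return outputs


def depth_parallel(layers: List[Layer], inputs: List[float]) -> List[Optional[float]]:
    """Each layer consumes the previous layer's output from the PREVIOUS time step."""
    state: List[Optional[float]] = [None] * len(layers)
    outputs: List[Optional[float]] = []
    for x in inputs:
        prev = list(state)  # snapshot of all layer outputs at the previous time step
        for i, layer in enumerate(layers):
            src = x if i == 0 else prev[i - 1]
            state[i] = None if src is None else layer(src)
        outputs.append(state[-1])  # None until the pipeline has filled
    return outputs


# Hypothetical three-layer toy network.
toy_layers: List[Layer] = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
seq_out = depth_sequential(toy_layers, [0, 1, 2, 3, 4])
par_out = depth_parallel(toy_layers, [0, 1, 2, 3, 4])
```

Under these assumptions, the depth-parallel variant produces the same values as the depth-sequential variant, delayed by the number of layers minus one time steps, while every layer accepts a new input at every time step.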

Regarding claim 17, Carreira discloses that “when assigning the rollout, those connections that connect a first layer with a second layer and the first layer and the second layer are also directly connected with the aid of at least two connections, are assigned the control variable, so that the intermediate variable of the second layer is ascertained regardless of the sequence (Carreira Fig. 1(d) and accompanying text disclose that the model may be one of predictive depth-parallelization in which the network is trained to anticipate the correct output and the model additionally has skip connections [connections that directly connect the input layer [first layer] and the output layer [second layer], the input and output layers also being connected via three connections from the input to the first hidden layer, the first hidden layer to the second hidden layer, and the second hidden layer to the output]; note that all connections between consecutive layers in this architecture are between the output of the previous layer at a previous time step and the input of a subsequent layer at a current time step [so the intermediate variables of all layers, including the output layer, are ascertained regardless of the sequence]; here the “control variable” is a variable, applying to the entire network and thus to the connections, that assigns the predictive depth-parallel architecture with skip connections).”  

Regarding claim 19, Carreira discloses that “the rollouts are compared with one another based on the predefinable comparison criterion (Carreira sec. 3, subsection entitled “Levels of parallelism” discloses that semi-parallel models that are neither fully sequential nor fully parallel may be selected, which makes it possible to trade off accuracy and efficiency [predefinable comparison criteria]), the predefinable criterion as a function of the control of the machine learning system being ascertained as a function of the respectively assigned rollout (Carreira sec. 3, subsection entitled “Levels of parallelism” discloses that semi-parallel models that are neither fully sequential nor fully parallel may be selected, which makes it possible to trade off accuracy and efficiency [predefinable comparison criteria, which are determined as a function of the rollout]), the predefinable criterion including one or a plurality of the following listed comparison criteria:  
- a first variable, which characterizes a number of time steps required in order, starting with a first time step at which the input layer is provided the input variable, to ascertain the output variable up to a second time step, the output layer not being connected to any additional layer (Carreira sec. 3, subsection entitled “Latency and throughput,” discloses that the computational latency [first variable] is defined as the time delay [number of timesteps] between the moment a frame is fed into the network and the moment when the network outputs a prediction [output variable] for that frame; for a sequential model, throughput is the inverse of computational latency, whereas for the depth-parallel models, the model may make predictions at the rate of its slowest layer [i.e., the computational latency may be used to determine whether to select a sequential or a parallel rollout]; see also Fig. 1 (showing the input layers being fed inputs I and output layers outputting the outputs y, the output layer not being connected to any additional layer)), 
- a second variable, which characterizes how many output variables the machine learning system ascertains within a predefinable number of time steps (Carreira sec. 3, subsection entitled “Latency and throughput” discloses that throughput [second variable] is defined as the output rate of a network, or for how many frames the network outputs predictions in a time unit [predefinable number of time steps = 1]; for a sequential model, throughput is the inverse of computational latency, whereas for the depth-parallel models, the model may make predictions at the rate of its slowest layer [i.e., the throughput may be used to determine whether to select a sequential or a parallel rollout])[,] 
- a third variable, which characterizes how reliable … an accuracy of the output variable[s] of the machine learning system [is], the output variables of the machine learning system [being] with the aid of the respective rollout (Carreira sec. 3, subsection entitled “Levels of parallelism” discloses that semi-parallel models that are neither fully sequential nor fully parallel may be selected, which makes it possible to trade off accuracy [third variable] and efficiency [note that the accuracy is with respect to the output variables, see Table 1]), 
- a fourth variable, which characterizes a period of time after which a start-up phase is completed, or the classification accuracy has reached a maximum value, 
- a fifth variable, which characterizes how many connections … in direct succession[] include the same control variable (Carreira sec. 3, subsection entitled “Levels of parallelism” and Fig. 3 disclose that the semi-parallel models may have contiguous layers grouped together in blocks of k layers called parallel subnetworks [k = variable that determines how many consecutive layers are connected sequentially, or include a control variable for sequential connection]).”  
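
For illustration only, the relationship among the latency, throughput, and block-size criteria discussed above can be sketched under a simplified toy model. The model is an assumption and is not taken from Carreira or the instant application: every layer is assumed to cost exactly one time step, the layers are assumed to be grouped into equal blocks of k layers, each block is assumed to run its layers sequentially, and the blocks are assumed to be pipelined. The function name is hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def toy_latency_and_throughput(n_layers: int, k: int) -> Tuple[int, Fraction]:
    """Return (latency in time steps, outputs per time step) for the toy model.

    Assumptions (hypothetical): one time step per layer; blocks of k layers
    run sequentially inside and are pipelined against each other.
    """
    if k < 1 or n_layers < 1 or n_layers % k != 0:
        raise ValueError("k must be a positive divisor of n_layers")
    latency = n_layers            # every input still traverses all n layers
    throughput = Fraction(1, k)   # a new input is accepted once per block duration
    return latency, throughput


fully_sequential = toy_latency_and_throughput(6, 6)  # k = n: one block
semi_parallel = toy_latency_and_throughput(6, 2)     # k = 2: three blocks
fully_parallel = toy_latency_and_throughput(6, 1)    # k = 1: one layer per block
```

In this toy model, the fully sequential case (k equal to the number of layers) has a throughput equal to the inverse of its latency, whereas smaller values of k raise the throughput without changing the latency, which is one way to picture how such criteria could distinguish rollouts from one another.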

Regarding claim 20, Carreira discloses that “the rollouts are also compared with a rollout based on the predefinable comparison criterion, in which all control variables provide the processing of the results according to the sequence (Carreira sec. 3, subsection entitled “Levels of parallelism,” discloses that a set of semi-parallel models exist in between the fully parallel models and the fully sequential models [i.e., those in which all variables provide processing of results according to the sequence], which makes it possible to trade off accuracy and efficiency [predefinable comparison criteria; note that accuracy and efficiency of the semi-parallel models are compared both to each other and to the fully sequential and fully parallel models]; see also sec. 3, subsection entitled “Latency and throughput” (disclosing that computational latency and throughput are also criteria that may be used to compare models to each other), Fig. 1 and sec. 3, subsection entitled “Depth-parallel networks” (comparing depth-sequential models with depth-parallel models)).”

Regarding claim 22, Carreira discloses that “the layers of the machine learning system are in each case a layer of a deep neuronal network (Carreira sec. 3, section entitled “Pipelined operations and temporal receptive field” discloses that in a standard neural network, the temporal receptive field of a layer is a subset of the temporal receptive field of the next deeper layer in the network; stacked convolutions and temporal pooling layers may be used to increase the visual field for deeper layers [i.e., each layer is a layer of a deep neural network such as a recurrent CNN]).”  

Regarding claim 23, Carreira discloses that “the machine learning system classifies the input variable, in particular, an image sequence (Carreira sec. 4, first sentence discloses that the principles of depth-parallelization can be applied starting from two popular image classification models [i.e., the model is used for classifying an input variable]; sec. 3, first paragraph discloses that a directed graph may be obtained by unrolling a video model with n layers over time and that video processing can be parallelized by processing different frames [of the image sequence comprising the video] in different computing cores).”  

Regarding claim 24, Carreira discloses that “the classification takes place image element-wise [and] is … segmented (Carreira Fig. 1 and accompanying caption and section entitled “Latency and throughput” disclose that two properties of the network are computational latency, or the time delay between the moment when a frame [image element] is fed to the network and the moment when the network outputs a prediction for that frame, and throughput, or the number of frames for which the network outputs predictions in a time unit [i.e., the prediction/classification is segmented by image frame]).”  

Regarding claim 25, Carreira discloses that “the input variable of the input layer is a detected sensor variable (Carreira sec. 3, subsection entitled “Depth-parallel networks” discloses that, for instance, in depth-sequential video models, the input to each layer is the output of the previous layer at the same time step, and the network outputs a prediction only after all the layers have processed in sequence the current frame [so the input data consist of input variables derived from a video frame captured [detected] by a camera [sensor]]) and … a control variable is ascertained as a function of the calculation of the machine learning system (Carreira Fig. 3 and sec. 3, subsection entitled “Levels of parallelism,” disclose that between fully-sequential and fully-parallel models, there is a space of semi-parallel models in between, which makes it possible to trade off accuracy and efficiency [i.e., the choice of architecture, or the choice of variables determining whether each connection is depth-sequential or depth-parallel, depends on which model calculates the results most accurately and/or efficiently]).”  

Regarding claim 26, Carreira discloses that “the device is used for training the machine learning system (Carreira Fig. 1 and accompanying caption disclose that it is possible to train the depth-parallel network to anticipate the correct output to reduce the latency [i.e., the device trains the system]; see also sec. 3.3 (describing the training process)).” 
 
Regarding claim 27, Carreira discloses that “the device is used for a real time processing of a video with the aid of the machine learning system (Carreira Fig. 1 and accompanying caption disclose that the throughput of a basic sequential image model can be increased for real-time video processing using depth-parallelization).”  

Regarding claim 28, Carreira discloses that “the device is for controlling a calculation of the machine learning system (Carreira sec. 3, section entitled “Depth-parallel networks” discloses that in depth-sequential video models, the input to each layer is the output of the previous layer at the same time step and the network outputs a prediction only after all the layers have processed in sequence the current frame and that, by contrast, in depth-parallel networks, the input to each layer is the output from the previous layer from the previous time step [output = calculation of the machine learning system, controlled by a device]).”

Allowable Subject Matter
Subject to resolution of the above-mentioned claim objections, claims 18 and 21 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7a-5:30p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/RYAN C VAUGHN/             Examiner, Art Unit 2125

        1 Examiner notes that the applied reference, disclosed less than one year before the actual filing date, contains an inventor in common with the instant application.  Thus, Applicant may overcome this rejection by, inter alia, providing an affidavit or declaration under 37 CFR § 1.130(a) explaining the roles of additional authors Köhler and Pfeil and clearly asserting that the subject matter relied upon in the reference was the work of the instant inventor.  See MPEP §§ 2153.01(a), 2155.01.
        2 The term “regardless of the sequence” here is being construed to mean, in accordance with the definition given by the specification, that the calculations of the intermediate variables of the layers take place decoupled from the sequence.  See specification p. 2.
        3 As noted above, Examiner is interpreting the claim to mean roughly that both in the model in which the intermediate variables are determined in sequence and in the model in which the intermediate variables are determined in parallel, the intermediate variables are each determined at one of a sequence of timesteps.
        4 In accordance with the definition given on p. 3 of the specification, the term “closed path” is being construed to mean that the beginning and the end of the path are connected to one another.  However, since the network depicted in Fig. 1 connects beginning node a to ending node d via nodes b and c, Examiner construes this language to mean that the beginning and the end of the path are directly connected to one another.