DETAILED ACTION
Claims 1-20 are currently pending in application 16/281737, filed 12 June 2019. All references in the IDS have been considered. It is noted that a translated version of the foreign priority document (KR10-2018-0020005) has not been filed to establish the 2 February 2018 priority date.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Each of claims 6 and 16 is objected to because of the following informalities:  
Each of claims 6 and 16 recites “having a bit value of 1 a pattern of bits in the dropout information having …”, which should instead read “having a bit value of 1, a pattern of bits in the dropout information having …”.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
encoder in claim 1.
selector in claim 10.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea and because the claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more than the abstract idea. See Alice Corporation Pty. Ltd. v. CLS Bank International, et al., 573 U.S. (2014).
As an initial matter, according to the first part of the Alice analysis (Step 1), the claims were determined to be directed to one of the four statutory categories: an article of manufacture, a method/process (claims 11-20), a machine/system/product (claims 1-10), and/or a composition of matter.
Secondly, because the claims have been determined to fall within one of the four categories (i.e., process, machine, manufacture, or composition of matter), it must be determined whether the claims are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) (Step 2A). This step consists of a two-prong inquiry: (1) Does the claim recite an abstract idea, law of nature, or natural phenomenon? and (2) Does the claim recite additional elements that integrate the judicial exception into a practical application?
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite mathematical concepts. This judicial exception is not integrated into a practical application because the generically recited computer elements do not add meaningful limitations. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as discussed in the following analysis.
Regarding independent claims 1 and 11, the following analysis shows that the limitations recite the judicial exception of an abstract idea in the mathematical concepts and mental processes groups and do not recite additional elements that integrate the judicial exception into a practical application.

Claim 1 does not satisfy the two-Prong Test as explained in the analysis of each limitation below:
Step 2A
Prong 1:
An encoding … comprising: … a random number sequence generated by a random number generator; (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of generating a random number sequence using a random number generator. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
and an … to … dropout information of a …, the dropout information indicating a ratio between connected edges and disconnected edges of a plurality of edges included in a layer of the deep neural network, (Yes) The claim, under its broadest reasonable interpretation, recites information used to perform functions involving mathematical concepts (i.e., the mathematical steps of generating edge sequences). The mere recitation of a generic computer device to process this information does not take the claim limitation out of the mathematical concepts group.
generate an edge sequence indicating connection or disconnection of the plurality of edges based on the dropout information and the random number sequence, (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of generating an edge sequence with connections or disconnections according to the random number sequence. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
 and … the edge sequence for reconfiguring the connection or disconnection of the plurality of edges. (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of configuring the connection or disconnections of edges according to a mathematical output. The mere recitation of a generic computer device to perform this reconfiguration does not take the claim limitation out of the mathematical concepts group.
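For illustration only, and not as a characterization of the applicant's disclosed implementation, the operations recited in the limitations above can be sketched as a short numerical routine. The function name, the use of a fraction of disconnected edges as the dropout information, and the seeded software generator standing in for the random number generator are all assumptions made for this sketch.

```python
import random


def generate_edge_sequence(num_edges, disconnected_ratio, seed=0):
    """Return one bit per edge: 1 = connected, 0 = disconnected.

    Hypothetical sketch: `disconnected_ratio` stands in for the claimed
    dropout information, and `random.Random(seed)` stands in for the
    claimed random number generator.
    """
    rng = random.Random(seed)
    # Random number sequence with one entry per edge in the layer.
    random_sequence = [rng.random() for _ in range(num_edges)]
    # Number of edges to disconnect so that the requested ratio is met.
    num_disconnected = round(num_edges * disconnected_ratio)
    # Disconnect the edges that drew the smallest random values.
    order = sorted(range(num_edges), key=lambda i: random_sequence[i])
    edge_sequence = [1] * num_edges
    for i in order[:num_disconnected]:
        edge_sequence[i] = 0
    return edge_sequence
```

Under these assumptions, the output depends only on the ratio, the number of edges, and the random values, which is consistent with the characterization of the limitations as mathematical steps.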
Prong 2 (No): The claim recites additional elements:
An encoding apparatus comprising: a memory storing … encoder configured - The encoding apparatus, including the encoder circuit with memory used to receive dropout information and to perform the mathematical steps of generating edge sequences for processing in a neural network layer, is recited at a high level of generality and is no more than mere instructions to apply the exception using a generic computer component. As noted above, the “encoder” in the claims is being interpreted as a generic placeholder without the recitation of sufficient accompanying structure to perform the function; a review of the specification shows that the following appears to be the corresponding structure described in the specification ([0016, 0059] “The encoder may generate the edge sequence based on a pattern of bits in the random number sequence having a bit value of 0 and bits in the random number sequence having a bit value of 1 a pattern of bits in the dropout information having a bit value indicating a connected edge and bits in the dropout information having a bit value indicating a disconnected edge.” and “Referring to FIG. 1A, the encoding apparatus 100 includes a memory 110 and an encoding circuit 130 or encoder. In addition, the encoding apparatus 100 may be connected to a learning circuit 120 performing learning of a deep neural network. The encoding apparatus 100 may receive information output during an operation process of the learning circuit 120 and may also transmit information generated in the encoding apparatus 100 to the learning circuit 120.”)
Receive … output - The accessing and outputting of data are data gathering operations, which are insignificant extra-solution activities that do not take the claim out of the mental processes and methods of organizing human behavior groups (see MPEP 2106.05(g)).
deep neural network … The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment.
None of these four additional elements integrate the judicial exception into a practical application because the computing devices and the training of a machine learning model are recited at a high level of generality and correspond to generic computer functions.  
In addition, according to the second part of the Alice/Mayo test (step 2B), it must be determined whether the claim as a whole recites something significantly more than the judicial exception, when its elements are considered both individually and as an ordered combination. The recitation in the preamble is insufficient to transform a judicial exception into a patentable invention because the preamble elements are recited at a high level of generality and are simply linked to a field of use, see MPEP 2106.05(h). The examiner further notes that the claim limitation(s) below are deemed insufficient to transform a judicial exception into a patentable invention, as described in the analysis that follows below:
The elements in the limitations below are insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer (mobile computing device) implemented method, processing resources as noted above.
Receive … output - It is first noted that, in addition to being a functional step performed by generic computer components, the extra-solution activity of data gathering (receiving/outputting data) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(g)(3)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
deep neural network … as noted above. 
As discussed in the step 1, 2A Prongs 1 and 2, and 2B analyses, the limitations of claim 1, examined individually or as an ordered combination, recite no meaningful limitations that amount to significantly more than the exception itself. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology. Therefore, when looking at the claim elements individually or as an ordered combination, claim 1 does not recite elements identified by the courts as “significantly more”.
Independent claim 11 recites elements similar to those analyzed in claim 1 above and is rejected for the same reasons as claim 1. Specifically, according to the second part of the Alice/Mayo test (step 2B), it must be determined whether the claim as a whole recites something significantly more than the judicial exception, when its elements are considered both individually and as an ordered combination. The recitation in the preamble is insufficient to transform a judicial exception into a patentable invention because the preamble elements are recited at a high level of generality and are simply linked to a field of use, see MPEP 2106.05(h). The examiner further notes that the claim limitations below are deemed insufficient to transform a judicial exception into a patentable invention, as described in the analysis that follows below:
The elements in the limitations below are insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g): Generic computer components recited at a high level of generality, namely,
encoding method of an encoding apparatus:
As discussed in the step 1, 2A Prongs 1 and 2, and 2B analyses, the limitations of claim 11, examined individually or as an ordered combination, recite no meaningful limitations that amount to significantly more than the exception itself. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology. Therefore, when looking at the claim elements individually or as an ordered combination, claim 11 does not recite elements identified by the courts as “significantly more”.
Furthermore, regarding dependent claims 2-10, which depend from claim 1, the recited limitations do not include elements identified by the courts as “significantly more”. The examiner notes that the elements of the dependent claims are deemed insufficient to transform a judicial exception into a patentable invention and are considered part of the abstract idea, as noted below:
Claim 2:
Step 2A
Prong 1 (Yes):
wherein the random number sequence is based on … of the random number generator (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of generating a random number sequence using the clock signal of the random number generator. The mere recitation of a generic computer device to perform these steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
a clock signal - The clock signal used by the random number generator to perform the mathematical steps of generating a random number is recited at a high level of generality and is no more than mere instructions to apply the exception using a generic computer component.
These additional elements do not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The elements in the limitations below are insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computing components as noted above. In addition, it is noted that the extra-solution activity of using a clock signal to generate a random number is routine and well known (see, for example, Yeoh et al. (“A Hardware-Oriented Dropout Algorithm for Efficient FPGA Implementation”, ICONIP 2017, Part VI, LNCS 10639, 2017, pp. 821-829) – viz. ([p. 823, Section 3, Table 2] “To calculate Eq. (3) on an FPGA, a hardware RNG and a comparator should be implemented. However, for hardware implementation, the biggest problem is that the RNGs in FPGA consume a large number of resources [10]. Another issue is parallelism. Equation (3) is repeated to generate the whole dropout mask for all i. This looping process is not favorable to the FPGA implementation because it is serial processing and slows down the process. Multiple RNGs are required for parallel processing, which massively increases the consumption of FPGA resources.” and “In Table 2, we observe that with the proposed method, the number of required registers and look-up tables (LUTs) were much less in comparison to ordinary RNG-based methods. For the serial implementation, it took less resources; however, the number of clock cycles was increased, which caused the FPGA advantage to be lost.”) in which a new method of generating random signals using a clock cycle of a random number generator is compared to more conventional methods for generating that sequence, also using a clock cycle.)
Claim 3:
Step 2A
Prong 1 (Yes):
wherein a size of the random number sequence is based on a number of the plurality of edges in the layer of the …. (Yes) The claim, under its broadest reasonable interpretation, recites additional details of the mathematical step of generating a random number sequence according to the number of edges in a layer of a neural network. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
deep neural network - The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment.
These additional elements do not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above. 
deep neural network – as noted above.
Claim 4:
Step 2A
Prong 1 (Yes):
wherein the dropout information, the random number sequence, and the edge sequence have a bit width formed of binary numbers. (Yes) The claim, under its broadest reasonable interpretation, recites additional details of the mathematical step of generating a random number sequence according to the dropout information, in which the random number sequence and edge sequence are binary numbers with a bit width. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim does not recite additional elements.
Step 2B
The claim does not recite additional elements that the courts have identified as “significantly more” for the same reasons as pointed out in claim 2.

Claim 5:
Step 2A
Prong 1 (Yes):
wherein the … is further configured to generate the edge sequence based on a first ratio of bits in the random number sequence having a bit value of 0 to bits in the random number sequence having a bit value of 1, and a second ratio of bits in the dropout information having a bit value indicating a connected edge to bits in the dropout information having a bit value indicating a disconnected edge. (Yes) The claim, under its broadest reasonable interpretation, recites the generation of an edge sequence based on a ratio of bits with 0 or 1 value in the edge sequence and in the dropout information. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
encoder … The encoding apparatus, including the encoder circuit used to receive dropout information and to perform the mathematical steps of generating edge sequences according to ratios of bits with 1 and 0, is recited at a high level of generality and is no more than mere instructions to apply the exception using a generic computer component.
This additional element does not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.  
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above. 
Claim 6:
Step 2A
Prong 1 (Yes):
wherein the … is further configured to generate the edge sequence based on a pattern of bits in the random number sequence having a bit value of 0 and bits in the random number sequence having a bit value of 1 a pattern of bits in the dropout information having a bit value indicating a connected edge and bits in the dropout information having a bit value indicating a disconnected edge. (Yes) The claim, under its broadest reasonable interpretation, recites the generation of an edge sequence based on a pattern of bits with 0 or 1 value in the edge sequence. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
encoder … The encoding apparatus, including the encoder circuit used to receive dropout information and to perform the mathematical steps of generating edge sequences according to the pattern of bits with 1 and 0, is recited at a high level of generality and is no more than mere instructions to apply the exception using a generic computer component.
This additional element does not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.  
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above.  

Claim 7:
Step 2A
Prong 1 (Yes):
wherein the size of the random number sequence is equal to a quantity of edges in the layer of the …. (Yes) The claim, under its broadest reasonable interpretation, recites additional details of the mathematical step of generating a random number sequence according to the correspondence between the size of the random number sequence and a number of edges in the layer of the neural network. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
deep neural network - The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment.
These additional elements do not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above. 
deep neural network – as noted above.
Claim 8:
Step 2A
Prong 1 (Yes):
wherein the edge sequence is a basis for a dropout operation in the layer of the …. (Yes) The claim, under its broadest reasonable interpretation, recites additional details of the mathematical step of generating a random number sequence that is used to perform a dropout operation in a layer of a neural network (also a mathematical calculation step). The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
deep neural network - The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment.
These additional elements do not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above. 
deep neural network – as noted above.
Claim 9:
Step 2A
Prong 1 (Yes):
wherein the … is further configured to … weights of the plurality of edges in the layer of the …, perform a pruning operation based on a result of comparing the weights with a preset threshold weight, and generate the edge sequence to indicate connection or disconnection of the plurality of edges of the layer of the … based on the pruning operation. (Yes) The claim, under its broadest reasonable interpretation, recites the generation of an edge sequence based on the mathematical operation of pruning weights of edges in a layer of a neural network based on the mathematical operation of comparing a weight with a threshold. The mere recitation of a generic computer device to perform this generation does not take the claim limitation out of the mathematical concepts group.
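For illustration only, the threshold comparison recited in this limitation can be sketched as a single element-wise operation. The function name and the choice to compare weight magnitudes (rather than signed values) are assumptions made for this sketch and are not taken from the claim or the specification.

```python
def prune_edge_sequence(weights, threshold):
    """Return one bit per edge: 1 = keep (connected), 0 = prune (disconnected).

    Hypothetical sketch: an edge is kept when the magnitude of its weight
    is at least the preset threshold weight.
    """
    return [1 if abs(w) >= threshold else 0 for w in weights]
```

For example, with weights 0.5, -0.01, 0.2 and 0.0 and a threshold of 0.1, the sketch keeps the first and third edges and prunes the others.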
Prong 2 (No): The claim recites additional elements:
encoder … The encoding apparatus, including the encoder circuit used to receive dropout information and to perform the mathematical steps of generating edge sequences according to the weight-based pruning operation, is recited at a high level of generality and is no more than mere instructions to apply the exception using a generic computer component.
obtain - The accessing/obtaining and outputting of data (weights) is a data gathering operation, which is insignificant extra-solution activity that does not take the claim out of the mental processes and methods of organizing human behavior groups (see MPEP 2106.05(g)).
deep neural network … The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment.
This additional element does not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.  
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above.  
obtain - It is first noted that, in addition to being a functional step performed by generic computer components, the extra-solution activity of data gathering (receiving/outputting/obtaining data) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(g)(3)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
deep neural network … as noted above. 
Claim 10:
Step 2A
Prong 1 (Yes):
further comprising a … configured to … and … the selected signal, wherein the … is further configured to … an operation result from the … to determine whether overflow has occurred in the operation result and perform a dynamic fixed point operation of modifying an expressible range of information used in the … based on whether overflow has occurred. (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of selecting a signal for output (i.e., performing mathematical comparison operations for selecting a signal for output) and of determining if overflow has occurred (a counting operation) and of performing a dynamic fixed point operation to modify the dynamic/expressible range of information according to the occurrence of the result of the overflow calculation. The mere recitation of a generic computer device to perform these steps does not take the claim limitation out of the mathematical concepts group.
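For illustration only, the overflow determination and dynamic fixed point adjustment recited in this limitation can be sketched as follows. The 8-bit word width, the function names, and the rule of giving up one fractional bit whenever any overflow is counted are assumptions made for this sketch and do not describe the applicant's disclosed circuit.

```python
def quantize(values, frac_bits, word_bits=8):
    """Quantize to signed fixed point and count overflows (hypothetical sketch)."""
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    quantized = []
    overflows = 0
    for v in values:
        q = round(v * scale)
        if q < lo or q > hi:
            overflows += 1            # value does not fit in the word
            q = max(lo, min(hi, q))   # saturate to the representable range
        quantized.append(q)
    return quantized, overflows


def adjust_frac_bits(values, frac_bits, word_bits=8):
    """Widen the expressible range by one bit if any overflow occurred."""
    _, overflows = quantize(values, frac_bits, word_bits)
    if overflows > 0 and frac_bits > 0:
        return frac_bits - 1  # fewer fractional bits, larger integer range
    return frac_bits
```

In this sketch the overflow determination is a count and the range modification is an integer decrement, which is consistent with the characterization of these steps as mathematical operations.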
Prong 2 (No): The claim recites additional elements:
… selector …  The selector apparatus, used to perform the mathematical operation of selecting a signal according to mathematical comparative operations data gathering steps of obtaining and outputting a selected signal is recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer component. As noted above, the “selector” in the claims is being interpreted as a generic placeholder without the recitation of sufficient accompanying structure to perform the function; a review of the specification shows that the following appears to be the corresponding structure described in the specification ([0166, 0168, 0175] “Referring to FIG. 14, the selector 1440 may obtain certain information from each of the register 1442 and the random number generation circuit 1444 and selectively output the two pieces of information. According to an embodiment of the disclosure, the selector 1440 may include a multiplexer that obtains a certain selection signal of n bits (n > 0) … That is, the selector 1440 may selectively output input information of 2^n types that may be determined based on an n-bit selection signal., In detail, in order for the learning circuit 1430 to perform the above-described dropout operation, the selector 1440 may operate such that a signal generated in the random number generation circuit 1444 is output., 42According to an embodiment of the disclosure, the encoding apparatus 1500 connected to the selector 1540 may selectively use at least one of an operation of determining a second edge sequence based on a result of comparing a first edge weight stored in the register 1542 with a preset threshold weight, an operation of determining a second edge sequence based on a result of comparing a first edge sequence stored in the memory 1510 with a random number sequence obtained by using the random number generation circuit 1546, or a learning process performed using the counter 1544 calculating a number of 
times of overflow of an intermediate calculation result stored in the register 1542.”)
… select one of a plurality of types of input signals … output and receive The selection of a signal for outputting and the reception of that signal are data gathering operations which are insignificant extra-solution activities that do not take the claim out of the mental processes and methods of organizing human behavior groups (see MPEP 2106.05(g)).
deep neural network … The deep neural network that has dropout information used in the performance of the mathematical steps is recited at a high level of generality and merely generally links the judicial exception to a particular technological environment. 
These additional elements do not integrate the judicial exception into a practical application because the computing devices are recited at a high level of generality and correspond to generic computer functions.  
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above.  
… select one of a plurality of types of input signals … output … receive It is first noted that, in addition to being a functional step performed by generic computer components, the extra-solution activity of data gathering (receiving/outputting/obtaining data) is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(g)(3)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
deep neural network … as noted above


Therefore, as a whole, claims 2-10 do not recite what the courts have identified as “significantly more”.

Dependent claims 12-20, which depend from claim 11, are rejected for the same reasons as indicated above for claims 2-10, respectively.  

In summary, as shown in the analysis above, claims 1-20 do not provide any additional elements that, when considered individually or as an ordered combination, amount to significantly more than the abstract idea identified. Therefore, as a whole, claims 1-20 do not recite what the courts have identified as “significantly more”. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology when the claims are considered individually or as an ordered combination.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.



Claims 1 and 11 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Felix et al. (US2019/0121639, Filed 1 February 2018 with Foreign application Priority Date of 20 October 2017), hereinafter referred to as Felix. 

In regards to claim 1, Felix teaches An encoding apparatus comprising: a memory storing a random number sequence generated by a random number generator; ([0009, 0010, 0031, 0046, Figure 2] According to an aspect of the invention there is provided an execution unit for executing a computer program comprising a sequence of instructions , the sequence including a masking instruction, wherein the executed unit is configured to execute the masking instruction which when executed by the execution unit masks randomly selected values from a source operand having n values …., The execution unit may comprise a hardware pseudo random number generator (HPRNG) configured to generate a randomised bit string from which random bit sequences are derived for randomly selecting the values to be masked., The instruction is referred to herein as the rmask instruction. The execution unit 2 forms part of a pipeline 4 in a processing unit . The processing unit comprises an instruction fetch unit 6 which fetches instruction from an instruction memory., The above syntax means that bit wpb [ 0 ] is assigned the value ‘ l ’ if cf0 < scr1 . Here , cfo is a 16 - bit unsigned random value , which can have values in the range { 0 . . . 
65535 } ; src1 is a 32 - bit wide operand but only the least significant 17 - bits are used for rmask ;, wherein an apparatus (Figure 2) generates a sequence of random numbers that determine the masking or unmasking of (edge connection) values in a layer of a neural network (an encoding operation) in which that set of random numbers is stored in an (instruction) memory for use in that masking/unmasking operation such that the random number sequence is the sequence of bit fields cf used to determine the masking instruction or, alternatively, the wpb’s that express the masking instruction.), and an encoder configured to receive dropout information of a deep neural network, the dropout information indicating a ratio between connected edges and disconnected edges of a plurality of edges included in a layer of the deep neural network,  ([0027, 0035, 0046, 066, Figure 4] In each case, the rmask instruction has the effect of masking (putting to 0) randomly selected values in the source vector. Such a function has a number of different applications, including in particular functions which have been described above known in the art of neural nets as ’Drop connect’ and ’Drop out’., An input buffer location 26 is provided to hold a probability with which each individual value within the vector will be kept . This probability value may be provided in the instruction ( as shown in FIG . 3 ) , or could be set by an earlier instruction and accessed from a register or memory address with an rmask instruction . …The rmask module uses the random number generated by the PRNG 22 and the probability value 26 to randomly mask ( put to 0 ) values of the input vector., The rmask module uses the random number generated by the PRNG 22 and the probability value 26 to randomly mask (put to 0) values of the input vector. The output vector is placed into an output buffer 28. In Figure 2, the first and third values are shown as having been masked to 0. 
Note that the probability is the probability with which each individual value will be kept (i.e. not masked). For example, if the probability is set to .5, then each individual value has a probability of .5 that it will be masked.,  the above syntax means … srcl is allowed values in the range { 0 to 65536 } , such that the probability of being ‘ unmasked ' is src1 / 65536 . The rmask instruction has a number of different applications . For example , when implementing Drop out it could be used just prior to writing neural outputs to memory . All neurons in the next stage of the neural net would then pick up the masked activations from the previous layer ., wherein the sequence of numbers corresponding to the random masking or unmasking of edge connections in a layer of a neural network (cf’s or wpb’s) is determined according to the dropout probability (the probability of keeping the edge connections) associated with that layer in which this probability is a ratio between the number of connected edges and the total number of edges such that the edge connection determination is inherently also based on the ratio of the connected edges to the disconnected edges (namely, 1/(1+1/(connected/disconnected))) and such that this probability is an indication at least in a probabilistic sense of the number of connected edges and the number of disconnected edges (i.e., inherently the ratio) in which, it is noted, the occurrence of an exact correspondence is also not excluded in this probabilistic framework, wherein the src1 fields are also dropout information since they provide a bit-level representation of the dropout probability, and wherein this dropout connection implementation framework encompasses deep neural network configurations because the disconnections are determined across multiple layers/stages of that neural network.) 
generate an edge sequence indicating connection or disconnection of the plurality of edges based on the dropout information and the random number sequence, ([0010, 0031, 032, Figure 2] The execution unit may comprise a hardware pseudo random number generator ( HPRNG ) configured to generate a randomised bit string from which random bit sequences are derived for randomly selecting the values to be masked ., FIG . 2 shows a schematic block diagram of an execution unit arranged to execute a single instruction for masking randomly selected values in a vector . The instruction is referred to herein as the rmask instruction, In each case , the rmask instruction has the effect of masking ( putting to 0 ) randomly selected values in the source vector… . As already explained , in a neural net a vector representing weights may be provided , these weights representing links between neurons in the net ., wherein the random number sequence (cf or wpb) and the dropout information (e.g., src) are used to generate dropout-based masking/unmasking of values/instructions (wpb relative to the random sequence cf) or to generate the weight values (the sequence of masked/unmasked weights based on the random sequence wpb) that is an edge sequence used to form a corresponding sequence of masking/unmasking weight values, each weight corresponding to an edge connection between successive layers in a neural network, to generate a sequence of connected or disconnected edges (and wherein it is noted that the sequence of weight values thereby masked or unmasked is also and alternatively the edge sequence.) and output the edge sequence for reconfiguring the connection or disconnection of the plurality of edges.  ([0014, 0048-0052, 0067, Figure 2] The values of the source operand may represent weights of links in a neural network ( e . g . to implement Drop connect )., Subsequently the four bits wpb [ 0 . . 
3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively , thus :…, When implementing Drop connect , the rmask instruction could be used just after reading activations from memory , just prior to calculating the dot product ., wherein the edge sequence (the wpb’s or the masked/unmasked set of weights – 28 in Figure 2) is output for configuring the neural network connectivity between successive layers, such as may be applied given the activation function at a given node in a layer before the application of the (masked or unmasked) edge weight between that node and a node of a successive layer (in other words, the output sequence wpb is used to configure the set of connections or disconnections {w} but also the set of weight connections {w} themselves are also and alternatively used for configuring the connections and disconnections such as may be used for particular operations within the neural network).)  As noted above, the “encoder” in the claims is being interpreted as a generic placeholder without the recitation of sufficient accompanying structure to perform the function; a review of the specification shows that the following appears to be the corresponding structure described in the specification ([0016, 0059] “The encoder may generate the edge sequence based on a pattern of bits in the random number sequence having a bit value of 0 and bits in the random number sequence having a bit value of 1 a pattern of bits in the dropout information having a bit value indicating a connected edge and bits in the dropout information having a bit value indicating a disconnected edge., Referring to FIG. 1A, the encoding apparatus 100 includes a memory 110 and an encoding circuit 130 or encoder. In addition, the encoding apparatus 100 may be connected to a learning circuit 120 performing learning of a deep neural network. 
The encoding apparatus 100 may receive information output during an operation process of the learning circuit 120 and may also transmit information generated in the encoding apparatus 100 to the learning circuit 120.”)
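
For illustration only, the masking behavior described in the passages of Felix quoted above (a keep bit wpb is set when a 16-bit random value cf is less than src1, and each weight is passed through or set to 0 accordingly) may be sketched as follows. This is a minimal sketch under stated assumptions and is not Felix's implementation: the function names rmask and keep_probability_from_ratio, the use of Python's random module in place of the hardware PRNG, and the list representation of the weight vector are all hypothetical. The second helper only restates the relation 1/(1+1/(connected/disconnected)) noted in the mapping above.

```python
import random


def rmask(weights, src1, rng):
    """Mask each weight to 0 unless a fresh 16-bit random value is below src1.

    Hypothetical helper: models only the comparison quoted from Felix
    (wpb[i] = 1 if cf_i < src1), with src1 allowed in the range 0..65536.
    """
    if not 0 <= src1 <= 65536:
        raise ValueError("src1 must be in the range 0..65536")
    wpb = []     # keep bits: 1 = unmasked (connected), 0 = masked (disconnected)
    masked = []  # weights after masking
    for w in weights:
        cf = rng.randrange(65536)  # stand-in for a 16-bit unsigned random value
        bit = 1 if cf < src1 else 0
        wpb.append(bit)
        masked.append(w if bit else 0)
    return wpb, masked


def keep_probability_from_ratio(ratio):
    """Convert a connected/disconnected ratio r into a keep probability.

    p = r / (1 + r), which equals 1 / (1 + 1 / r) for r > 0.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / (1.0 + 1.0 / ratio)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that a run is repeatable
    p = keep_probability_from_ratio(3.0)  # three connected edges per disconnected edge
    bits, out = rmask([11, 22, 33, 44], int(p * 65536), rng)
    print(p, bits, out)
```

Under these assumptions, a src1 of 65536 keeps every weight and a src1 of 0 masks every weight, which matches the quoted statement that the probability of being unmasked is src1 / 65536.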

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2-8 and 12-18 are rejected under 35 U.S.C. 103 as being unpatentable over Felix, in view of Yeoh et al. (“A Hardware-Oriented Dropout Algorithm for Efficient FPGA Implementation”, ICONIP 2017, Part VI, LNCS 10639, 2017, pp. 821-829), hereinafter referred to as Yeoh.

In regards to claim 2, the rejection of claim 1 is incorporated and Felix does not further teach wherein the random number sequence is based on a clock signal of the random number generator.  Although Felix indicates that the random number generator can be implemented in software or hardware, including a “transition effect ring oscillator” [0074], Felix does not disclose an implementation that explicitly uses a clock signal.
However, Yeoh, in the analogous environment of implementing random dropout connections for deep neural networks, teaches wherein the random number sequence is based on a clock signal of the random number generator.  ([Abstract, p. 827, Section 4.4, Table 2] To generate a dropout mask to randomly drop neurons during training phase, random number generators (RNGs) are usually used in software implementations. However, RNGs consume considerable FPGA resources in hardware implementations. The proposed method is able to minimize the resources required for FPGA implementation of dropout by performing a simple rotation operation to a predefined dropout mask. We apply the proposed method to MLPs and CNNs., In Table 2, we observe that with the proposed method, the number of required registers and look-up tables (LUTs) were much less in comparison to ordinary RNG-based methods. For the serial implementation, it took less resources; however, the number of clock cycles was increased, which caused the FPGA advantage to be lost. In contrast, for the parallel implementation of the conventional dropout method, only a single clock cycle was required to generate the mask; however, resource consumption increased significantly. The proposed method was able to achieve parallel processing by generating a dropout mask, in a single clock cycle, with a small number of resources required through the simple operation., wherein a random number sequence used for masking/unmasking an edge connection of a deep neural network is generated based on the clock signal of a hardware-based (FPGA) random number generator as shown in Table 2 (where, it is noted table 2 also characterizes both the efficient generation of the random number sequence which, is being interpreted to be a random number generation process even if based on a single original mask, as well as more conventional methods that require more cycles in the random number sequence generation).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the random number sequence to be based on a clock signal of the random number generator. The modification would be obvious because one of ordinary skill would be motivated to efficiently generate random dropout connection sequences by minimizing the hardware resources required by that implementation, such as by minimizing the number of clock cycles/signals required to generate that sequence (Yeoh, [Abstract, p. 828, Section 5, Table 2]).
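
For illustration only, the rotation of a predefined dropout mask that the quoted Yeoh abstract describes may be sketched as follows. This is a minimal sketch under stated assumptions: the function name rotate_mask, the left rotation direction, the shift amount, and the example mask are hypothetical, and Yeoh's actual hardware rotation scheme is not reproduced here.

```python
def rotate_mask(mask, shift=1):
    """Return the mask rotated left by `shift` positions, wrapping around."""
    n = len(mask)
    if n == 0:
        return []
    k = shift % n
    return mask[k:] + mask[:k]


if __name__ == "__main__":
    base = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical predefined mask: 1 = keep, 0 = drop
    for step in range(3):
        print(rotate_mask(base, step))
```

A rotation only reorders the bits, so the number of 1's and 0's in the mask, and therefore the proportion of kept to dropped positions, is unchanged from one rotation to the next.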

In regards to claim 3, the rejection of claim 2 is incorporated and Felix further teaches wherein a size of the random number sequence is based on a number of the plurality of edges in the layer of the deep neural network.  ([0033, 0035, 0048, Figure 2, Figure 4] Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector., The execution unit 2 in FIG . 2 has an input buffer 20 which can hold four 16 - bit values or two 32 - bit values . It is shown holding four 16 - bit values , each representing a weight wo . . . Wz . The unit has a pseudo random number generator 22 implemented in hardware for providing random numbers to an rmask module 24 ., Subsequently the four bits wpb [ 0 . . 3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively , thus….,wherein the number of random numbers in the random number sequence (cf’s or wpb’s) corresponds to (is equal to) the number of weights in the sequence of weight values corresponding to the number of edges between successive layers in a neural network (e.g., that number is 4 in Figures 2 and 4).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

In regards to claim 4, the rejection of claim 2 is incorporated and Felix further teaches wherein the dropout information, the random number sequence, and the edge sequence have a bit width formed of binary numbers.  ([0035, 0046, 0048-0052, Figure 2, Figure 4]
It is shown holding four 16 - bit values , each representing a weight wo . . . W3..,
The above syntax means that bit wpb [ 0 ] is assigned the value ‘ l ’ if cf0 < scr1 . Here , cfo is a 16 - bit unsigned random value , which can have values in the range { 0 . . . 65535 } ; src1 is a 32 - bit wide operand but only the least significant 17 - bits are used for rmask ; srcl is allowed values in the range { 0 to 65536 } , such that the probability of being ‘ unmasked ' is src1 / 65536., Subsequently the four bits wpb [ 0 . . 3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively ,… thus assign aDst [ 15 : 0 ] = ( wpb [ 0 ] = 1 ) ? aSrc0 [ 15 : 0 ] : 16 ' b0 ;…, wherein the sequence of random numbers used to mask/unmask the weight values/edge connections are binary numbers (wpb’s) that correspond to the edge sequence as previously noted in which those sequences are determined by comparing each element in the comparison field set cf0 … cf3  (each element of which has bit width 16) and the field src1 which has bit width 16 (and, as previously noted, also corresponds to dropout information) and corresponds to the dropout probability information and wherein the edge sequences themselves have a total bit width corresponding to the number of weights (i.e., each element in the wpb’s is a single binary number with the full bit width of the sequence equal to the number of weights and, also and alternatively, with the weights themselves that form an edge sequence with bit width of 16 for each element.)   
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

In regards to claim 5, the rejection of claim 4 is incorporated and Felix further teaches wherein the encoder is further configured to generate the edge sequence based on a first ratio of bits in the random number sequence having a bit value of 0 to bits in the random number sequence having a bit value of 1, and a second ratio of bits in the dropout information having a bit value indicating a connected edge to bits in the dropout information having a bit value indicating a disconnected edge.   ([0033, 0046, 0048-0052, Figure 2, Figure 4] Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector ., The above syntax means that bit wpb [ 0 ] is assigned the value ‘ l ’ if cf0 < scr1 . Here , cfo is a 16 - bit unsigned random value , which can have values in the range { 0 . . . 65535 } ; src1 is a 32 - bit wide operand but only the least significant 17 - bits are used for rmask ; srcl is allowed values in the range { 0 to 65536 } , such that the probability of being ‘ unmasked ' is src1 / 65536 ., Subsequently the four bits wpb [ 0 . . 3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively ,… thus assign aDst [ 15 : 0 ] = ( wpb [ 0 ] = 1 ) ? aSrc0 [ 15 : 0 ] : 16 ' b0 ;…, wherein the edge sequence (sequence of masked/unmasked weights) is based on the proportion of 1’s (connections) and 0’s (disconnections) in the sequence of masked/unmasked instructions (wpb’s) (i.e., the unmasked/masked weights are directly formed from the wpb random sequence and are based on the proportion of 0’s and 1’s by virtue of the dependence of the number of 0’s and 1’s on the dropout probability and the results of the comparison of the bits of individual elements in the cf field with the dropout probability/src field such that this edge sequence is thereby based on the ratio of connected edges to disconnected edges represented by that comparison) and wherein the edge sequence is also based on the ratio of bits in the src1 value to the bits corresponding to the full dynamic range (i.e., the src1 values correspond to a dropout probability as a second ratio) such that this probability/ratio is indicative of the proportion of connections relative to disconnections (in other words, the first ratio is a deterministic ratio since it occurs after the generation of the random number sequence whereas the second ratio is a probabilistic ratio since it corresponds to a general probabilistic association between the connectivity and the dropout probability).)   
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

In regards to claim 6, the rejection of claim 4 is incorporated and Felix further teaches wherein the encoder is further configured to generate the edge sequence based on a pattern of bits in the random number sequence having a bit value of 0 and bits in the random number sequence having a bit value of 1 a pattern of bits in the dropout information having a bit value indicating a connected edge and bits in the dropout information having a bit value indicating a disconnected edge.  ([0033, 0048-0052, Figure 2, Figure 4] Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector ., Subsequently the four bits wpb [ 0 . . 3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively ,… thus assign aDst [ 15 : 0 ] = ( wpb [ 0 ] = 1 ) ? aSrc0 [ 15 : 0 ] : 16 ' b0 ;…, wherein the pattern of 1’s and 0’s in the sequence of unmasked (wpb = 1) and masked (wpb = 0) bits in the random number bit sequence is used to generate the edge sequence of weights (with each element in that sequence corresponding to a specific edge) such that this pattern of bits is indicative of a pattern of bits in the dropout information because they correspond either to dropout connections (0 – zero weight) or to unmasked connections (1 – non-zero weight) (as revealed according to the comparison operation between the cf’s and the src).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

In regards to claim 7, the rejection of claim 3 is incorporated and Felix further teaches wherein the size of the random number sequence is equal to a quantity of edges in the layer of the deep neural network.  ([0033, 0035, 0048, Figure 2, Figure 4] Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector., The execution unit 2 in FIG . 2 has an input buffer 20 which can hold four 16 - bit values or two 32 - bit values . It is shown holding four 16 - bit values , each representing a weight wo . . . Wz . The unit has a pseudo random number generator 22 implemented in hardware for providing random numbers to an rmask module 24 ., Subsequently the four bits wpb [ 0 . . 3 ] are used respectively to unmask the 4 , 16 - bit values in src0 [ 63 : 01 respectively , thus …, wherein the number of random numbers in random number (or edge) sequence corresponds to (is equal to) the number of weights in the sequence of weight values corresponding to the number of edges between successive layers in a neural network (e.g., that number is 4 in Figures 2 and 4).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

In regards to claim 8, the rejection of claim 2 is incorporated and Felix further teaches wherein the edge sequence is a basis for a dropout operation in the layer of the deep neural network.  ([0033, Figure 2, Figure 4] Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector ., wherein the edge sequence (wpb’s or, also and alternatively, the sequence of masked/unmasked weights) are used to implement the dropout operation in the (deep) neural network such as the dot product applied to the activation function in which the weights include dropout connectivity.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for the same reasons as pointed out for claim 2.

Claim 11 is also rejected because it is just a method implementation of the same subject matter of claim 1 which can be found in Felix. 

Claim 12/11 is also rejected because it is just a method implementation of the same subject matter of claim 2/1 which can be found in Felix and Yeoh. 
Claim 13/12 is also rejected because it is just a method implementation of the same subject matter of claim 3/2 which can be found in Felix and Yeoh. 

Claim 14/12 is also rejected because it is just a method implementation of the same subject matter of claim 4/2 which can be found in Felix and Yeoh. 

Claim 15/14 is also rejected because it is just a method implementation of the same subject matter of claim 5/4 which can be found in Felix and Yeoh. 

Claim 16/14 is also rejected because it is just a method implementation of the same subject matter of claim 6/4 which can be found in Felix and Yeoh. 

Claim 17/13 is also rejected because it is just a method implementation of the same subject matter of claim 7/3 which can be found in Felix and Yeoh. 

Claim 18/12 is also rejected because it is just a method implementation of the same subject matter of claim 8/2 which can be found in Felix and Yeoh. 

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Felix, in view of Yeoh, and in further view of Ji et al. (US2018/0232640, filed 14 April 2017), hereinafter referred to as Ji.

In regards to claim 9, the rejection of claim 2 is incorporated and Felix further teaches wherein the encoder is further configured to obtain weights of the plurality of edges in the layer of the deep neural network, perform a pruning operation …, and generate the edge sequence to indicate connection or disconnection of the plurality of edges of the layer of the deep neural network based on the pruning operation.  ([0033, Figure 2, Figure 4]Applying the rmask instruction to the vector of weights prior to calculating this dot product can be used to implement the Drop - connect function . Application of the rmask instruction to the vector of weights randomly zeroes individual weights within the vector .,wherein the application of the set of unmasking/masking instructions (an encoder operation) to a corresponding set of weights (available to/obtained by the computer component that performs the encoding) in the (deep) neural network generates an edge sequence (set of weights) that are masked or unmasked to generate the connectivity in that (deep) neural network such that the disconnection of edges through the masking instruction is a pruning operation.) 
However, Felix and Yeoh do not explicitly teach perform a pruning operation based on a result of comparing the weights with a preset threshold weight. Neither Felix nor Yeoh discloses the configuration of neural network connectivity according to a weight value-based pruning operation.
However,  Ji, in the analogous environment of implementing random dropout connections for deep neural networks, teaches perform a pruning operation based on a result of comparing the weights with a preset threshold weight, and generate the edge sequence to indicate connection or disconnection of the plurality of edges of the layer of the deep neural network based on the pruning operation.  ([0020, 0032] FIG . 1A is a flowchart of a technique for automatically determining a threshold according to some embodiments . In 100 , a threshold for pruning a layer of a neural network is initialized . In some embodiments , the threshold may be initialized to an extreme end of a range of values , such as 0 or 1 . In other embodiments , the threshold may be initialized to a specific value such as 0 . 2 , 0 . 5 , or the like ., Using the threshold T7 , each weight W ; of the layer 1 is pruned . Equation 1 is an example of how the weights may be pruned . <equation 1>. wherein (deep) neural network connectivity is pruned at given edge according to whether the magnitude of the corresponding weight is less than a threshold (equation 1).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix and Yeoh to incorporate the teachings of Ji for the encoding apparatus to perform a pruning operation based on a result of comparing the weights with a preset threshold weight and generate the edge sequence to indicate connection or disconnection of the plurality of edges of the layer of the deep neural network based on the pruning operation. The modification would be obvious because one of ordinary skill would be motivated to improve deep neural network performance/accuracy by pruning edge connections with lower weight magnitudes, with the pruning methodology implemented efficiently according to a pruning error allowance (Ji, [0061, 0062, Figure 6, Figure 7A, Figure 7B, Figure 7C]).
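For illustration only, the threshold-based pruning described above can be sketched as follows. This is a minimal sketch and is not code taken from Felix, Yeoh, or Ji; it assumes that a weight is pruned when its magnitude is less than the threshold, and the function name `prune_to_edge_sequence` and the convention that 1 indicates connection and 0 indicates disconnection are hypothetical.

```python
from typing import List, Tuple


def prune_to_edge_sequence(
    weights: List[float], threshold: float
) -> Tuple[List[float], List[int]]:
    """Prune weights by magnitude and report the resulting edge bits.

    Assumption: an edge is disconnected (bit 0) when |w| < threshold,
    and stays connected (bit 1) otherwise.
    """
    pruned: List[float] = []
    edges: List[int] = []
    for w in weights:
        keep = abs(w) >= threshold
        pruned.append(w if keep else 0.0)  # pruned weights are zeroed
        edges.append(1 if keep else 0)     # one bit per edge of the layer
    return pruned, edges
```

Under these assumptions, a layer with weights 0.5, -0.1, 0.05, and -0.9 and a threshold of 0.2 would keep only the first and last edges.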

Claim 19/12 is also rejected because it is just a method implementation of the same subject matter of claim 9/2 which can be found in Felix, Yeoh, and Ji. 

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Felix, in view of Yeoh, and in further view of Mellempudi et al. (US2018/0322607, filed 29 January 2018), hereinafter referred to as Mellempudi.

In regards to claim 10, the rejection of claim 1 is incorporated and Felix does not teach further comprising a selector configured to select one of a plurality of types of input signals and output the selected signal, wherein the encoder is further configured to receive an operation result from the deep neural network to determine whether overflow has occurred in the operation result and perform a dynamic fixed point operation of modifying an expressible range of information used in the deep neural network based on whether overflow has occurred. In other words, Felix does not disclose the determination of overflow or a corresponding modification in the range of information and therefore does not disclose a selector in the context of overflow determination and dynamic range modification and does not disclose a selector which outputs different random sequences/edge sequences according to a given edge sequence. As noted above, the “selector” in the claims is being interpreted as a generic placeholder without the recitation of sufficient accompanying structure to perform the function; a review of the specification shows that the following appears to be the corresponding structure described in the specification ([0166, 0168, 0175] “Referring to FIG. 14, the selector 1440 may obtain certain information from each of the register 1442 and the random number generation circuit 1444 and selectively output the two pieces of information.
According to an embodiment of the disclosure, the selector 1440 may include a multiplexer that obtains a certain selection signal of n bits (n > 0) … That is, the selector 1440 may selectively output input information of 2^n types that may be determined based on an n-bit selection signal., In detail, in order for the learning circuit 1430 to perform the above-described dropout operation, the selector 1440 may operate such that a signal generated in the random number generation circuit 1444 is output., According to an embodiment of the disclosure, the encoding apparatus 1500 connected to the selector 1540 may selectively use at least one of an operation of determining a second edge sequence based on a result of comparing a first edge weight stored in the register 1542 with a preset threshold weight, an operation of determining a second edge sequence based on a result of comparing a first edge sequence stored in the memory 1510 with a random number sequence obtained by using the random number generation circuit 1546, or a learning process performed using the counter 1544 calculating a number of times of overflow of an intermediate calculation result stored in the register 1542.”)
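For illustration only, the multiplexer behavior quoted above (an n-bit selection signal choosing among 2^n types of input information) can be sketched in software as follows. This is a minimal sketch under stated assumptions and does not represent the circuit of the specification; the function name `select` and the example input labels are hypothetical.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def select(inputs: Sequence[T], sel: int, n_bits: int) -> T:
    """Return the input chosen by an n-bit selection signal.

    Assumption: exactly 2**n_bits inputs are wired to the selector,
    and `sel` is the unsigned value of the selection signal.
    """
    if n_bits <= 0:
        raise ValueError("n_bits must be greater than 0")
    if len(inputs) != 1 << n_bits:
        raise ValueError("expected 2**n_bits inputs")
    if not 0 <= sel < (1 << n_bits):
        raise ValueError("selection signal out of range")
    return inputs[sel]
```

With a 1-bit selection signal and two inputs, for example a register value and a random number generator output, a selection value of 1 would pass the second input through.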
However, Yeoh, in the analogous environment of implementing random dropout connections for deep neural networks, teaches further comprising a selector configured to select one of a plurality of types of input signals and output the selected signal.  ([Abstract, p. 823, Section 3] To generate a dropout mask to randomly drop neurons during training phase, random number generators (RNGs) are usually used in software implementations. However, RNGs consume considerable FPGA resources in hardware implementations. The proposed method is able to minimize the resources required for FPGA implementation of dropout by performing a simple rotation operation to a predefined dropout mask. We apply the proposed method to MLPs and CNNs., Thus, we propose a method for eliminating both RNGs and comparators, by performing a simple rotation operation to generate a predefined mask as follows: <equation 4> The rotation bit r is introduced as a parameter and it controls the bit to be rotated. To further increase randomness, a split or XOR operation can be introduced by splitting the mask into several portions before performing rotation or XOR on the previous bits., wherein a selection of a second random number sequence is made based on a first random number sequence (a pre-defined mask) such that the input to that selection process includes the pre-defined mask sequence and the modified sequence (different types of input signal with one being modified according to permutation or XOR operations and the other not) but also, alternatively, such that the input to the selection process includes the pre-defined mask sequence, the rotation bit, and the modified mask sequence (in which the different types of the input signals are a mask sequence vs. a rotation bit) and wherein, in any case, the selected signal for output is the modified mask sequence.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix to incorporate the teachings of Yeoh for a selector to select and output a signal based upon a plurality of types of input signals. The modification would be obvious because one of ordinary skill would be motivated to efficiently generate random dropout connection sequences by minimizing the hardware resources required by that implementation by selecting a second random sequence from a first random sequence with the second sequence formed as a permutation/XOR operation applied to the first sequence (Yeoh, [Abstract, p. 823, Section 3, p. 828, Section 5, Table 2]).
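For illustration only, generating a second dropout mask from a predefined mask by rotation, without a random number generator, can be sketched as follows. This is a minimal sketch and does not reproduce equation 4 of Yeoh, which is not set out in this action; it assumes a circular left rotation of a fixed-width bit mask, and the function name `rotate_mask` and its parameters are hypothetical.

```python
def rotate_mask(mask: int, width: int, r: int) -> int:
    """Circularly rotate a width-bit dropout mask left by r bits.

    Assumption: each bit of the mask marks one neuron or edge as kept (1)
    or dropped (0); rotation changes which ones are dropped while the
    number of dropped elements stays the same.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    full = (1 << width) - 1
    mask &= full   # discard any bits outside the mask width
    r %= width     # rotating by the full width is the identity
    return ((mask << r) | (mask >> (width - r))) & full
```

Because a rotation only reorders bits, the dropout rate of the predefined mask is preserved in every mask derived from it under this sketch.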
However, Yeoh and Felix do not explicitly teach wherein the encoder is further configured to receive an operation result from the deep neural network to determine whether overflow has occurred in the operation result and perform a dynamic fixed point operation of modifying an expressible range of information used in the deep neural network based on whether overflow has occurred. In other words, neither Yeoh nor Felix discloses the determination of overflow and a dynamic adaptation of dynamic range according to this determination.
However, Mellempudi, in the analogous environment of efficiently implementing deep neural networks, teaches further comprising a selector configured to select one of a plurality of types of input signals and output the selected signal, wherein the encoder is further configured to receive an operation result from the deep neural network to determine whether overflow has occurred in the operation result and perform a dynamic fixed point operation of modifying an expressible range of information used in the deep neural network based on whether overflow has occurred. ([0218, 0222, 0225, 0230, Figure 20] Overflow and/or saturation of accumulator during integer arithmetic operations introduces significant computational errors while performing longer accumulation chains (such as GEMM or Convolution). One embodiment enables techniques to dynamically adjust the effective precision of the input using dynamic fixed-point representation to prevent such errors with a minimum impact on accuracy., In one embodiment, the dynamic precision manager 1819 can be configured to dynamically adjust the precision of compute operations, for example, to prevent overflows from occurring during a chain of accumulations. … Intermediate data is shifted right by Rshift before passing the intermediate data through next iteration of computation. The Rshift value is computed based on heuristics and can be incremented or decremented to keep as many precision bits as possible while keeping overflow in check., In such embodiment, the number of leading zeros or leading ones can be used as a heuristic to determine whether to shift intermediate output. In one embodiment, the leading bit detector 1931 can be configured to detect only leading zeros when an absolute value is to be examined., After the compute operation is performed at block 2004, the logic 2000 can then check the leading zero count of the absolute max of the output tensor at block 2006.
If the leading zero count is above the overflow threshold at block 2007, then the logic 2000 can adjust the precision of the output tensor based on the leading zero count. Adjusting the precision of the output tensor includes adjusting the rshift count the shared exponent for the output …. Otherwise the logic 2000 can decrease the rshift count for the output and decrement the shared exponent at block 2008. Adjusting the rshift count can include adjusting an rshift counter configured to track a degree of right-shift applied to the output tensor., wherein the count of the number of leading ones is used as an indication of overflow (and used to track the dynamic range of blocks computed as intermediate outputs/operational results within a deep neural network) such that, in response to this number exceeding a threshold (Figure 20), the rshift value is decremented in order to mitigate the overflow and optimize the expressed dynamic range (the expressible range of information used in the neural network) such that the application of rshift to the fixed point representation (the shared exponent for the block of operational results) is a dynamic fixed point operation that modifies that dynamic range and wherein this process involves the selection of different types of output signals (i.e., a selection operation) depending on the evaluation of the count of the number of leading zeros (i.e., an increment in rshift, a decrement in rshift, or no change), which is a selection of an output signal (a particular rshift) given a set of input signals (a current rshift vs. a modified rshift, but also any rshift vs. count information) corresponding to a bit shift application that is used to encode the representation of the information at each block in the neural network (including the output at any particular node in any particular layer) such that, in other words, Mellempudi, like Yeoh, also teaches a selector and a selection process based on different types of input signals but does so in the context of overflow and dynamic fixed point operations.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Felix and Yeoh to incorporate the teachings of Mellempudi for a selector to select and output a signal based upon a plurality of types of input signals and for the encoding apparatus to receive an operation result from the deep neural network to determine whether overflow has occurred in the operation result and perform a dynamic fixed point operation of modifying an expressible range of information used in the deep neural network based on whether overflow has occurred. The modification would be obvious because one of ordinary skill would be motivated to achieve state of the art deep neural network performance while achieving efficient implementation on graphics processors through dynamically adjusting the fixed point dynamic range of intermediate neural network outputs through adaptive selection of the bit shift in the representation of those outputs to minimize overflow and optimize the dynamic range (Mellempudi, [0218, 0234, 0242, Table 6]).
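For illustration only, a dynamic fixed point adjustment driven by a leading zero count can be sketched as follows. This is a simplified sketch and is not the logic 2000 of Mellempudi; it assumes that few leading zeros in the largest-magnitude output indicate a risk of overflow (so the right shift is increased) and that many leading zeros indicate unused range (so the right shift is decreased). The function names and the `min_headroom` and `max_headroom` parameters are hypothetical.

```python
from typing import Iterable


def leading_zeros(value: int, width: int) -> int:
    """Count leading zero bits of |value| in a width-bit field.

    A negative result means |value| does not fit in the field at all.
    """
    return width - abs(value).bit_length()


def adjust_rshift(
    outputs: Iterable[int],
    width: int,
    rshift: int,
    min_headroom: int = 2,
    max_headroom: int = 4,
) -> int:
    """Select the next right-shift amount from the current outputs.

    Assumption: `outputs` is non-empty and holds integer results
    of one block of computation.
    """
    peak = max(abs(v) for v in outputs)
    lz = leading_zeros(peak, width)
    if lz < min_headroom:
        return rshift + 1   # too close to overflow: shift right further
    if lz > max_headroom and rshift > 0:
        return rshift - 1   # range is underused: recover precision bits
    return rshift           # headroom is acceptable: no change
```

The three return paths correspond to an increment, a decrement, or no change of the shift amount, chosen from the same count information.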

Claim 20/11 is also rejected because it is just a method implementation of the same subject matter of claim 10/1 which can be found in Felix, Yeoh, and Mellempudi. 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Gibson et al. (US2017/0323197, 9 November 2017) teach the dynamic shifting of exponent values for weights in a fixed point dynamic operation in response to a detection of overflow in intermediate operations within a deep neural network.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983. The examiner can normally be reached M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/ROBERT LEWIS KULP/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124