DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending and have been examined.

Priority
The examiner has noted applicant’s claim for foreign priority based on United Kingdom of Great Britain and Northern Ireland application number GB1715215.8 filed on 09/20/2017. The examiner acknowledges that a certified copy of United Kingdom of Great Britain and Northern Ireland application number GB1715215.8 has been retrieved, as required by 37 CFR 1.55.

Information Disclosure Statement
Acknowledgment is made of the information disclosure statements filed 9/20/2018, 5/24/2019 and 2/17/2021, which comply with 37 CFR 1.97. As such, the information disclosure statements have been placed in the application file and the information referred to therein has been considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(3) because Figures 1 and 4 include letters which do not measure at least .32 cm. (1/8 inch) in height.
The drawings are also objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: 
Reference character 618 shown in Figure 6 is not found in the detailed description.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
Reference character 618 shown in Figure 6 is not described in applicant’s specification (see, e.g., paragraphs 96-110 describing FIG. 6). Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f), because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:
an input module configured to receive a set of input data … ;
a decoder configured to receive information indicating a desired output data format … ;
a processing module configured to process the set of input data … ; and
an output module configured to convert the processed data into the desired output data format in claims 2 and 19. 

Because these claim limitations are being interpreted under 35 U.S.C. 112(f), they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.

Regarding claims 2 and 19 and the above-noted three-prong test, the recited input module is a generic placeholder, configured to receive a set of input data is functional language, and there is no recitation in claims 2 or 19 of sufficient structure to perform the receiving. Also in claims 2 and 19, the recited decoder is a generic placeholder, configured to receive information indicating a desired output data format is functional language, and there is no recitation in claims 2 or 19 of sufficient structure to perform the receiving. Additionally in claims 2 and 19, the recited processing module is a generic placeholder, configured to process the set of input data is functional language, and there is no recitation in claims 2 or 19 of sufficient structure to perform the processing. Further in claims 2 and 19, the recited output module is a generic placeholder, configured to convert the processed data into the desired output data format is functional language, and there is no recitation in claims 2 or 19 of sufficient structure to perform the converting. 

Regarding the above-noted input module, decoder, processing module, and output module claim limitations in claims 2 and 19, with reference to the block diagram of FIG. 2, paragraphs 37-38, 41, 44 and 57 of Applicant’s specification state “hardware implementation 200 comprises an input module 202, a processing module 204, a command decoder 206, and an output module 208”, “The input module 202 comprises digital logic circuitry configured to receive a set of input data for a hardware pass and provide the received set of input data to the processing module 204 for processing”, “The processing module 204 comprises digital logic circuitry configured to process the received set of input data in accordance with one or more layers associated with the hardware pass to generate processed data”, “The command decoder 206 comprises digital logic circuitry configured to receive information indicating the desired format of the output data for the current hardware pass” and “The output module 208 is configured to receive the processed data from the processing module 204 and convert the processed data to the desired output data format to produce output data.”
Since claims 2 and 19 are interpreted under 35 U.S.C. 112(f), and paragraph 37 of applicant’s specification describes that the recited modules and decoder are components of the hardware implementation 200 of FIG. 2 and paragraphs 38, 41, 44 and 57 of the specification disclose that the modules and the decoder “comprises digital logic circuitry”, the above-noted input module, decoder, processing module, and output 
If applicant wishes to provide further explanation or dispute the examiner's interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action.

If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f), applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
In independent claims 1, 2 and 19, the recitations of “one or more hardware passes” and “a hardware pass of the hardware implementation” are unclear. In particular, it is unclear what the “hardware passes” and “hardware pass” recited in each 
Claim 20 which depends from claim 1, is rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claim 1.
Also, claims 3-18, which each depend directly or indirectly from claim 2, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claim 2.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly 
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Falcon et al. (U.S. Patent Application Pub. No. 2016/0026912 A1, cited in applicant’s IDS submitted on 5/24/2019, hereinafter “Falcon”) in view of Yang et al. (U.S. Patent Application Pub. No. 2017/0061279 A1, cited in applicant’s IDS submitted on 2/17/2021, hereinafter “Yang”).
With respect to claim 1, Falcon discloses the invention as claimed including a method in a hardware implementation of a Deep Neural Network "DNN" configured to implement the DNN by processing data using one or more hardware passes (paragraphs 31 and 81 of applicant’s specification state “A hardware implementation for a DNN may be configured to compute the output of a DNN through a series of hardware passes (which also may be referred to as processing passes) wherein during each pass the hardware implementation receives at least a portion of the input data for a layer of the DNN and processes the received input data in accordance with that layer (and optionally in accordance with one or more following layers) to produce processed data” and “The layer or set of layers that will be processed in each hardware pass is/are typically based on the order of the layers in the DNN”. Therefore, as indicated above, “processing data using one or more hardware passes”, under the broadest reasonable interpretation (BRI), is using any hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 1A – depicting a hardware implementation of “System 100” and paragraphs 23, 27, 35 and 82, “weight-, the method comprising:
receiving a set of input data for a hardware pass of the hardware implementation, the set of input data representing at least a portion of input data for a particular layer of the DNN (as indicated above, “a hardware pass of the hardware implementation”, under the BRI, is using hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 14 flowchart showing “Receive input values” in step 1425 and paragraphs 89, 91, 93 and 119, “calculation circuit 1200 may accept inputs from, for example, input data 1202 and weights 1204”, “Weights 1204 or input data 1202 may be low precision”, “Input data 1202 may be read from various input layers”, “At 1425, input values and weight values may be received. … The input values and weight values may be of a fixed size and of a lower precision than which the weight ;
receiving information indicating a … output data format for the hardware pass (see, e.g., FIG. 14 flowchart showing step 1425 to “Receive … weight values, and scale values” and paragraphs 103, 106, 119 and 123, “right shifter and truncate logic 1232 may scale down the results so that they are normalized for use in a range expected by other elements”, “an augmented, scaled-up result … may be truncated when such a result is passed out” [i.e., expected/desired scaled and truncated results/output data format], “scale values indicating the degree to which the weights were scaled may be received”, “partial results may be stored for future computation on the same layer … if such results are to be performed on a different calculation circuit then the results may be partially truncated. Furthermore, the results may be scaled down” [i.e., receiving scale information indicating a scaling and truncation to be performed on the results to produce a desired output data format]);
processing the set of input data according to one or more layers of the DNN associated with the hardware pass to produce processed data, the one or more layers comprising the particular layer of the DNN (see, e.g., FIG. 14 flowchart showing steps 1440 and 1445 to “Use scaled weights to determine convolution or dot-product calculations on input” data for a layer and then determine if data processing for that particular “Layer [is] finished?” and paragraphs 85 and 121-122, “calculation accelerator 1004, to perform calculation for different layers of CNN system 900”, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.”, “At 1445, in one embodiment it may be ; and
converting the processed data into the … output data format for the hardware pass to produce output data for the hardware pass (see, e.g., FIG. 14 flowchart showing steps 1450, 1455, 1460 and 1465 to scale, truncate and output calculated values/results and paragraphs 123-124, “the results may be partially truncated. Furthermore, the results may be scaled down by, for example, shifting their values right by the scaling factor. The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit.”, “the results may be scaled down … in another embodiment the results may be truncated … the upper integer bits and lower fractional bits may be truncated according to an expected output format. At 1465, the result may be output as the determined calculated value associated with the layer.” [i.e., scaling/truncating results/processed data into an expected/desired output data format to calculate/produce output data for the hardware pass for the layer and to the calculation circuit]).
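For illustration only, the scale-down and truncation behavior quoted from Falcon above (shifting results right by a scaling factor, then truncating upper integer bits to an expected output format) can be sketched as follows. The function name, its parameters, and the choice of wrap-around rather than saturating truncation are assumptions made for this sketch and are not taken from Falcon or from applicant's specification.

```python
def scale_and_truncate(acc: int, shift: int, out_bits: int) -> int:
    """Hypothetical helper: scale a wide accumulator down, then truncate it.

    acc      -- wide signed accumulator value (e.g. a convolution partial sum)
    shift    -- number of low (fractional) bits to drop via right shift
    out_bits -- width in bits of the expected output format
    """
    # An arithmetic right shift discards the lowest `shift` bits.
    scaled = acc >> shift
    # Masking keeps only the low `out_bits` bits, discarding upper integer bits.
    low = scaled & ((1 << out_bits) - 1)
    # Reinterpret the kept bits as a two's-complement signed value.
    if low >= 1 << (out_bits - 1):
        low -= 1 << out_bits
    return low


if __name__ == "__main__":
    print(scale_and_truncate(0x1234, 4, 8))   # 0x1234 >> 4 = 0x123, low byte 0x23
    print(scale_and_truncate(-256, 4, 8))     # negative values survive the shift
```

In this sketch a value that does not fit in `out_bits` wraps around; a hardware design could equally clamp (saturate) instead, and nothing in the quoted passages settles that choice.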
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing receiving information indicating a desired output data format … and converting the processed data into the desired output data format.
In the same field, analogous art Yang teaches receiving information indicating a desired output data format (see, e.g., paragraphs 49, 54, 66 and 76, “a register identifier or other memory/storage location of where the result is to be stored is included ; and converting the processed data into the desired output data format (see, e.g., FIGs. 6 and 7 flowcharts showing steps 606 and 706, respectively to “Format a final intermediate result of the instruction to a desired fixed point representation format” and paragraphs 66 and 76, “At 606, a final intermediate result of the instruction is formatted to a desired fixed point representation format. The final intermediate result may be an intermediate result matrix that includes the elements to be formatted to produce the final result matrix. For example, the desired fixed point representation format for the result matrix is specified in the received matrix multiplication instruction.” [i.e., converting the processed data/intermediate result by formatting it to a desired representation/output data format]).
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation 

With respect to independent claim 2, Falcon discloses the invention as claimed including a hardware implementation of a Deep Neural Network "DNN" configured to implement the DNN by processing data using one or more hardware passes (As indicated above, “processing data using one or more hardware passes”, under the BRI, is using any hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 1A – depicting a hardware implementation of “System 100” and paragraphs 23, 27, 35 and 82, “weight-shifting mechanism for reconfigurable processing units within or in association with a processor … computer system, or other processing apparatus … such a weight-shifting mechanism may be used in convolution neural networks (CNN)”, “a circuit level model with logic and/or transistor gates may be produced at some stages of the design process … a level of data representing the physical placement of various devices in the hardware model … to produce the integrated circuit”, “System 100 may include a component, such as a processor 102 to , the hardware implementation comprising:
an input module configured to receive a set of input data for a hardware pass of the hardware implementation, the set of input data representing at least a portion of input data for a particular layer of the DNN (as indicated above, the “input module” has been interpreted as being hardware, and “a hardware pass of the hardware implementation”, under the BRI, is using hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 14 flowchart showing “Receive input values” in step 1425 and paragraphs 89, 91, 93 and 119, “calculation circuit 1200 may accept inputs from, for example, input data 1202 and weights 1204”, “Weights 1204 or input data 1202 may be low precision”, “Input data 1202 may be read from various input layers”, “At 1425, input values and weight values may be received. … The input values and weight values may be of a fixed size and of a lower precision than which the weight values were originally determined.” [i.e., hardware/circuit 1200 to accept/receive a set of input data representing a portion of input data for a layer of the CNN/DNN]);
a decoder configured to receive information indicating a … output data format for the hardware pass (as indicated above, the “decoder” has been interpreted as being hardware) (see, e.g., FIGs. 2 and 14 showing “DECODER” 228 and flowchart step 1425 to “Receive … weight values, and scale values” and paragraphs 46, 50-54, 103, 106, 119 and 123, “Processing core 159 comprises an execution unit 142, a set of ;
a processing module configured to process the set of input data according to one or more layers of the DNN associated with the hardware pass to produce processed data, the one or more layers comprising the particular layer of the DNN (as indicated above, the “processing module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing steps 1440 and 1445 to “Use scaled weights to determine convolution or dot-product calculations on input” data for a layer and then determine if data processing for that particular “Layer [is] finished?” and paragraphs 85 and 121-122, “calculation accelerator 1004, to perform calculation for different layers of CNN system 900”, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.”, “At 1445, in one embodiment it may be determined whether computations have finished for the layer” [i.e., hardware/calculation module 1004 performs computations to process the input data and produce processed data for the layer/particular layer of the CNN/DNN]); and
an output module configured to convert the processed data into the … output data format for the hardware pass to produce output data for the hardware pass (as indicated above, the “output module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing steps 1450, 1455, 1460 and 1465 to scale, truncate and output calculated values/results and paragraphs 123-124, “the results may be partially truncated. Furthermore, the results may be scaled down by, for example, shifting their values right by the scaling factor. The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit.”, “the results may be scaled down … in another embodiment the results may be truncated … the upper integer bits and lower fractional bits may be truncated according to an expected output format. At 1465, the result may be output as the determined calculated value associated with the layer.” [i.e., circuit/hardware module to scale/truncate results/processed data into an expected/desired output data format to calculate/produce output data for the hardware pass for the layer and to the calculation circuit]).
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing receive information indicating a desired output data format … and convert the processed data into the desired output data format.
In the same field, analogous art Yang teaches receiving information indicating a desired output data format (see, e.g., paragraphs 49, 54, 66 and 76, “a register identifier or other memory/storage location of where the result is to be stored is included in the instruction. … the instruction identifies a desired decimal point placement of the result of the operation. … This may allow the result of the operation to be in the desired fixed point representation format”, “the desired fixed point representation format for the ; and converting the processed data into the desired output data format (see, e.g., FIGs. 6 and 7 flowcharts showing steps 606 and 706, respectively to “Format a final intermediate result of the instruction to a desired fixed point representation format” and paragraphs 66 and 76, “At 606, a final intermediate result of the instruction is formatted to a desired fixed point representation format. The final intermediate result may be an intermediate result matrix that includes the elements to be formatted to produce the final result matrix. For example, the desired fixed point representation format for the result matrix is specified in the received matrix multiplication instruction.” [i.e., converting the processed data/intermediate result by formatting it to a desired representation/output data format]).
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation schema” and an “instruction identifies a desired decimal point placement of the result of the operation.” (See, e.g., Yang, Abstract and paragraphs 13-14 and 49). Doing so would have allowed Falcon to use Yang’s technique for “Updating an artificial neural 

Regarding claim 3, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses the input module is further configured to receive a second set of input data for a different hardware pass of the hardware implementation (paragraphs 31 and 81 of applicant’s specification state “A hardware implementation for a DNN may be configured to compute the output of a DNN through a series of hardware passes (which also may be referred to as processing passes) wherein during each pass the hardware implementation receives at least a portion of the input data for a layer of the DNN and processes the received input data in accordance with that layer (and optionally in accordance with one or more following layers) to produce processed data” and “The layer or set of layers that will be processed in each hardware pass is/are typically based on the order of the layers in the DNN”. Therefore, “a different hardware pass of the hardware implementation”, under the BRI, is using any hardware for processing input data in one or more different, additional and/or next/subsequent layers of a DNN. Also, as indicated above, “the input module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing “Receive ;
the decoder is further configured to receive information indicating a … output format for the different hardware pass, the … output format for the different hardware pass being different from the … output data format for the hardware pass (as indicated above, “the decoder” has been interpreted as being hardware) (see, e.g., FIG. 14 showing repeatable flowchart step 1425 to “Receive … weight values, and scale values” – repeated per step 1470 for multiple iterations/passes for receiving information for different output formats and paragraphs 46, 50-54, 103, 106, 119 and 123, “Processing core 159 comprises an execution unit 142, a set of register files 145, and a decoder 144” [i.e., a decoder], “right shifter and truncate logic 1232 may scale down the results so that they are normalized for use in a range ;
the processing module is further configured to process the second set of input data according to one or more layers of the DNN to produce second processed data (as indicated above, “the processing module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing repeatable steps 1440 and 1445 to “Use scaled weights to determine convolution or dot-product calculations on input” data for a layer and then determine if data processing for one or more “Layer[s] finished?” and repeated per step 1470 for multiple iterations/passes for processing multiple sets of input data for one or more layers and paragraphs 85, 121-122 and 125, “calculation accelerator 1004, to perform calculation for different layers of CNN system 900”, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.”, “At 1445, in one embodiment it may be determined whether computations have finished for the layer”, “At 1470, it may be ; and 
the output module is further configured to convert the second processed data into the … output data format for the different hardware pass to produce second output data (as indicated above, the “output module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing repeatable steps 1450, 1455, 1460 and 1465 to scale, truncate and output calculated values/results – repeated per step 1470 for multiple iterations/passes to produce second output data and paragraphs 123-125, “the results may be partially truncated. Furthermore, the results may be scaled down by, for example, shifting their values right by the scaling factor. The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit.”, “the results may be scaled down … in another embodiment the results may be truncated … the upper integer bits and lower fractional bits may be truncated according to an expected output format. At 1465, the result may be output as the determined calculated value associated with the layer.”, “At 1470, it may be determined whether to repeat with, for example, additional input values for another layer.” [i.e., circuit/hardware module to scale/truncate/convert second results/processed data into another expected output data format by calculating/producing second output data for the different hardware pass for the next layer]).
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing receive information indicating a desired output format for the different hardware pass, the desired output format for the different hardware pass being different from the desired output data format for the hardware pass … convert the second processed data into the desired output data format.
In the same field, analogous art Yang teaches receive information indicating a desired output format for the different hardware pass, the desired output format for the different hardware pass being different from the desired output data format for the hardware pass (see, e.g., paragraphs 49, 54, 66 and 76, “a register identifier or other memory/storage location of where the result is to be stored is included in the instruction. … the instruction identifies a desired decimal point placement of the result of the operation. … This may allow the result of the operation to be in the desired fixed point representation format”, “the desired fixed point representation format for the result matrix is identified in the received instruction.” [i.e., receiving information/the instruction specifying/indicating a different, desired output data format]) … ; convert the second processed data into the desired output data format (see, e.g., FIG. 7 flowchart showing step 706 to “Format a final intermediate result of the instruction to a desired fixed point representation format” and paragraph 76, “At 706, a final intermediate result of the instruction is formatted to a desired fixed point representation format. The final intermediate result may be an intermediate result matrix that includes the elements to be formatted to produce the final result matrix. For example, the desired fixed point representation format for the result matrix is specified in the received matrix 
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation schema” and an “instruction identifies a desired decimal point placement of the result of the operation.” (See, e.g., Yang, Abstract and paragraphs 13-14 and 49). Doing so would have allowed Falcon to use Yang’s technique for “Updating an artificial neural network” and instruction for the operation to “allow the result of the operation to be in the desired fixed point representation format that is different from the fixed point representation formats of the operands of the operation” where “The desired fixed point representation format of the result may identify the fixed point representation format of one or more elements of the result”, as suggested by Yang (See, e.g., Yang, paragraphs 13 and 49).
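
For illustration only, the combination discussed above (information indicating a desired fixed point output format for a given pass, followed by scaling and truncation of the processed data into that format) may be sketched in Python as follows. The sketch is an explanatory aid and is not taken from Falcon, Yang or the application. The names PassCommand, frac_bits_out, width_out and format_result are hypothetical, and saturation on overflow is an assumption not found in the cited passages.

```python
from dataclasses import dataclass


@dataclass
class PassCommand:
    """Hypothetical per-pass command carrying the desired output format."""
    frac_bits_out: int  # desired number of fractional bits (decimal point placement)
    width_out: int      # desired total signed bit width


def format_result(acc: int, acc_frac_bits: int, cmd: PassCommand) -> int:
    """Scale and truncate an accumulator value into the format named by cmd."""
    shift = acc_frac_bits - cmd.frac_bits_out
    if shift >= 0:
        q = acc >> shift   # scale down: arithmetic right shift drops low fractional bits
    else:
        q = acc << -shift  # scale up when more fractional bits are requested
    lo = -(1 << (cmd.width_out - 1))
    hi = (1 << (cmd.width_out - 1)) - 1
    return max(lo, min(hi, q))  # assumed saturation to the signed output width


# The same processed value (0.375 held with 14 fractional bits) is emitted in
# two different output formats for two different passes.
acc = 6144
first_pass = format_result(acc, 14, PassCommand(frac_bits_out=7, width_out=8))
second_pass = format_result(acc, 14, PassCommand(frac_bits_out=4, width_out=8))
```

In this sketch the two passes differ only in the command they receive, which is the sense in which the output format for the different pass is different from that for the first pass.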

Regarding claim 4, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 3.
Falcon further discloses wherein the second set of input data comprises a portion of input data for the particular layer of the DNN; or wherein the second set of input data comprises at least a portion of input data for another layer of the DNN (see, e.g., FIG. 14 flowchart showing repeatable step 1425 to “Receive input values” – repeated per step 1470 for different layers and paragraphs 93, 102, 119 and 125, “Input data 1202 may be read from various input layers” [i.e., including input data 1202 for the particular layer], “accept the result of convolution … The result is added to temp data 1206 received from another layer determination and to a previous iteration”, “At 1425, input values and weight values may be received.”, “At 1470, it may be determined whether to repeat with, for example, additional input values for another layer. If so, method 1400 may return to 1425” [i.e., second set of input data can include a portion of data from the particular layer of the various input layers or another/different/next layer of the CNN/DNN, the output of one processing layer can be input of a next/different processing layer, and the intermediate output data may be stored in memory]).

Regarding claim 5, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 4.
Falcon further discloses wherein the second set of input data comprises at least a portion of input data for another layer (see, e.g., paragraphs 93, 102, 119 and 125, “Input data 1202 may be read from various input layers” [i.e., second set of input data comprises at least a portion of input data 1202 for another layer of the various layers], “accept the result of convolution … The result is added to temp data  and the second set of input data comprises at least a portion of the output data in the desired output data format for the hardware pass (see, e.g., FIG. 14 flowchart showing repeatable step 1425 to “Receive input values” and paragraphs 123-125, “partial results may be stored for future computation on the same layer. … if such results are to be performed on a different calculation circuit then the results may be partially truncated. Furthermore, the results may be scaled down … The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit. Method 1400 may return to 1425.”, “the result may be output as the determined calculated value associated with the layer. At 1470, it may be determined whether to repeat with, for example, additional input values for another layer. If so, method 1400 may return to 1425.” [i.e., the second set of input data includes at least a portion of the processed output data in the desired scaled and truncated format when step 1425 is repeated to receive the second set of input data for the hardware pass]).
Alternatively, Yang also teaches the second set of input data comprises at least a portion of the output data in the desired output data format (see, e.g., FIG. 7 flowchart showing step 706: “Format a final intermediate result of the instruction to a desired fixed point representation format” and paragraph 76, “At 706, a final intermediate result of the instruction is formatted to a desired fixed point representation format. The final intermediate result may be an intermediate result matrix that includes 
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation schema” and an “instruction identifies a desired decimal point placement of the result of the operation.” (See, e.g., Yang, Abstract and paragraphs 13-14 and 49). Doing so would have allowed Falcon to use Yang’s technique for “Updating an artificial neural network” and instruction for the operation to “allow the result of the operation to be in the desired fixed point representation format that is different from the fixed point representation formats of the operands of the operation” where “The desired fixed point representation format of the result may identify the fixed point representation format of one or more elements of the result”, as suggested by Yang (See, e.g., Yang, paragraphs 13 and 49).

Regarding claim 6, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the output module is further configured to store the output data in the desired output data format in memory (as indicated above, the “output module” has been interpreted as being hardware) (see, e.g., paragraph 123, “The truncated and scaled results may be stored in memory” [i.e., output data/results in the desired, truncated and scaled format may be stored in memory]); and
the input module is further configured to read the output data in the desired output data format from memory as a set of input data for another hardware pass (As indicated above, “the input module” has been interpreted as being hardware) (see, e.g., paragraph 123, “At 1450, partial results may be stored for future computation … if such results are to be performed on a different calculation circuit [i.e., another hardware pass] then the results may be partially truncated … the truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit. Method 1400 may return to 1425.” [i.e., to repeat step 1425 to “Receive input values” for another hardware pass by reading the output data in the truncated and scaled/desired format for the different calculation circuit]). 

Regarding claim 7, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the one or more layers comprises at least two layers (see, e.g., paragraphs 80-81 and 95-101, “CNN system 900 that includes a convolution layer 902, an average pooling layer 904, and a fully-connected neural network 906”, “convolution and pooling layers may be applied to input data multiple times prior to the results being transmitted to the fully-connected layer” [i.e., at least two layers]) and the processing module is configured to process the set of input data according to the at least two layers by processing the set of input data according to one layer of the at least two layers using a first input data format and processing the set of input data according to another layer of the at least two layers using a second input data format wherein the first input data format and the second input data format are independent from the desired output data format for the hardware pass (see, e.g., paragraphs 53, 56-57 and 80, “operate on data elements having sizes of byte, word, doubleword, quadword, etc., as well as datatypes, such as single and double precision integer and floating point datatypes”, “integer register file 208 may be split into two separate register files, one register file for low-order thirty-two bits of data and a second register file for high order thirty-two bits of data”, “ALUs 216, 218, 220 may be implemented to support a variety of data bit sizes including sixteen, thirty-two, 128, 256, etc. Similarly, floating point units 222, 224 may be implemented to support a range of operands having bits of various widths. … floating point units 222, 224, may operate on 128-bit wide packed data”, “Results of filter operations 908 may be summed together to provide an output from convolution layer 902 to the next pooling layer 904. Pooling layer 904 may perform subsampling … The output of pooling layer 904 may be fed to the fully-connected neural network 906” [i.e., 

Regarding claim 8, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 7.
Falcon further discloses wherein each of the first and second input data formats is a fixed point format defined by an exponent and an integer bit-width and the exponent of the first data format is different than the exponent of the second data format (see, e.g., paragraphs 81, 95, 98-99 and 104, “CNN systems are implemented according to the highest precision requirement at … 32-bit or 16-bit fixed point precision”, “scaling from, for example, thirty-two-bit floating point values to eight-bit fixed point values are illustrated, scaling may be performed from any value in higher precision fixed or floating point to any lower prevision [precision] value in fixed point.”, “weights 1204 … the stored, scaling value may be similar to an associated exponent”, “Multiplication may be made by hardware components that perform multiplication operation of integer or fixed-point inputs. … such multipliers may include 8-bit fixed-point multipliers. If input data 1202 and weights 1204 are each eight-bits wide (and in 1.7 format, wherein a bit is used to represent the sign and seven bits are used to represent a fractional part of a fixed-point number), then there may be sixteen pairs of inputs”, “Integer data 1312 (with an example 10-bit width) and fractional data 1314 (with an example 14-bit width) may be input.” [i.e., first and second input data formats are 
Regarding claim 9, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 7.
Falcon further discloses wherein the processing module is configured to process the input data according to one layer of the at least two layers using a first input data format by converting input data to that layer into the first input data format using a fixed point to fixed point converter (see, e.g., paragraphs 98-99, “scaling may be performed from any value in higher precision fixed … to any lower prevision [precision] value in fixed point.”, “elements of input data 1202 … are multiplied pair-wise at 1304 and then added together in accumulators 1306. Multiplication may be made by hardware components that perform multiplication operation of integer or fixed-point inputs. In one embodiment, such multipliers may include 8-bit fixed point multipliers. If input data 1202 … are each eight-bits wide (and in 1.7 format” [i.e., scaling and multiplication to convert fixed-point input data into the first input format using a fixed point to fixed point converter]).

Regarding claim 10, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the processing module is configured to process the set of input data according to one or more layers by performing at least a first operation and a second operation on the set of input data wherein the first operation is performed using a first data format and the second operation is performed using a second, different, data format (as indicated above, “the processing module” has been interpreted as being hardware) (see, e.g., paragraphs 46, 53, 85 and 99-101, “execution unit 142 may perform instructions in packed instruction set 143 for performing operations on packed data formats”, “operate on data elements having sizes of byte, word, doubleword, quadword, etc., as well as datatypes, such as single and double precision integer and floating point datatypes” [i.e., including first and second, different data formats], “calculation accelerator 1004, to perform calculation for different layers of CNN system 900”, “Input data 1202 and weights 1204 are multiplied pair-wise at 1304 and then added together in accumulators 1306. Multiplication may be made by hardware components that perform multiplication operation of integer or fixed-point inputs. … such multipliers may include 8-bit fixed-point multipliers. If input data 1202 and weights 1204 are each eight-bits wide (and in 1.7 format … of a fixed-point number), then there may be sixteen pairs of inputs”, “Partial results may be kept in sixteen bit format. If a partial result is sent to memory or another calculation unit 1200 [i.e., as input data], it may be truncated into an eight-bit fixed point format”, “any suitable format may be used” [i.e., hardware components process the input data for the layers by performing first and second calculations/multiplication/truncation operations on the input data using first and second data formats, e.g., 8-bit and 16-bit or a floating point datatype]). 
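
For illustration only, the arithmetic described in the quoted passages (eight-bit inputs and weights in 1.7 format multiplied pair-wise and accumulated, a partial result kept in a wider format, and truncation to an eight-bit fixed point format when the result is passed on) may be sketched in Python as follows. The sketch is an explanatory aid and does not reproduce Falcon's circuitry. The function names are hypothetical, and rounding on quantization and saturation on overflow are assumptions.

```python
FRAC_BITS = 7  # 1.7 format: one sign bit, seven fractional bits


def to_q1_7(x: float) -> int:
    """Quantize a real value to a signed 8-bit 1.7 integer (assumed rounding)."""
    q = int(round(x * (1 << FRAC_BITS)))
    return max(-128, min(127, q))


def mac_q1_7(inputs, weights) -> int:
    """Multiply pair-wise and accumulate; each product carries 14 fractional bits."""
    acc = 0
    for a, w in zip(inputs, weights):
        acc += a * w
    return acc


def truncate_to_q1_7(acc: int) -> int:
    """Drop the seven low fractional bits so the result is again in 1.7 format."""
    q = acc >> FRAC_BITS
    return max(-128, min(127, q))  # assumed saturation to eight bits


inputs = [to_q1_7(0.5), to_q1_7(0.25)]
weights = [to_q1_7(0.5), to_q1_7(0.5)]
partial = mac_q1_7(inputs, weights)   # wider intermediate format
result = truncate_to_q1_7(partial)    # eight-bit format passed onward
```

Here the multiply-accumulate operates on one data format and the truncation produces another, which is the distinction the claim language draws between the first and second operations.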

Regarding claim 11, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 10.
Falcon further discloses wherein the first data format is a fixed point data format and the second data format is a floating point data format (see, e.g., paragraphs 53, 56 and 81, “operate on data elements having sizes of byte, word, doubleword, quadword, etc., as well as datatypes, such as single and double precision integer and floating point datatypes”, “perform integer and floating point operations”, “may include integer (or fixed-point) multiplication and addition, or float-point fused multiply-add (FMA). These operations involve multiplication operations of inputs” [i.e., first and second data formats, e.g., integer/fixed point and a floating point datatype]); and 
the processing module is configured to perform the second operation of the set of input data using the second data format by converting fixed point input data into floating point input data, performing the second operation on the floating point input data to produce floating point output data, and converting the floating point output data to fixed point output data (see, e.g., paragraphs 56-57 and 98, “Each of register files 208, 210 perform integer and floating point operations … Integer register file 208 and floating point register file 210 may communicate data with the other. … Floating point register file 210 may include 128-bit wide entries”, “register files 208, 210 that store the integer and floating point data operand values … a 64-bit by 64-bit floating point divider to execute divide, square root, and remainder micro-ops.” [i.e., operations to convert integer/fixed point data into floating point data], “Although example scaling from, for example, thirty-two-bit floating point values to eight-bit fixed point values are illustrated, scaling may be performed from any value in higher precision 

Regarding claim 12, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the decoder is further configured to receive information indicating a format of the set of input data for the hardware pass (as indicated above, “the decoder” has been interpreted as being hardware) (see, e.g., FIGs. 2, 12 and 14 showing “DECODER” 228, “Input Data” 1202 and “1.7 format” information received by “MAC Unit” 1210 and “16 Bit Arithmetic Left Shifter” 1240 and flowchart step 1425 to “Receive … weight values, and scale values” and paragraphs 46, 99, 106, 119, 121 and 123, “instruction set 143 for performing operations on packed data formats”, “If input data 1202 and weights 1204 are each eight-bits wide (and in 1.7 format, wherein a bit is used to represent the sign and seven bits are used to represent a fractional part of a fixed-point number)” [i.e., input data 1202 includes 1.7 format information indicating a format], “Partial results are stored … successive operations of different calculation circuits upon successive portions of the same layers. When used by a subsequent calculation circuit, partial results may be scaled up by 16-bit arithmetic left shifter 1240”, “scale values indicating the degree to which the weights were scaled may be received”, “the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.”, “partial results may be stored for future computation … if such results are to be performed on a different calculation circuit then the results may be partially truncated. Furthermore, the results may be scaled down” ; and
wherein the processing module is configured to process the set of input data (as indicated above, “the processing module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing step 1440 to “Use scaled weights to determine convolution or dot-product calculations on input” data and paragraph 121, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.” [i.e., hardware performs calculations to process the input data]). 

Regarding claim 13, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 12.
Falcon further discloses wherein the format of the set of input data for the hardware pass is a fixed point format defined by an exponent and an integer bit-length (see, e.g., paragraphs 95 and 98, “the stored, scaling value may be similar to an associated exponent”, “scaling from, for example, thirty-two-bit floating point values to eight-bit fixed point values are illustrated, scaling may be performed from any value in higher precision fixed or floating point to any lower prevision [precision] value in fixed point.” [i.e., the format of the input data is a fixed-point format defined by an integer bit-length 8 to 32 bits, and an exponent]); and/or 
wherein the desired output data format for the hardware pass is different than the format of the set of input data for the hardware pass (see, e.g., paragraphs 92, 99, 102-103, 106, 119 and 123 “down-scaling in association with 

Regarding claim 14, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the desired output data format is a fixed point format defined by an exponent and an integer bit-length (see, e.g., paragraphs 81, 95, 98 and 100, “CNN systems are implemented according to … 32-bit or 16-bit fixed point precision”, “scaling from, for example, thirty-two-bit floating point 

Regarding claim 15, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the decoder is further configured to receive information indicating a format of one or more weights associated with one of the one or more layers of the DNN (as indicated above, “the decoder” has been interpreted as being hardware) (see, e.g., FIGs. 2 and 14 showing “DECODER” 228 and flowchart step 1425 to “Receive … weight values, and scale values” and paragraphs 89, 93-95, 98 and 119, “calculation circuit 1200 may accept inputs from … weights 1204”, “Weights 1204 may be calculated during, for example, a learning process of the functions for the CNN”, “for a given layer, the maximum and minimum values of weights 1204 may be determined … weights 1204 may be scaled up to meet a defined range”, “scale values indicating the degree to which the weights were scaled may be received” [i.e., receiving scaling information indicating a format of one or more weights associated with the given layer of the layers of the CNN/DNN]); and 
the processing module is configured to process the set of input data according to that layer based on the indicated format of the one or more weights (as indicated above, “the processing module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing step 1440 to “Use scaled weights to determine convolution or dot-product calculations on input” data and paragraph 121, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.” [i.e., hardware performs calculations to process the input data based on the format of the weights/scaling information]).
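
For illustration only, the weight handling described in the quoted passages (weights for a given layer scaled up to a defined range, a scale value recording the degree of scaling, and results later scaled down by shifting right by that scaling factor) may be sketched in Python as follows. The sketch is an explanatory aid and is not Falcon's implementation. The function names, the choice of a power-of-two scale, truncation toward zero on quantization and the symmetric clamp are all assumptions.

```python
def scale_layer_weights(weights, frac_bits=7):
    """Choose one left shift for the layer so the largest weight magnitude
    approaches the range (-1, 1), then quantize with frac_bits fractional bits."""
    max_abs = max(abs(w) for w in weights)
    shift = 0
    while max_abs > 0 and max_abs * (1 << (shift + 1)) < 1.0:
        shift += 1
    limit = (1 << frac_bits) - 1
    scaled = [
        max(-limit, min(limit, int(w * (1 << shift) * (1 << frac_bits))))
        for w in weights
    ]
    return scaled, shift  # shift plays the role of the stored scale value


def layer_dot(inputs_q, weights_q, shift, frac_bits=7):
    """Dot product on scaled weights, then undo the weight scaling and return
    a result with frac_bits fractional bits."""
    acc = sum(a * w for a, w in zip(inputs_q, weights_q))
    return (acc >> shift) >> frac_bits


weights_q, scale_shift = scale_layer_weights([0.1, -0.05])
out = layer_dot([64, 64], weights_q, scale_shift)  # inputs are 0.5 in 1.7 format
```

In this sketch the processing step cannot interpret weights_q without scale_shift, which is the sense in which the received scale value indicates the format of the weights for the layer.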

Regarding claim 16, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 15.
Falcon further discloses wherein the format of the one or more weights associated with the layer of the one or more layers is a fixed point format that is defined by an exponent and an integer bit-length (see, e.g., paragraphs 89, 93-95 and 98, “calculation circuit 1200 may accept inputs from … weights 1204”, “Weights 1204 may be calculated during, for example, a learning process of the functions for the CNN”, “for a given layer, the maximum and minimum values of weights 1204 may be determined … weights 1204 may be scaled up to meet a defined range” [i.e., one or more weights associated with the given layer of the layers of the CNN/DNN], “weights 1204 may be similar to a mantissa of floating-point operations, while the stored, scaling value may be similar to an associated exponent”, “scaling from, for example, thirty-two-bit floating point values to eight-bit fixed point values are illustrated, scaling may be performed from any value in higher precision fixed or floating point to any lower ; 
wherein the information indicating a format of one or more weights indicates a different format for at least two weights associated with the layer of the one or more layers; and/or wherein a format of one or more weights associated with a different layer is different than the format of the one or more weights associated with the one layer of the one or more layers of the DNN (see, e.g., paragraphs 93, 97, 99, 103, 106, 113 and 123, “Weights 1204 may be calculated during, for example, a learning process of the functions for the CNN. Weights 1204 may vary based on, for example, different filter functions”, “after weights are scaled up for use in weights 1204, weight values may be truncated … if calculation circuit 1200 is to use weights with eight bits of precision, the bottom sixteen bits may be truncated from weights before they are provided as weights 1204” [i.e., scaling information indicates a different/varying format/precision level for at least two of the weights 1204], “the number of digits or bits shifted may be held constant for all weights within a given layer, even though some weight values might yet be shifted again.” [i.e., some weights 1204 associated with/within the given layer are shifted, scaled and truncated to have a format that is different than other weights 1204 in the layer]).

Regarding claim 17, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses wherein the desired output data format for the hardware pass is based on an expected output data range for a … layer of the one or more layers (as indicated above, “the hardware pass”, under the BRI, is using hardware for processing input data in one or more layers of a DNN) (see, e.g., paragraphs 94, 101 and 103, “In one embodiment, for a given layer [i.e., a layer of the one or more layers of the DNN/CNN], the maximum and minimum values of weights 1204 may be determined. In another embodiment and based on such a determination, weights 1204 may be scaled up to meet a defined range. For example, if weights 1204 are given as positive and negative fractions less than one, then weights 1204 may be scaled up to the range (-1, 1).”, “accumulate values that overpass the output range”, “shifter and truncate logic 1232 may scale down the results so that they are normalized for use in a range expected by other elements” [i.e., an expected/desired/defined results/output data range for the layer]). 
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing a last layer of the one or more layers.
In the same field, analogous art Yang teaches a last layer of the one or more layers (see, e.g., FIG. 4 showing a neural network with a last layer L2 with node 431 and paragraphs 30 and 34, “The nodes of the neural network may represent input, intermediate, and output data and may be organized as input nodes, hidden nodes, and output nodes. The nodes may also be grouped together in various hierarchy levels.”, “In the example neural network shown in FIG. 4, the first hierarchy layer grouping of nodes, L0 nodes, includes nodes 411, 412, 413, 414, and 415. The second hierarchy layer grouping of nodes, L1 nodes, includes nodes 421, 422, and 423. The third hierarchy layer grouping of nodes, L2 nodes includes node 431.” [i.e., output nodes such as node 431 are in a third, last, hierarchical level of the neural network]). 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation schema” and an “instruction identifies a desired decimal point placement of the result of the operation.” (See, e.g., Yang, Abstract and paragraphs 13-14 and 49). Doing so would have allowed Falcon to use Yang’s technique for “Updating an artificial neural network” and instruction for the operation to “allow the result of the operation to be in the desired fixed point representation format that is different from the fixed point representation formats of the operands of the operation” where “The desired fixed point representation format of the result may identify the fixed point representation format of one or more elements of the result”, as suggested by Yang (See, e.g., Yang, paragraphs 13 and 49).

Regarding claim 18, as discussed above, Falcon in view of Yang teaches the hardware implementation of claim 2.
Falcon further discloses a … output data format for the hardware pass indicates a different … output data format for at least two portions of the processed data (as indicated above, “the hardware pass”, under the BRI, is using hardware for processing input data in one or more layers of a DNN) (see, e.g., paragraphs 103, 106 and 123, “right shifter and truncate logic 1232 may scale down the results so that they are normalized for use in a range expected by other elements” [i.e., portions of results/output data indicate different formats for portions in the expected/desired range of the processed data], “an augmented, scaled-up result … may be truncated when such a result is passed out … Partial results are stored in memory so as not to lose interim precision between successive operations of different calculation circuits upon successive portions of the same layers. When used by a subsequent calculation circuit, partial results may be scaled”, “partial results may be stored for future computation … if such results are to be performed on a different calculation circuit then the results may be partially truncated … the results may be scaled down … The truncated and scaled results may be … sent to another calculation circuit” [i.e., scale information indicates scaling and truncation to be performed on the results to produce different output data formats for the different portions of the scaled and truncated/processed data]).
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing wherein the information indicating a desired output data format for the hardware pass indicates a different desired output data format for at least two portions of the processed data.
In the same field, analogous art Yang teaches wherein the information indicating a desired output data format for the hardware pass indicates a different desired output data format for at least two portions of the processed data (see, 
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate 

With respect to independent claim 19, Falcon discloses the invention as claimed including a non-transitory computer readable storage medium having stored thereon a computer readable description of an integrated circuit that, when processed in an integrated circuit manufacturing system, causes the integrated circuit manufacturing system to manufacture (see, e.g., paragraphs 25, 27, 45 and 130, “a machine or computer-readable medium having stored thereon instructions which may be used to program a computer (or other electronic devices) to perform one or more operations” [i.e., a computer-readable medium storing instructions], “data representing the physical placement of various devices in the hardware model. … semiconductor fabrication techniques are used, the data representing the hardware model may be the data specifying the presence or absence of various features … used to produce the integrated circuit. In any representation of the design, the data may be a hardware implementation of a Deep Neural Network "DNN" configured to implement the DNN by processing data using one or more hardware passes (As indicated above, “processing data using one or more hardware passes”, under the BRI, is using any hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 1A – depicting a hardware implementation of “System 100” and paragraphs 23, 27, 35 and 82, “weight-shifting mechanism for reconfigurable processing units within or in association with a processor … computer system, or other processing apparatus … such a weight-shifting mechanism may be used in convolution neural networks (CNN)”, “a circuit level model with logic and/or transistor gates may be produced at some stages of the design process … a level of data representing the , the hardware implementation comprising:
an input module configured to receive a set of input data for a hardware pass of the hardware implementation, the set of input data representing at least a portion of input data for a particular layer of the DNN (as indicated above, the “input module” has been interpreted as being hardware, and “a hardware pass of the hardware implementation”, under the BRI, is using hardware for processing input data in one or more layers of a DNN) (see, e.g., FIG. 14 flowchart showing “Receive input values” in step 1425 and paragraphs 89, 91, 93 and 119, “calculation circuit 1200 may accept inputs from, for example, input data 1202 and weights 1204”, “Weights 1204 or input data 1202 may be low precision”, “Input data 1202 may be read from various input layers”, “At 1425, input values and weight values may be received. … The input values and weight values may be of a fixed size and of a lower precision than which the weight values were originally determined.” [i.e., hardware/circuit 1200 to accept/receive a set of input data representing a portion of input data for a layer of the CNN/DNN]);
a decoder configured to receive information indicating a … output data format for the hardware pass (as indicated above, the “decoder” has been interpreted as being hardware) (see, e.g., FIGs. 2 and 14 showing “DECODER” 228 and flowchart ;
a processing module configured to process the set of input data according to one or more layers of the DNN associated with the hardware pass to produce processed data, the one or more layers comprising the particular layer of the DNN (as indicated above, the “processing module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing steps 1440 and 1445 to “Use scaled weights to determine convolution or dot-product calculations on input” data for a layer and then determine if data processing for that particular “Layer [is] finished?” and paragraphs 85 and 121-122, “calculation accelerator 1004, to perform calculation for different layers of CNN system 900”, “At 1440, the scaled weights may be used to determine suitable calculations, such as convolution or dot-product, on the input.”, “At 1445, in one embodiment it may be determined whether computations have finished for the layer”); and
an output module configured to convert the processed data into the … output data format for the hardware pass to produce output data for the hardware pass (as indicated above, the “output module” has been interpreted as being hardware) (see, e.g., FIG. 14 flowchart showing steps 1450, 1455, 1460 and 1465 to scale, truncate and output calculated values/results and paragraphs 123-124, “the results may be partially truncated. Furthermore, the results may be scaled down by, for example, shifting their values right by the scaling factor. The truncated and scaled results may be stored in memory, a register, or otherwise sent to another calculation circuit.”, “the results may be scaled down … in another embodiment the results may be truncated … the upper integer bits and lower fractional bits may be truncated according to an expected output format. At 1465, the result may be output as the determined calculated value associated with the layer.” [i.e., circuit/hardware module to scale/truncate results/processed data into an expected/desired output data format to calculate/produce output data for the hardware pass for the layer and to the calculation circuit]).
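For illustration only, the data flow mapped above (receive input data, receive an indication of the output data format, process the data for a layer, and convert the result into the indicated format) may be sketched as follows. The function names, the encoding of the format as a (total bits, fractional bits) pair packed into a control word, and the saturating clamp are assumptions made for this sketch; they are not taken from Falcon, Yang or the application, and Falcon itself describes truncating upper integer bits rather than saturating.

```python
from typing import List, Tuple

# Hypothetical format descriptor: (total_bits, frac_bits) of a signed fixed-point value.
Format = Tuple[int, int]


def decode_output_format(control_word: int) -> Format:
    """Hypothetical decoder: high byte = total bits, low byte = fractional bits."""
    total_bits = (control_word >> 8) & 0xFF
    frac_bits = control_word & 0xFF
    return total_bits, frac_bits


def process_layer(inputs: List[int], weights: List[int]) -> int:
    """Dot product of fixed-point inputs and weights in a wide accumulator."""
    return sum(x * w for x, w in zip(inputs, weights))


def convert_to_format(acc: int, acc_frac_bits: int, fmt: Format) -> int:
    """Rescale the accumulator to the requested output format."""
    total_bits, frac_bits = fmt
    shift = acc_frac_bits - frac_bits
    # Drop surplus fractional bits with an arithmetic right shift,
    # or pad with zeros if the output format has more fractional bits.
    value = acc >> shift if shift >= 0 else acc << -shift
    # Clamp to the signed range of the output width (assumption: saturation).
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, value))


def hardware_pass(inputs: List[int], weights: List[int],
                  in_frac_bits: int, w_frac_bits: int,
                  control_word: int) -> int:
    """One pass: decode the format, process the layer, convert the result."""
    fmt = decode_output_format(control_word)
    acc = process_layer(inputs, weights)
    # A product of two fixed-point values carries the sum of their fractional bits.
    return convert_to_format(acc, in_frac_bits + w_frac_bits, fmt)
```

In this sketch, inputs 1.0 and 2.0 and weights 0.5 and 0.25, each held with 4 fractional bits, accumulate to 1.0 with 8 fractional bits; requesting an 8-bit output with 4 fractional bits shifts the accumulator right by 4.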
Although Falcon substantially discloses the claimed invention, Falcon is not relied on for explicitly disclosing receiving information indicating a desired output data format … and converting the processed data into the desired output data format.
In the same field, analogous art Yang teaches receiving information indicating a desired output data format (see, e.g., paragraphs 49, 54, 66 and 76, “a register identifier or other memory/storage location of where the result is to be stored is included in the instruction. … the instruction identifies a desired decimal point placement of the ; and converting the processed data into the desired output data format (see, e.g., FIGs. 6 and 7 flowcharts showing steps 606 and 706, respectively to “Format a final intermediate result of the instruction to a desired fixed point representation format” and paragraphs 66 and 76, “At 606, a final intermediate result of the instruction is formatted to a desired fixed point representation format. The final intermediate result may be an intermediate result matrix that includes the elements to be formatted to produce the final result matrix. For example, the desired fixed point representation format for the result matrix is specified in the received matrix multiplication instruction.” [i.e., converting the processed data/intermediate result by formatting it to a desired representation/output data format]).
Falcon and Yang are analogous art because they are both related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100 and Yang, Abstract and paragraphs 13-15 and 38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Yang to provide techniques for “Updating an artificial neural network … using a fixed point node parameter and a network characteristic is represented using a fixed point network parameter” where a “value associated with the fixed point intermediate parameter is truncated and/or rounded according to a flexible system truncation schema” and an “instruction identifies a desired decimal point placement of the result of 

Regarding claim 20, as discussed above, Falcon in view of Yang teaches the method of claim 1.
Examiner’s Note: claim 20, as drafted, depends from claim 1. If applicant intended for claim 20 to be an independent claim, the examiner suggests that one way to do so is to amend claim 20 to explicitly recite the steps of claim 1 instead of the current recitation of a “non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth in claim 1.”
Falcon further discloses a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform (see, e.g., paragraph 25, “embodiments … may be accomplished by way of a data or instructions stored on a machine-readable, tangible medium, which when performed by a machine cause the machine to perform functions consistent with at least one embodiment … instructions the method as set forth in claim 1 (as indicated above, Falcon in view of Yang teaches the method of claim 1, see above citations to Falcon and Yang regarding the limitations and method steps of claim 1).

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure.
For example, Lin et al. (U.S. Patent Application Pub. No. 2016/0328645 A1, cited in applicant’s IDS submitted on 5/24/2019, hereinafter “Lin”), like Yang, also teaches receiving information indicating a desired output data format (See, e.g., Lin, FIG. 8 flowchart – showing step 802 to “SPECIFY A LIMITED BIT WIDTH” for a desired output data format and paragraphs 54 and 89, “The increased bit width may also be specified to round off bits”, “In block 802, a limited bit width multiplier-accumulator is specified. … in accordance with a predetermined output number format.” [i.e., receiving information specifying/indicating a bit width/desired output data format]). Lin further teaches converting the processed data into the desired output data format (see, e.g., Lin FIG. 8 flowchart – showing step 810 to “ROUND OFF A NUMBER OF LEAST SIGNIFICANT BITS AND REMOVE A NUMBER OF MOST SIGNIFICANT BITS SO THAT AN 
Falcon, Yang and Lin are analogous art because they are each related to fixed point precision calculations in deep convolutional neural networks (see, e.g., Falcon, Abstract and paragraphs 81-82 and 98-100, Yang, Abstract and paragraphs 13-15 and 38, and Lin, Abstract and paragraphs 10-13, 28-32 and 49).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcon to incorporate the teachings of Lin to provide “an apparatus for reducing computational complexity for a fixed point neural network operating in a system having a limited bit width in a multiplier-accumulator” [MAC] (See, e.g., Lin, Abstract and paragraphs 11-13). Doing so would have allowed Falcon to use Lin’s “apparatus for reducing computational complexity for a fixed point neural network” to “reduce a number of bit shift operations when computing activations in the fixed point neural network”, to “balance an amount of quantization error and an overflow error when computing activations in the fixed point neural network” and also improve “fixed point computations with multiplier-accumulator bit width constraints”, as suggested by Lin (See, e.g., Lin, paragraphs 12-13 and 29).
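By way of a non-limiting illustration, the trade-off between quantization error and overflow error at a fixed bit width may be sketched as follows. The function name, the round-to-nearest step and the saturating clamp are assumptions made for this sketch and are not taken from Lin.

```python
from typing import Tuple


def quantize(value: float, total_bits: int, frac_bits: int) -> Tuple[float, bool]:
    """Quantize to signed fixed point; return (represented value, overflow flag)."""
    scaled = round(value * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    overflow = scaled < lo or scaled > hi
    # Assumption: out-of-range values saturate to the nearest representable value.
    clamped = max(lo, min(hi, scaled))
    return clamped / (1 << frac_bits), overflow


# With 8 total bits, more fractional bits give finer resolution but less range.
for frac_bits in (2, 4, 6):
    represented, overflowed = quantize(3.3, 8, frac_bits)
    print(frac_bits, represented, abs(represented - 3.3), overflowed)
```

At 8 total bits, the value 3.3 is represented more closely with 4 fractional bits than with 2, while 6 fractional bits leave too little integer range and the value overflows.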
 line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For 





/R.K.B./Examiner, Art Unit 2125 

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125