DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-4, 6-8, 10-12, and 14-15 are pending in this office action and presented for examination. Claims 1, 3-4, 6-7, 10, and 14 are newly amended, and claims 5, 9, and 13 are newly cancelled, by the response received September 15, 2022.

Specification
The disclosure is objected to because of the following informalities. Appropriate correction is required.
In the specification as amended on September 15, 2022, reference character 112 is associated with a “queue”; however, reference character 112 in FIG. 1A is associated with a “cache”. 

The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. Examiner notes that the general concept of writing back instruction execution results was widespread prior to the effective filing date of the claimed invention. 

Drawings
The drawings are objected to because:
All drawings must be made by a process which will give them satisfactory reproduction characteristics. Every line, number, and letter must be durable, clean, black (except for color drawings), sufficiently dense and dark, and uniformly thick and well-defined. The weight of all lines and letters must be heavy enough to permit adequate reproduction. This requirement applies to all lines however fine, to shading, and to lines representing cut surfaces in sectional views. However, the figures do not meet these criteria.
Suitable descriptive legends should be included for blocks 102 and 104 in FIG. 1A; blocks 102 and 104 in FIG. 1B; blocks 202 and 204 in FIG. 2; blocks 202 and 204 in FIG. 3; blocks 202 and 204 in FIG. 4; and blocks 202 and 204 in FIG. 5. Examiner notes that suitable descriptive legends were previously present (though in an incorrect orientation).
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-4 and 6-8 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “a cache unit, configured to couple between a fifth execution unit with a low use frequency and a fifth execution delay different from the first execution delay and the second execution delay, and receive a fifth execution result from the fifth execution unit” in lines 12-14. However, the metes and bounds of this limitation are indefinite. For example, "between" can be defined as "indicating a connection or relationship involving two or more parties"; however, only one party (i.e., the cache unit) appears to be associated with the "between" language.
Claim 1 recites the limitation “a fifth execution unit with a low use frequency” in lines 12-13. However, the claim does not particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant, because it would not be clear to a hypothetical person possessing the ordinary level of skill in the pertinent art whether a particular execution unit would be considered to have "low use frequency". The threshold separating a “low use frequency” from a frequency that is not low may vary from one person to another. Examiner notes that the disclosure does not explicitly or implicitly provide definite criteria for determining whether something has a “low use frequency”.
Claims 2-4 are rejected for failing to alleviate the rejections of claim 1 above. 

Claim 6 recites the limitation “one execution unit with a low use frequency” in line 11. However, the claim does not particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant, because it would not be clear to a hypothetical person possessing the ordinary level of skill in the pertinent art whether a particular execution unit would be considered to have "low use frequency". The threshold separating a “low use frequency” from a frequency that is not low may vary from one person to another. Examiner notes that the disclosure does not explicitly or implicitly provide definite criteria for determining whether something has a “low use frequency”.
Claim 6 recites the limitation “the execution delays of other execution units” in line 12. However, there is insufficient antecedent basis for this limitation in the claims. (Note that, since “other execution units” has not yet been recited, any execution delays of those other execution units would likewise not yet be recited.)
Claim 6 recites the limitation “the execution result” in lines 20-21. However, it is indefinite as to whether this limitation has antecedent basis to “an execution unit” in claim 6, lines 14-15, or “an execution unit” in claim 6, lines 20-21.
Claims 7-8 are rejected for failing to alleviate the rejections of claim 6 above.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Jacobs et al. (Jacobs) (US 20050278510 A1) in view of Barowski et al. (Barowski) (US 20130346729 A1) and further in view of Tran (US 20170351520 A1).
Consider claim 1, Jacobs discloses an apparatus for writing back an instruction execution result ([0003], line 13, process instructions; [0003], line 14, destination operand), comprising: a first writing port (FIG. 1, WD2), coupled between a first execution unit (FIG. 1, Execution unit 2 24) with a first execution delay (FIG. 1, Execution unit 2 24; note that the time it takes for the execution unit to execute its function corresponds to the execution delay) and a register file (FIG. 1, register file 30), and configured to receive a first execution result from the first execution unit, and to write the first execution result back to a first register unit in the register file ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a first writing address ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); a second writing port (FIG. 1, WD3), coupled between a second execution unit (FIG. 1, Execution unit 3 26) with a second execution delay (FIG. 1, Execution unit 3 26; note that the time it takes for the execution unit to execute its function corresponds to the execution delay) and the register file (FIG. 1, register file 30), and configured to receive a second execution result from the second execution unit, and write the second execution result back to a second register unit in the register file ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a second writing address ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); wherein the first writing port is not coupled to the second execution unit (Figure 1, which shows WD2 not coupled to Execution unit 3 26), and the second writing port is not coupled to the first execution unit (Figure 1, which shows WD3 not coupled to Execution unit 2 24).
However, Jacobs does not disclose that the second execution delay is different from the first execution delay, and that the apparatus further comprises: a cache unit, configured to couple between a fifth execution unit with a low use frequency and a fifth execution delay different from the first execution delay and the second execution delay, and receive a fifth execution result from the fifth execution unit; and a multiplexer, configured to couple between the cache unit and the second writing port, and receive the fifth execution result from the cache unit, and transmit the fifth execution result to the second writing port based on a selection signal, so as to cause the second writing port to write the fifth execution result back to a fifth register unit in the register file based on a fifth writing address, wherein the second writing port is coupled to the second execution unit via the multiplexer.
On the other hand, Barowski discloses second execution delay different from first execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units), and a cache unit (Figure 1, auxiliary buffer 60), configured to couple between an execution unit (Figure 1, execution pipeline 40) with an execution delay ([0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type) different from another execution delay and yet another execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units; [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies; [0038], lines 12-13, execution pipeline 50 for long instructions I0, I1, I2 of the second latency type), and receive an execution result from the execution unit (Figure 1, see output from execution pipeline 40 to auxiliary buffer 60); and a multiplexer (Figure 1, mux 12), configured to couple between the cache unit (Figure 1, auxiliary buffer 60) and a writing port associated with the yet another execution delay (Figure 1, input into register file 10), and receive the execution result from the cache unit (Figure 1, see output from auxiliary buffer 60 to mux 12), and transmit the execution result to the writing port based on a selection signal, so as to cause the writing port to write the execution result back to a register unit in the register file based on a writing address (Figure 1, see output from mux 12 to register file 10; note that a mux outputs a selected input based on a selection signal; note that a particular register in a register file is selected based on an address of the register), wherein the writing port (Figure 1, input into register file 10) is coupled to another execution unit associated with the writing port associated with the yet another execution delay (Figure 1, execution pipeline 50) via the multiplexer (Figure 1, mux 12).
Barowski’s teaching increases the capability of a processor by supporting execution of both less complex instructions (e.g., addition, as per [0003], line 13) and more complex instructions (e.g., multiplication or floating-point operations, as per [0003], lines 14-15), and decreases the number of write ports needed, which decreases complexity, latency and power (Barowski, [0007], lines 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Barowski with the invention of Jacobs in order to increase processor capability and decrease the number of write ports needed, which decreases complexity, latency and power. Alternatively, this modification merely entails a use of known technique (Barowski’s teaching of execution units with different execution delays to support both less complex and more complex operations) to improve similar devices (methods, or products) (the invention of Jacobs) in the same way (after the combination, Jacobs’s invention would likewise entail execution units with different execution delays to support both less complex and more complex operations), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Additionally, this modification merely entails a use of known technique (Barowski’s teaching of a write port being shared by multiple execution units via use of a multiplexer and a cache unit) to improve similar devices (methods, or products) (the invention of Jacobs) in the same way (after the combination, the invention of Jacobs would likewise entail a write port being shared by multiple execution units), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Note that Barowski’s teaching, when applied to the invention of Jacobs that entails a “first” execution delay, a “second” execution delay, a “second” writing port, and a “second” execution unit, results in the overall claim limitation.
However, the combination thus far does not entail the fifth execution unit having a low use frequency.
On the other hand, Tran discloses an execution unit having a low use frequency ([0031], lines 18-19, infrequently used resources, such as a divide unit (DIV) 522 and an IMUL unit 524).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Tran with the combination of Jacobs and Barowski, as this modification merely entails simple substitution of one known element (an undisclosed use frequency of the combination of Jacobs and Barowski) for another (a low use frequency, as explicitly disclosed by Tran) to obtain predictable results (the previously explained combination of Jacobs and Barowski, wherein the fifth execution unit has a low use frequency), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Also note that Tran discloses infrequently used resources being shared ([0034], lines 10-12) rather than being replicated ([0031], lines 18-20); therefore, it would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for a writing port coupled to an infrequently used execution unit (such that the writing port would, if not shared, be infrequently used as well) to be instead shared with another execution unit in order to preclude replication-related costs (such as increased size and reduced maximum frequency, as per paragraph [0004] of Jacobs). 
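For illustration only, the shared writing port arrangement discussed above (a buffer feeding a multiplexer that also receives a direct execution result) can be sketched as a small behavioral model. This sketch is not taken from Jacobs, Barowski, or Tran. All class and function names are hypothetical, and the selection policy (a direct result wins, otherwise the oldest buffered result is drained) is an assumption made for the example.

```python
from collections import deque


class RegisterFile:
    """Register file with a single shared write port (hypothetical model)."""

    def __init__(self, num_registers=64):
        self.registers = [0] * num_registers

    def write(self, address, value):
        # At most one write per cycle goes through the shared port.
        self.registers[address] = value


def select_writeback(direct_result, buffer):
    """Multiplexer: choose what drives the write port in this cycle.

    Assumed policy: a result arriving directly from an execution unit wins;
    otherwise the oldest buffered result is drained. Returns None when idle.
    """
    if direct_result is not None:
        return direct_result
    if buffer:
        return buffer.popleft()
    return None


def simulate(direct_stream, buffered_stream, regfile):
    """Run cycle by cycle until both streams and the buffer are empty.

    Each stream holds, per cycle, either None or an (address, value) pair.
    Returns the number of cycles taken.
    """
    buffer = deque()
    cycle = 0
    stream_cycles = max(len(direct_stream), len(buffered_stream))
    while cycle < stream_cycles or buffer:
        direct = direct_stream[cycle] if cycle < len(direct_stream) else None
        late = buffered_stream[cycle] if cycle < len(buffered_stream) else None
        if late is not None:
            buffer.append(late)  # the buffered unit parks its result first
        chosen = select_writeback(direct, buffer)
        if chosen is not None:
            regfile.write(*chosen)
        cycle += 1
    return cycle
```

In this model the buffered result is written only in a cycle in which the direct path is idle, which is how one writing port can serve two execution units without a second port.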

Consider claim 2, the overall combination entails the apparatus of claim 1 (see above), wherein the first writing port is coupled to all register units in the register file, and the second writing port is coupled to all register units in the register file (Jacobs, [0019], lines 17-19, in a present embodiment the register file comprises 64 registers, and therefore an address signal comprising 6 bits is provided to address each of the registers).
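As an illustrative check of the arithmetic relied on above, the following sketch shows that a 6-bit write address is exactly wide enough to select any one of 64 registers, so a port driven by such an address can reach every register. The function names are hypothetical and are not drawn from Jacobs.

```python
def address_bits(num_registers):
    """Bits needed to give every register a distinct write address."""
    return max(1, (num_registers - 1).bit_length())


def reachable_registers(bits, num_registers):
    """Register indices selectable by an address of the given width."""
    return [a for a in range(2 ** bits) if a < num_registers]
```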

Consider claim 3, the overall combination entails the apparatus of claim 1 (see above), further comprising: a third writing port (Jacobs, FIG. 1, WD4), coupled between the register file (Jacobs, FIG. 1, register file 30) and a third execution unit (Jacobs, FIG. 1, Execution unit 4 28) with a third execution delay different from the first execution delay and the second execution delay (Barowski, [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies), and configured to receive a third execution result from the third execution unit, and write the third execution result back to a third register unit in the register file (Jacobs, [0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a third writing address (Jacobs, [0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); wherein the third writing port is not coupled to the first execution unit and the second execution unit (Jacobs, Figure 1, which shows WD4 not coupled to Execution unit 2 24 or Execution unit 3 26), and the third writing port is coupled to all register units in the register file (Jacobs, [0019], lines 17-19, in a present embodiment the register file comprises 64 registers, and therefore an address signal comprising 6 bits is provided to address each of the registers).

Consider claim 4, the combination thus far discloses the apparatus of claim 1 (see above), but does not entail that the first writing port is further coupled between a fourth execution unit with a fourth execution delay and the register file, and is further configured to receive a fourth execution result from the fourth execution unit, and write the fourth execution result back to a fourth register unit in the register file based on a fourth writing address, and that the fourth execution delay is substantially the same as the first execution delay.
On the other hand, Barowski further discloses a writing port (Figure 1, input into register file 10) is further coupled between an execution unit (Figure 1, execution pipeline 40) with an execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units) and a register file (Figure 1, register file 10), and is further configured to receive an execution result from the execution unit, and to write the execution result back to a register unit in the register file based on a writing address (Figure 1, see output from execution pipeline 40 to register file 10; note that a particular register in a register file is selected based on an address of the register), and the execution delay is substantially the same as another execution delay associated with the writing port ([0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type).
Barowski’s teaching decreases the number of write ports needed, which decreases complexity, latency and power (Barowski, [0007], lines 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Barowski with the previously explained combination of Jacobs, Barowski, and Tran in order to decrease the number of write ports needed, which decreases complexity, latency and power. Alternatively, this modification merely entails a use of known technique (Barowski’s teaching of a write port being shared by multiple execution units) to improve similar devices (methods, or products) (the previously explained combination of Jacobs, Barowski, and Tran) in the same way (after the combination, the previously explained combination of Jacobs, Barowski, and Tran would likewise entail a write port being shared by multiple execution units), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Note that Barowski’s further teaching, when applied to the previously explained combination of Jacobs, Barowski, and Tran that entails a “first” writing port and a “first” execution delay, results in the overall claim limitation.

Consider claim 6, Jacobs discloses a processing apparatus, including: a register file (FIG. 1, register file 30), comprising a plurality of register units ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); a plurality of execution units (Figure 1, execution units 1-4 22-28), configured to execute instructions respectively and output execution results ([0003], line 13, process instructions; [0003], line 14, destination operand) with execution delays (Figure 1, execution units 1-4 22-28; note that the time it takes for the execution unit to execute its function corresponds to the execution delay); a plurality of writing ports (Figure 1, WD1-WD4), wherein each writing port is configured to couple between an execution unit (Figure 1, execution units 1-4 22-28) with a corresponding execution delay (Figure 1, execution units 1-4 22-28; note that the time it takes for the execution unit to execute its function corresponds to the execution delay) and the plurality of register units ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port)), receive an execution result from the execution unit with the corresponding execution delay, and write the execution result back to any one of the plurality of register units ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively; [0019], lines 17-19, in a present embodiment the register file comprises 64 registers, and therefore an address signal comprising 6 bits is provided to address each of the registers) corresponding to a writing address ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2), wherein one writing port of the plurality of writing ports (Figure 1, WD1-WD4) is configured to couple to one execution unit (Figure 1, execution units 1-4 22-28) with an execution delay at the same time (Figure 1, execution units 1-4 22-28; note that the time it takes for the execution unit to execute its function corresponds to the execution delay).
However, Jacobs does not disclose that the aforementioned writing port configuration is according to the execution delays of the plurality of execution units, wherein the one execution unit has a low use frequency and an execution delay different from the execution delays of other execution units; a cache unit, configured to couple with the one execution unit, and receive an execution result from the one execution unit; and a multiplexer, configured to couple between the cache unit and the one writing port, and receive the execution result from the cache unit, and transmit the execution result to the one writing port based on a selection signal, wherein the multiplexer is further configured to couple with another execution unit, and receive an execution result from the another execution unit, and transmit the execution result to the writing port based on the selection signal.
On the other hand, Barowski discloses different execution delays ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units) to support execution of both less complex instructions (e.g., addition, as per [0003], line 13) and more complex instructions (e.g., multiplication or floating-point operations, as per [0003], lines 14-15); reduced complexity, latency, and power via use of fewer write ports ([0007], lines 1-4, additional writeback paths require additional write ports for the register file and additional multiplexer inputs for the operand input latches of the functional units and add complexity, latency and power); and prioritizing writeback ports according to execution delays of a plurality of execution units in order to prevent pipeline stalls from long running instructions ([0004], lines 1-4, if two or more instruction classes also referred to as latency types with different execution lengths or latencies are issued generally long running instruction are given higher priority to prevent pipeline stalls). 
Barowski further discloses a cache unit (Figure 1, auxiliary buffer 60), configured to couple with one execution unit (Figure 1, execution pipeline 40) with an execution delay ([0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type) different from the execution delays of other execution units ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units; [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies), and receive an execution result from the one execution unit (Figure 1, see output from execution pipeline 40 to auxiliary buffer 60); and a multiplexer (Figure 1, mux 12), configured to couple between the cache unit (Figure 1, auxiliary buffer 60) and one writing port (Figure 1, input into register file 10), and receive the execution result from the cache unit (Figure 1, see output from auxiliary buffer 60 to mux 12), and transmit the execution result to the one writing port based on a selection signal (Figure 1, see output from mux 12 to register file 10; note that a mux outputs a selected input based on a selection signal), wherein the multiplexer is further configured to couple with another execution unit and receive an execution result from the another execution unit (Figure 1, see output from execution pipeline 50 to mux 12), and transmit the execution result to the writing port based on the selection signal (Figure 1, see output from mux 12 to register file 10; note that a mux outputs a selected input based on a selection signal).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Barowski with the invention of Jacobs in order to increase processor capability; reduce complexity, latency, and power via use of fewer write ports; and prevent pipeline stalls from long running instructions. Note that the reduction of write ports and prioritization of write ports according to execution delays of a plurality of execution units reflects writing port configuration according to execution delays of a plurality of execution units. Alternatively, this modification merely entails a use of known technique (Barowski’s teaching of execution units with different execution delays to support both less complex and more complex operations; a reduced number of write ports; and prioritizing writeback ports according to execution delays) to improve similar devices (methods, or products) (the invention of Jacobs) in the same way (after the combination, Jacobs’s invention would likewise entail execution units with different execution delays to support both less complex and more complex operations, and a reduced number of write ports, with write ports prioritized according to execution delays), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143.
However, the combination thus far does not entail the one execution unit having a low use frequency.
On the other hand, Tran discloses an execution unit having a low use frequency ([0031], lines 18-19, infrequently used resources, such as a divide unit (DIV) 522 and an IMUL unit 524).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Tran with the combination of Jacobs and Barowski, as this modification merely entails simple substitution of one known element (an undisclosed use frequency of the combination of Jacobs and Barowski) for another (a low use frequency, as explicitly disclosed by Tran) to obtain predictable results (the previously explained combination of Jacobs and Barowski, wherein the one execution unit has a low use frequency), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Also note that Tran discloses infrequently used resources being shared ([0034], lines 10-12) rather than being replicated ([0031], lines 18-20); therefore, it would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for a writing port coupled to an infrequently used execution unit (such that the writing port would, if not shared, be infrequently used as well) to be instead shared with another execution unit in order to preclude replication-related costs (such as increased size and reduced maximum frequency, as per paragraph [0004] of Jacobs). 

Consider claim 7, the overall combination entails the processing apparatus of claim 6 (see above), wherein the execution delays of the plurality of execution units are different from each other (Barowski, [0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units; [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies).

Consider claim 8, the overall combination entails the processing apparatus of claim 6 (see above), wherein at least two of the execution delays of the plurality of execution units are equal to a preset delay value (Barowski, [0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type), and a writing port (Barowski, Figure 1, input into register file 10) of the plurality of writing ports (Jacobs, Figure 1, WD1-WD4) is coupled to at least two execution units with the at least two execution delays among the plurality of execution units (Barowski, [0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type).

Claims 10-12 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Jacobs et al. (Jacobs) (US 20050278510 A1) in view of Barowski et al. (Barowski) (US 20130346729 A1).
Consider claim 10, Jacobs discloses a method for writing back an instruction execution result ([0003], line 13, process instructions; [0003], line 14, destination operand), comprising: receiving, via a first writing port (FIG. 1, WD2), a first execution result ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) from a first execution unit (FIG. 1, Execution unit 2 24) with a first execution delay (FIG. 1, Execution unit 2 24; note that the time it takes for the execution unit to execute its function corresponds to the execution delay), and writing the first execution result back to a first register unit in a register file ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a first writing address ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); receiving, via a second writing port (FIG. 1, WD3), a second execution result ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) from a second execution unit (FIG. 1, Execution unit 3 26) with a second execution delay (FIG. 1, Execution unit 3 26; note that the time it takes for the execution unit to execute its function corresponds to the execution delay), and writing the second execution result back to a second register unit in the register file ([0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a second writing address ([0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port)); wherein the first writing port is not coupled to the second execution unit (Figure 1, which shows WD2 not coupled to Execution unit 3 26), and the second writing port is not coupled to the first execution unit (Figure 1, which shows WD3 not coupled to Execution unit 2 24).
However, Jacobs does not disclose that the second execution delay is different from the first execution delay. Jacobs also does not disclose receiving, via a cache unit, a fifth execution result from a fifth execution unit with a fifth execution delay different from the first execution delay and the second execution delay; and receiving, via a multiplexer, the fifth execution result from the cache unit, and transmitting the fifth execution result to the second writing port based on a selection signal, so as to cause the second writing port writing the fifth execution result back to a fifth register unit in the register file based on a fifth writing address, wherein the second writing port is coupled to the second execution unit via the multiplexer.
On the other hand, Barowski discloses second execution delay different from first execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units), and receiving, via a cache unit (Figure 1, auxiliary buffer 60), an execution result (Figure 1, see output from execution pipeline 40 to auxiliary buffer 60) from an execution unit (Figure 1, execution pipeline 40) with an execution delay ([0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type) different from another execution delay and yet another execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units; [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies; [0038], lines 12-13, execution pipeline 50 for long instructions I0, I1, I2 of the second latency type); and receiving, via a multiplexer (Figure 1, mux 12), the execution result from the cache unit (Figure 1, see output from auxiliary buffer 60 to mux 12), and transmitting the execution result to the writing port based on a selection signal, so as to cause the writing port writing the execution result back to a register unit in the register file based on a writing address (Figure 1, see output from mux 12 to register file 10; note that a mux outputs a selected input based on a selection signal; note that a particular register in a register file is selected based on an address of the register), wherein the writing port (Figure 1, input into register file 10) is coupled to another execution unit associated with the writing port associated with the yet another execution delay (Figure 1, execution pipeline 50) via the multiplexer (Figure 1, mux 12).
Barowski’s teaching increases the capability of a processor by supporting execution of both less complex instructions (e.g., addition, as per [0003], line 13) and more complex instructions (e.g., multiplication or floating-point operations, as per [0003], lines 14-15), and decreases the number of write ports needed, which decreases complexity, latency and power (Barowski, [0007], lines 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Barowski with the invention of Jacobs in order to increase processor capability and decrease the number of write ports needed, which decreases complexity, latency and power. Alternatively, this modification merely entails a use of known technique (Barowski’s teaching of execution units with different execution delays to support both less complex and more complex operations) to improve similar devices (methods, or products) (the invention of Jacobs) in the same way (after the combination, Jacobs’s invention would likewise entail execution units with different execution delays to support both less complex and more complex operations), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Additionally, this modification merely entails a use of known technique (Barowski’s teaching of a write port being shared by multiple execution units via use of a multiplexer and a cache unit) to improve similar devices (methods, or products) (the invention of Jacobs) in the same way (after the combination, the invention of Jacobs would likewise entail a write port being shared by multiple execution units), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Note that Barowski’s teaching, when applied to the invention of Jacobs that entails a “first” execution delay, a “second” execution delay, a “second” writing port, and a “second” execution unit, results in the overall claim limitation.
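For illustration only, the shared-writing-port arrangement discussed above (one execution result held in a cache unit and another execution result supplied directly, with a multiplexer forwarding one of them to a single writing port based on a selection signal) can be modeled with the following minimal Python sketch. The names `AuxBuffer`, `mux`, and `write_back` are hypothetical, and the arbitration policy (the directly coupled unit uses the port whenever it has a result) is an assumption of this sketch rather than a statement of what Jacobs or Barowski discloses; the register count of 64 follows Jacobs, [0019].

```python
from collections import deque


class AuxBuffer:
    """Hypothetical stand-in for a cache unit that holds results awaiting the shared port."""

    def __init__(self):
        self._pending = deque()

    def push(self, addr, value):
        # Store a (writing address, execution result) pair.
        self._pending.append((addr, value))

    def has_result(self):
        return bool(self._pending)

    def pop(self):
        # Oldest buffered result leaves first.
        return self._pending.popleft()


def mux(select_buffer, buffered, direct):
    """Two-input multiplexer: forwards exactly one input, chosen by the selection signal."""
    return buffered if select_buffer else direct


def write_back(register_file, buffer, direct_result):
    """Model one cycle of a writing port shared by a buffered unit and a direct unit.

    Assumed policy (not taken from the references): the directly coupled unit
    uses the port when it has a result; otherwise a buffered result drains.
    """
    select_buffer = direct_result is None and buffer.has_result()
    buffered = buffer.pop() if select_buffer else None
    chosen = mux(select_buffer, buffered, direct_result)
    if chosen is not None:
        addr, value = chosen
        register_file[addr] = value  # the register is selected by the writing address
    return register_file


regs = [0] * 64                  # 64 registers, addressable with 6 bits
buf = AuxBuffer()
buf.push(5, 111)                 # result from the unit coupled via the cache unit
write_back(regs, buf, (9, 222))  # cycle 1: the direct unit uses the shared port
write_back(regs, buf, None)      # cycle 2: the port is free, the buffered result is written
```

In this sketch both results reach the register file through the same port over two cycles, which is the sense in which a single writing port is shared by two execution units.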

Consider claim 11, the overall combination discloses the method of claim 10 (see above), further comprising: receiving, via a third writing port (Jacobs, FIG. 1, WD4), a third execution result (Jacobs, [0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) from a third execution unit (Jacobs, FIG. 1, Execution unit 4 28) with a third execution delay different from the first execution delay and the second execution delay (Barowski, [0004], lines 1-2, two or more instruction classes also referred to as latency types with different execution lengths or latencies), and writing the third execution result back to a third register unit in the register file (Jacobs, [0018], lines 5-7, four write lines 2 to 8 are provided such that each of the execution units 22 to 28 may write data to registers in the register file through write ports W.sub.D1 to W.sub.D4 respectively) based on a third writing address (Jacobs, [0019], lines 15-17, address line 3 is 6 bits wide and provides the address of the register to which the write data is to be written; [0021], lines 1-8, FIG. 2 illustrates the lines to and from the register file 30 for one of the execution units of FIG. 1, however identical lines exist between each of the execution units shown in FIG. 1 and the register file 30. Write address ports W.sub.A2 to W.sub.A4 (one associated with each write data port) and read address ports R.sub.A3 to R.sub.A8 (one associated with each read data port) are also provided in the register file, although these have not been shown in FIG. 2); wherein the third writing port is not coupled to the first execution unit and the second execution unit (Jacobs, Figure 1, which shows WD4 not coupled to Execution unit 2 24 or Execution unit 3 26), and the third writing port is coupled to all register units in the register file (Jacobs, [0019], lines 17-19, in a present embodiment the register file comprises 64 registers, and therefore an address signal comprising 6 bits is provided to address each of the registers).

Consider claim 12, the combination thus far discloses the method of claim 10 (see above), but does not entail receiving, via the first writing port, a fourth execution result from a fourth execution unit with a fourth execution delay, and to write the fourth execution result back to a fourth register unit in the register file based on a fourth writing address, wherein the fourth execution delay is substantially the same as the first execution delay.
On the other hand, Barowski discloses receiving, via a first writing port (Figure 1, input into register file 10), an execution result (Figure 1, see output from execution pipeline 40 to register file 10) from an execution unit (Figure 1, execution pipeline 40) with an execution delay ([0003], lines 4-10, instructions can vary in complexity, thus complex mathematical floating point operations are usually executed in four to forty execution cycles, whereas simple integer type instructions are executed within one cycle. Thus different architectures implement different execution pipelines with varying fixed instruction execution length according to the complexity type of operations; [0003], lines 16-18, typically these two classes of instructions are issued and executed on different execution units), and to write the execution result back to a register unit in the register file based on a writing address (Figure 1, see output from execution pipeline 40 to register file 10; note that a particular register in a register file is selected based on an address of the register), wherein the execution delay is substantially the same as another execution delay associated with the writing port ([0039], lines 3-4, two execution pipelines 30, 40 for short instructions IS, IS1, IS2, IS3 of the first latency type).
Barowski’s teaching decreases the number of write ports needed, which decreases complexity, latency and power (Barowski, [0007], lines 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Barowski with the previously explained combination of Jacobs and Barowski in order to decrease the number of write ports needed, which decreases complexity, latency and power. Alternatively, this modification merely entails a use of known technique (Barowski’s teaching of a write port being shared by multiple execution units) to improve similar devices (methods, or products) (the previously explained combination of Jacobs and Barowski) in the same way (after the combination, the previously explained combination of Jacobs and Barowski would likewise entail a write port being shared by multiple execution units), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Note that Barowski’s further teaching, when applied to the previously explained combination of Jacobs and Barowski that entails a “first” writing port and a “first” execution delay, results in the overall claim limitation.

Consider claim 14, the combination thus far entails at least one processor (Jacobs, [0002], line 1, processor) and the method of claim 10 (see above), but does not entail an electronic device comprising the aforementioned at least one processor; and a memory communicatively coupled with the aforementioned at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to execute the method of claim 10.
On the other hand, Barowski further discloses an electronic device comprising at least one processor; and a memory communicatively coupled with the aforementioned at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to execute a method ([0030], [0035]-[0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Barowski with the previously explained combination of Jacobs and Barowski, as this modification merely entails a combination of prior art elements (the prior art elements of the previously explained combination of Jacobs and Barowski, and the well-known concept of a computer-readable medium comprising instructions that are executable to perform a method) according to known methods (the concept of a computer-readable medium comprising instructions that are executable to perform a method is well known) to yield predictable results (the previously explained combination of Jacobs and Barowski, implemented via a computer-readable medium), which is an exemplary rationale that may support a conclusion of obviousness, as per MPEP 2143.

Consider claim 15, the combination thus far entails a computer to implement the method of claim 10 (see above), but does not entail a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are executed to cause the aforementioned computer to implement the aforementioned method of claim 10.
On the other hand, Barowski further discloses a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are executed to cause a computer to implement a method ([0030], [0035]-[0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Barowski with the previously explained combination of Jacobs and Barowski, as this modification merely entails a combination of prior art elements (the prior art elements of the previously explained combination of Jacobs and Barowski, and the well-known concept of a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are executed to cause a computer to implement a method) according to known methods (the concept of a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are executed to cause a computer to implement a method, is well known) to yield predictable results (the previously explained combination of Jacobs and Barowski, implemented via a computer-readable medium), which is an exemplary rationale that may support a conclusion of obviousness, as per MPEP 2143.

Response to Arguments
Applicant on page 11 argues: “Applicant has generally amended the specification including the Abstract as suggested by the Examiner to address the issues set forth in the Office Action. These amendments do not alter the scope of the claims in any way.”
In view of the aforementioned amendments, most of the previously presented objections to the specification are withdrawn. However, one objection (aside from the title objection, addressed below) appears to remain applicable, since reference character 112 is associated with a “queue” in the amended specification but is associated with a “cache” in FIG. 1A.

Applicant on page 11 argues: ‘Applicant has not amended the Title. Applicant respectfully submits that the current title "APPARATUS AND METHOD FOR WRITING BACK INSTRUCTION EXECUTION RESULT AND PROCESSING APPARATUS" is descriptive of the claims. In other words, the claims are directed to subject matter associated with the Title. In the event the Examiner disagrees, Applicant would greatly appreciate any suggestions the Examiner may provide.’
However, Examiner submits that the general concept of writing back instruction execution results was widespread prior to the effective filing date of the claimed invention. Examiner further submits that almost all processors, if not all processors, write back an instruction execution result in some manner. For the purpose of facilitating indexing, classifying, and searching the application, a new title is required that is clearly indicative of the invention to which the claims are directed. Examiner recommends incorporating into the title the difference(s), as reflected in the claims, between FIGS. 1A and 1B (labeled “Prior Art”) and FIGS. 2-5.

Applicant on page 11 argues: “The Examiner objected to the drawings. Applicant traverses the objection. As noted on page 5 of this paper, Applicant is submitting herewith Replacement Sheets including all of the drawings to address the issues set forth in the Office Action. Applicant requests withdrawal of the objection to the drawings.”
However, one objection to the drawings appears to remain applicable, and a new objection to the drawings is necessitated by the aforementioned drawing amendments; see the Drawings section above.
Regarding Applicant’s note on page 5 of the response that “On all of the REPLACEMENT SHEETS, all drawings are presented in higher resolution”, Examiner notes that differences in quality between the drawings as submitted by Applicant and the drawings present in the file wrapper may stem from the submitted drawings not being black and white drawings, which causes dithering during conversion from greyscale to black and white. Examiner recommends ensuring that any submitted drawings do not contain any grey elements.

Applicant on page 11 argues: “Claims 5, 9, 13, and 14 were objected to due to informalities. Applicant traverses the objection. Applicant has amended the claims to address the alleged informalities and requests withdrawal of the claim objection.”
In view of the aforementioned amendments, the previously presented objections to the claims are withdrawn.

Applicant on page 12 argues: ‘Applicant has amended claim 1 to incorporate the subject matter of claim 5 and the feature of "with a low use frequency."’
Examiner notes that this feature gives rise to an indefiniteness issue; see the Claim Rejections - 35 USC § 112 section above.

Applicant on page 12 argues: ‘Applicant has amended claim 6 to incorporate the subject matter of claim 9 and the feature of "with a low use frequency." Applicant has amended claim 10 to incorporate the subject matter of claim 13 and the feature of "with a low use frequency."’
However, claim 10 does not appear to be amended with the feature of “with a low use frequency.”

Applicant on page 13 argues: ‘Neither Jacobs nor Barowski discloses the frequency of use of the fifth execution unit is low. In summary, the amended independent claim 1 would not have been obvious over Jacobs in view of Barowski. Independent claims 6 and 10 have been similarly amended and are believed to be patentable for similar reasons. The other pending claims are dependent claims which may rely on the nonobviousness of one of the independent claims.’
In view of the aforementioned newly amended subject matter, Examiner is newly relying upon the Tran reference — see the Claim Rejections - 35 USC § 103 section above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEITH E VICARY whose telephone number is (571)270-1314. The examiner can normally be reached Monday to Friday, 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571)270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KEITH E VICARY/
Primary Examiner, Art Unit 2182