Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the amendment filed on 5/24/2022.
Claims 1-35 are pending in the application.  
				Claim Rejections - 35 USC § 101
	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


 	Claims 1-35 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Specifically, claims 1-35 are directed to an abstract idea. 
	Per claim 1, the claim is directed to an idea of itself, namely a mathematical operation that can also be performed in the human mind or by a human using pen and paper. The step of performing two or more portions of a matrix operation based at least in part on one or more dependencies among the two or more portions indicated by the matrix operation is a mere mathematical matrix operation, calculation, or computation. Furthermore, the calculation can be performed mentally, and the performing step is recited at a high level of generality without providing any details of the step. The additional limitations, the processor and the one or more circuits, are described at a high level of generality for applying or performing the abstract idea and do not indicate any integration of the abstract idea into a practical application, as the abstract idea is merely applied on and performed using a generic computer. Automating mental processes by using a generic computer does not automatically make the abstract idea patent eligible. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to the petrochemical and oil-refining industries was insufficient. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process).
Viewing the limitations individually and as a combination, the additional elements merely apply the abstract idea on a generic computer and perform the abstract idea without integrating it into a practical application. For at least these reasons, claim 1 is not patent eligible.
	Per claim 2, the claim is directed to an idea of itself, namely mental processes for a mathematical calculation or operation that can be performed in the human mind or by a human using pen and paper. The detecting, determining, and generating steps can be pure mental processes. In particular, no particular manner of generating the code using the determined manner is recited that could be considered significantly more. Generating executable code is merely converting one code format to another, which can be a mental step because no specific implementation of such generation is recited, the size of the code does not need to be large, and the fetching is only an intended action to be performed if the code is executed. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). For at least these reasons, claim 2 is not patent eligible.
	Per claims 3-6, these claims are directed to the same idea of itself, reciting details of the abstract idea and mathematical operation without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claims 1 and 2, respectively.
	Per claim 7, the claim is directed to an idea of itself, namely a mathematical operation that can be performed in the human mind or by a human using pen and paper. The step of performing two or more portions of a matrix operation is a mathematical matrix operation, calculation, or computation. Furthermore, the calculation can be performed mentally, and the performing step is recited at a high level of generality without providing any details of the step. The additional limitations, the one or more memories and the one or more processors, are described at a high level of generality for applying or performing the abstract idea and do not indicate any integration of the abstract idea into a practical application, as the abstract idea is merely applied on and performed using a generic computer. Automating mental processes by using a generic computer does not automatically make the abstract idea patent eligible. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to the petrochemical and oil-refining industries was insufficient. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). Viewing the limitations individually and as a combination, the additional elements merely apply the abstract idea on a generic computer and perform the abstract idea without integrating it into a practical application. For at least these reasons, claim 7 is not patent eligible.
	Per claim 8, the claim is directed to an idea of itself, namely mental processes for a mathematical calculation or operation that can be performed in the human mind or by a human using pen and paper. The determining and generating steps can be pure mental processes. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). For at least these reasons, claim 8 is not patent eligible.
	Per claims 9-13, these claims are directed to the same idea of itself, reciting details of the abstract idea and mathematical operation without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claims 7 and 8, respectively.
	Per claim 14, the claim is directed to an idea of itself, namely a mathematical operation that can be performed in the human mind or by a human using pen and paper. The step of performing two or more portions of a matrix operation based at least in part on one or more dependencies among the two or more portions indicated by the matrix operation is a mathematical matrix operation, calculation, or computation. Furthermore, the calculation can be performed mentally, and the performing step is recited at a high level of generality without providing any details of the step. The additional limitation, a processor, is described at a high level of generality for applying or performing the abstract idea and does not indicate any integration of the abstract idea into a practical application, as the abstract idea is merely applied on and performed using a generic computer. Automating mental processes by using a generic computer does not automatically make the abstract idea patent eligible. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to the petrochemical and oil-refining industries was insufficient. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). Viewing the limitations individually and as a combination, the additional element merely applies the abstract idea on a generic computer and performs the abstract idea without integrating it into a practical application.
For at least these reasons, claim 14 is not patent eligible.
	Per claim 15, the claim is directed to an idea of itself, namely mental processes for a mathematical calculation or operation that can be performed in the human mind or by a human using pen and paper. The detecting, determining, and generating steps can be pure mental processes. In particular, no particular manner of generating the code using the determined manner is recited that could be considered significantly more. Generating executable code is merely converting one code format to another, which can be a mental step because no specific implementation of such generation is recited, the size of the code does not need to be large, and the fetching is only an intended action to be performed if the code is executed. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). For at least these reasons, claim 15 is not patent eligible.
	Per claims 16-21, these claims are directed to the same idea of itself, reciting details of the abstract idea and mathematical operation without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claims 14 and 15, respectively.
	Per claim 22, the claim is directed to an idea of itself, namely a mathematical operation that can be performed in the human mind or by a human using pen and paper. The step of performing two or more portions of a matrix operation based at least in part on one or more dependencies among the two or more portions is a mathematical matrix operation, calculation, or computation. Furthermore, the calculation can be performed mentally, and the performing step is recited at a high level of generality without providing any details of the step. The additional limitations, a processor and ALUs, are described at a high level of generality for applying or performing the abstract idea and do not indicate any integration of the abstract idea into a practical application, as the abstract idea is merely applied on and performed using a generic computer. Automating mental processes by using a generic computer does not automatically make the abstract idea patent eligible. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to the petrochemical and oil-refining industries was insufficient. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). Viewing the limitations individually and as a combination, the additional elements merely apply the abstract idea on a generic computer and perform the abstract idea without integrating it into a practical application.
For at least these reasons, claim 22 is not patent eligible.
	Per claim 23, the claim is directed to an idea of itself, namely mental processes for a mathematical calculation or operation that can be performed in the human mind or by a human using pen and paper. The detecting, determining, and generating steps can be pure mental processes. In particular, no particular manner of generating the code using the determined manner is recited that could be considered significantly more. Generating executable code is merely converting one code format to another, which can be a mental step because no specific implementation of such generation is recited, the size of the code does not need to be large, and the fetching is only an intended action to be performed if the code is executed. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). For at least these reasons, claim 23 is not patent eligible.
	Per claims 24-27, these claims are directed to the same idea of itself, reciting details of the abstract idea and mathematical operation without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claims 22 and 23, respectively.
	Per claim 28, the claim is directed to an idea of itself, namely a mathematical operation that can be performed in the human mind or by a human using pen and paper. The step of performing two or more portions of a matrix operation based at least in part on one or more dependencies among the two or more portions is a mathematical matrix operation, calculation, or computation. Furthermore, the calculation can be performed mentally, and the performing step is recited at a high level of generality without providing any details of the step. The additional limitations, a processor and ALUs, are described at a high level of generality for applying or performing the abstract idea and do not indicate any integration of the abstract idea into a practical application, as the abstract idea is merely applied on and performed using a generic computer. Automating mental processes by using a generic computer does not automatically make the abstract idea patent eligible. See MPEP 2106.05(f) and 2106.05(h). It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to the petrochemical and oil-refining industries was insufficient. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). Viewing the limitations individually and as a combination, the additional elements merely apply the abstract idea on a generic computer and perform the abstract idea without integrating it into a practical application.
For at least these reasons, claim 28 is not patent eligible.
	Per claim 29, the claim is directed to an idea of itself, namely mental processes for a mathematical calculation or operation that can be performed in the human mind or by a human using pen and paper. The detecting, determining, and generating steps can be pure mental processes. In particular, no particular manner of generating the code using the determined manner is recited that could be considered significantly more. Generating executable code is merely converting one code format to another, which can be a mental step because no specific implementation of such generation is recited, the size of the code does not need to be large, and the fetching is only an intended action to be performed if the code is executed. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components or insignificant extra-solution activities (e.g., processors, devices, program instructions), then it falls within the "Mental Processes" grouping of abstract ideas (2019 PEG Step 2A, Prong 1: Abstract idea grouping? Yes, Mental Process). For at least these reasons, claim 29 is not patent eligible.
	Per claims 30-35, these claims are directed to the same idea of itself, reciting details of the abstract idea and mathematical operation without adding any additional element that amounts to significantly more. Therefore, the claims are rejected for the same reasons as claims 28 and 29, respectively.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

 	Claims 1, 6-8, 11-16, 19, 20, 22, 27, 28, and 35 are rejected under 35 U.S.C. 103 as being unpatentable over Espig (US 20200210516, hereafter Espig) in view of Liu et al. (US 20190347544, hereafter Liu).
Per claim 1:
Espig teaches: A processor, comprising: one or more circuits to perform two or more portions of a matrix operation (Espig, see at least [0150]). The matrix operations accelerator performs a matrix multiply operation using tiles ([0081]; Fig. 6; [0085]; Fig. 16, presenting an inner loop of an algorithm to compute a matrix multiplication … two result tiles … to accumulate the intermediate results … adjusts the pointers for the C tiles; [0184]; [0225]; [0053] Matrix-matrix multiplication (a.k.a., GEMM or General Matrix Multiplication) is a compute-heavy operation on certain processors; [0055] Described herein are mechanisms to support matrix operations in computer hardware such as central processing units (CPUs), graphic processing units (GPUs), and accelerators. The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers; [0056], matrix (tile) multiplication … tile move, etc.; [0081], the matrix operations accelerator 307 is to perform a matrix multiply operation; [0068]; [0079]; [0091], multiplier circuit; [0131], execution circuitry; [0152], a matrix operations circuit … as a part of a processor core or as an external device).
	Even though Espig does not explicitly state that performing a matrix operation is based on one or more dependencies among the two or more portions indicated by the matrix operation, the matrix operation needs to be performed (or at least it would be obvious to perform it) in a specific order that respects data dependencies between the matrix operations in order to ensure correct and efficient execution. Nonetheless, Liu explicitly teaches that performing a matrix operation is based, at least in part, on one or more dependencies among the two or more portions indicated by the matrix operation (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]).
	It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Liu’s dependency-based matrix operations with Espig’s matrix operation in neural networks to modify Espig’s matrix operation system to incorporate dependency information, with a reasonable expectation of success, since Espig’s neural network matrix operation configuration can be performed based at least in part on dependencies among the matrix operation portions, and the references are analogous art because they are from the same field of endeavor related to matrix operations. Combining Liu’s functionality with that of Espig results in a system that enables a dependency analysis for matrix operations. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to ensure correct execution without any dependency conflicts (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]).
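For illustration only, the waiting behavior described in the cited Liu passages ([0036], [0081]) can be sketched as follows. This sketch is not taken from Liu: the names Instr, conflicts, and schedule, the address labels "A", "B", and "C", and the fixed two-cycle latency are hypothetical and serve only to show an instruction being held at the head of a queue until an earlier, uncompleted instruction on which it depends has finished.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    reads: frozenset   # hypothetical address labels read by the instruction
    writes: frozenset  # hypothetical address labels written by the instruction


def conflicts(earlier: Instr, later: Instr) -> bool:
    """True if `later` has a data dependency (RAW, WAW, or WAR) on `earlier`."""
    return bool(
        earlier.writes & later.reads
        or earlier.writes & later.writes
        or earlier.reads & later.writes
    )


def schedule(program, latency=2):
    """Issue instructions in order; the head of the queue waits while it
    conflicts with any instruction that has not yet completed.
    Returns a list of (issue_cycle, name) pairs."""
    queue = deque(program)
    in_flight = []  # (finish_cycle, instr) for uncompleted instructions
    issued = []
    cycle = 0
    while queue:
        # Retire instructions whose latency has elapsed.
        in_flight = [(t, i) for (t, i) in in_flight if t > cycle]
        head = queue[0]
        if not any(conflicts(prev, head) for _, prev in in_flight):
            queue.popleft()
            in_flight.append((cycle + latency, head))
            issued.append((cycle, head.name))
        cycle += 1
    return issued


program = [
    Instr("load_A", frozenset(), frozenset({"A"})),
    Instr("load_B", frozenset(), frozenset({"B"})),
    Instr("mul", frozenset({"A", "B"}), frozenset({"C"})),
]
print(schedule(program))
```

In this sketch the two loads issue on consecutive cycles because they are independent, while the multiply is held until the load of "B" has completed.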

 	6. The processor of claim 1, wherein the matrix operation comprises at least one general matrix-matrix multiplication (GEMM) operation (Espig, see at least [0053] Matrix-matrix multiplication (a.k.a., GEMM or General Matrix Multiplication) is a compute-heavy operation on certain processors. Special hardware for matrix multiplication (e.g., GEMM) is a good option for improving the peak compute (and energy efficiency) of certain applications, such as deep learning;  [0167];  [0176] causes matrix operations circuitry (e.g., a matrix operations accelerator circuit) to switch from a first mode that performs other operations (e.g., GEMM operations) to a second mode to configure the matrix operations circuitry for the performance of a fast Fourier transform operation).

Per claim 7, the claim is the system version of claim 1 and is rejected for the same reasons set forth in connection with the rejection of claim 1 above.

 	8. The system of claim 7, wherein the one or more processors to perform the two or more portions of the matrix operation based, at least in part, on the one or more dependencies among the two or more portions: determine structural information of the matrix operation (Espig, see at least [0055] The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers. Throughout this description, these 2-D data structures are referred to as tiles. Note that a matrix may be smaller than a tile (e.g., use less than all of a tile) or utilize a plurality of tiles (e.g., the matrix is larger than the size of any one tile). Throughout the description, matrix (tile) language is used to indicate operations performed using tiles that impact a matrix; whether or not that matrix is larger than any one tile is not typically relevant) and 
 	determine a manner in which to interleave executable instructions to fetch data for the two or more portions of the data and executable instructions of sub-operations of the matrix operation to perform using at least two or more portions of the data (Espig, see at least [0168]; [0171] A radix-2 FFT form of the Cooley-Tukey algorithm may be used to divide a DFT of size N into two interleaved DFTs of size N/2 with each recursive stage. Furthermore, using this divide and conquer strategy, a 4-point transform can be reduced to two 2-point transforms; [0173]; [0315]; [0247], partial vector operations; [0148] matrix C 1601 includes two tiles, matrix A 1603 includes one tile, and matrix B 1605 includes two tiles. This figure shows an example of the inner loop of an algorithm to compute a matrix multiplication. In this example, two result tiles, tmm0 and tmm1, from matrix C 1601 are used to accumulate the intermediate results. One tile from the matrix A 1603 (tmm2) is re-used twice as it multiplied by two tiles from matrix B 1605. Pointers to load a new A matrix (tile) and two new B matrices (tiles) from the directions indicated by the arrows. An outer loop, not shown, adjusts the pointers for the C tiles; [0172] In general, a matrix operations circuit is of a finite size (e.g., with a fixed number of processing element circuits and/or types of circuitry) … Intermediate values may be recycled back into the matrix operations circuitry as inputs for the next iteration … the FFT decomposition will result in partial products that are merged together, e.g., by the matrix operations circuitry or by software; [0201] select a first proper subset (e.g., the upper half or the lower half) of the entire complex number for the first of the two repeats and a second, different proper subset (e.g., the other of the upper half or the lower half) of the entire complex number for the second of the two repeats; [0206] (e.g., FMA) circuits just pick a first subset (e.g., upper half) of bits of a packed data complex number first followed by (e.g., for the next calculation), a second, different subset (e.g., the lower half) of bits of the packed data complex number).
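For illustration only, the inner loop quoted from Espig [0148] (two result tiles accumulating intermediate results, with one A tile re-used against two B tiles) can be sketched as follows. This sketch is not taken from Espig: the function names, the 2x2 tile size, the nested-list representation, and the sample values are assumptions; only the names tmm0, tmm1, and tmm2 follow the quoted passage.

```python
def tile_fma(acc, a, b):
    """Accumulate acc += a @ b in place for small tiles held as nested lists."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            acc[i][j] += sum(a[i][k] * b[k][j] for k in range(len(b)))


def inner_loop(a_tiles, b_left_tiles, b_right_tiles, size=2):
    """Walk the shared dimension one tile at a time, re-using each A tile twice."""
    tmm0 = [[0] * size for _ in range(size)]  # result tile for the left C block
    tmm1 = [[0] * size for _ in range(size)]  # result tile for the right C block
    for a, b_left, b_right in zip(a_tiles, b_left_tiles, b_right_tiles):
        tmm2 = a                        # one A tile is loaded once ...
        tile_fma(tmm0, tmm2, b_left)    # ... and used against the first B tile
        tile_fma(tmm1, tmm2, b_right)   # ... and again against the second B tile
    return tmm0, tmm1


# Hypothetical sample data: A is 2x4 (two tiles), B is 4x4 (two tiles per column block).
a_tiles = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
b_left_tiles = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
b_right_tiles = [[[5, 6], [7, 8]], [[2, 0], [0, 2]]]

tmm0, tmm1 = inner_loop(a_tiles, b_left_tiles, b_right_tiles)
print(tmm0, tmm1)
```

Each pass through the loop fetches one A tile and two B tiles and performs two tile multiplications, so the fetches and the sub-operations alternate while only three tile-sized working values plus the incoming B tiles are live at once.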
 	11. The system of claim 8, wherein the manner in which to interleave the executable instruction are to fetch the data and the executable instructions of the sub-operations without increasing data storage required of the one or more processors (Espig, see at least [0168] these operations can be cleverly re-arranged to optimize the algorithm down to O(N log(N)), which for large N may greatly reduce the number of operations. The optimized version of the algorithm may be referred to as the fast Fourier transform (FFT); [0171] A radix-2 FFT form of the Cooley-Tukey algorithm may be used to divide a DFT of size N into two interleaved DFTs of size N/2 with each recursive stage. Furthermore, using this divide and conquer strategy, a 4-point transform can be reduced to two 2-point transforms; [0173]; [0315]).
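For illustration only, the radix-2 decomposition described in the cited Espig passages ([0168], [0171]), in which a DFT of size N is divided into two interleaved DFTs of size N/2 at each recursive stage, can be sketched as follows. This is a textbook recursive Cooley-Tukey sketch and is not Espig's hardware implementation; the function names fft_radix2 and dft_direct are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def dft_direct(x):
    """Direct O(N^2) discrete Fourier transform, used here only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT; assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])  # DFT of the even-indexed (interleaved) samples
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed (interleaved) samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


samples = [1, 2, 3, 4]
spectrum = fft_radix2(samples)
print(spectrum)
```

For the 4-point input, the recursion reduces the transform to two 2-point transforms, consistent with the quoted statement in [0171].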
 	12. The system of claim 7, wherein data of the matrix operation comprises one or more complex numbers (Espig, see at least [0170], FFTs can be of varied dimensions (e.g., 1D, 2D, 3D, etc.), of different datatypes (complex to complex, real to complex, etc.), and can be used at multiple granularities (e.g., many small FFTs or one large FFT); [0194] a single complex number and a second element that is the other of the real element and the imaginary element of the single complex number. A complex number may also be expressed in the form of a+bi, where a is the real element (number), b is the imaginary element (number), and i is the square root of negative one).
Per claim 14, the claim is the method version of claim 1 and is rejected for the same reasons set forth in connection with the rejection of claim 1 above.

 	15. The method of claim 14, further comprising: detecting structural information of the matrix operation; (Espig, see at least [0055] The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers. Throughout this description, these 2-D data structures are referred to as tiles. Note that a matrix may be smaller than a tile (e.g., use less than all of a tile) or utilize a plurality of tiles (e.g., the matrix is larger than the size of any one tile). Throughout the description, matrix (tile) language is used to indicate operations performed using tiles that impact a matrix; whether or not that matrix is larger than any one tile is not typically relevant) and 
  	determining, based at least in part on the structural information of the matrix operation, a manner in which to fetch data for the two or more portions before one or more sub-operations of the matrix operation; and generating executable code according to the determined manner (Espig, see at least [0168]; [0171] A radix-2 FFT form of the Cooley-Tukey algorithm may be used to divide a DFT of size N into two interleaved DFTs of size N/2 with each recursive stage. Furthermore, using this divide and conquer strategy, a 4-point transform can be reduced to two 2-point transforms; [0173]; [0315]; [0247], partial vector operations; [0148] matrix C 1601 includes two tiles, matrix A 1603 includes one tile, and matrix B 1605 includes two tiles. This figure shows an example of the inner loop of an algorithm to compute a matrix multiplication. In this example, two result tiles, tmm0 and tmm1, from matrix C 1601 are used to accumulate the intermediate results. One tile from the matrix A 1603 (tmm2) is re-used twice as it multiplied by two tiles from matrix B 1605. Pointers to load a new A matrix (tile) and two new B matrices (tiles) from the directions indicated by the arrows. An outer loop, not shown, adjusts the pointers for the C tiles; [0172] In general, a matrix operations circuit is of a finite size (e.g., with a fixed number of processing element circuits and/or types of circuitry) … Intermediate values may be recycled back into the matrix operations circuitry as inputs for the next iteration … the FFT decomposition will result in partial products that are merged together, e.g., by the matrix operations circuitry or by software; [0201] select a first proper subset (e.g., the upper half or the lower half) of the entire complex number for the first of the two repeats and a second, different proper subset (e.g., the other of the upper half or the lower half) of the entire complex number for the second of the two repeats; [0206] (e.g., FMA) circuits just pick a first subset (e.g., upper half) of bits of a packed data complex number first followed by (e.g., for the next calculation), a second, different subset (e.g., the lower half) of bits of the packed data complex number).
 	16. The method of claim 15, wherein generating executable code according to the manner comprises generating a set of dependencies which interleave the data fetches with the one or more sub-operations so as to limit how many registers are simultaneously in use to perform the matrix operation (Espig, see at least [0168]; [0171] A radix-2 FFT form of the Cooley-Tukey algorithm may be used to divide a DFT of size N into two interleaved DFTs of size N/2 with each recursive stage. Furthermore, using this divide and conquer strategy, a 4-point transform can be reduced to two 2-point transforms; [0094] an iteration of a chained fused multiply accumulate instruction … the chained fused multiply accumulate is operating on signed sources wherein the accumulator is 2× the input data size; [0322] while others may be capable of executing only a subset of that instruction set or a different instruction set; [0222]; [0247], While embodiments of the disclosure are described in which the write mask field's 2970 content selects one of a number of write mask registers that contains the write mask to be used (and thus the write mask field's 2970 content indirectly identifies that masking to be performed); [0112] parallel execution is done using lanes that are the size of third signed source 1015 (initial or previous result). The result of the addition of the results of the multiplications is added to the data from most significant packed data element position of third signed source 1015 (initial or previous result) using adder/saturation 1013 circuitry; [0128]).
 	19. The method of claim 15, wherein the one or more sub-operations include one or more multiply add operations (Espig, see at least [0173], fused multiply accumulate (FMA); [0094] FIG. 8 illustrates an embodiment of a subset of the execution of an iteration of a chained fused multiply accumulate instruction. … the chained fused multiply accumulate is operating on signed sources wherein the accumulator is 2× the input data size; [0056] matrix (tile) multiplication, tile add, tile subtract, tile diagonal, tile zero, tile transform, tile dot product, tile broadcast, tile row broadcast, tile column broadcast, tile multiplication, tile multiplication and accumulation, tile move, etc.; [0222]).  
 	20. The method of claim 19, wherein the one or more multiply add operations include at least one fused multiply add (FMAs) according to AVX2 extension to x86 instruction set architecture (Espig, see at least [0173] (e.g., including fused multiply accumulate (FMA) circuits and/or arithmetic logic unit (ALU) circuit); [0127] The core may support one or more instructions sets (e.g., the x86 instruction set (with some extensions that have been added with newer versions)…AVX1, AVX2), thereby allowing the operations used by many multimedia applications to be performed using packed data; [0135]; [0088]).  
 	21. The method of claim 14, wherein the matrix operation comprises computing a gradient with respect to data or weights (Espig, see at least [0195] The weights (e.g., twiddle factors) may also be complex numbers (e.g., with the same format and same real/imaginary order of elements as in the input values). In one embodiment, the order is a real element that is then followed (e.g., in series) by an imaginary element. In certain embodiments, the weights (e.g., twiddle factors) are alternated; [0196]).
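For illustration only, the radix-2 decomposition quoted from Espig [0171] above (a DFT of size N divided into two interleaved DFTs of size N/2 at each recursive stage) can be sketched as follows. This is a minimal sketch and is not taken from Espig or from the application; the function names fft_radix2 and dft_naive are hypothetical, and the sketch assumes the input length is a power of two.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch only).

    Assumes len(x) is a power of two; the name is hypothetical.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into two interleaved half-size DFTs (even and odd samples).
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor that merges the two partial results.
        w = cmath.exp(-2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def dft_naive(x):
    """Direct O(N^2) DFT, used only to cross-check the sketch above."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

With a 4-point input, the recursion reduces the transform to two 2-point transforms, which matches the divide and conquer description in the quoted passage.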
 	Per claim 22:
Espig teaches: A processor, comprising: one or more arithmetic logic units (ALUs) to train a neural network using at least one or more circuits to perform two or more portions of a matrix operation (Espig, see at least [0150], The matrix operations accelerator performs a matrix multiply operation using tiles; [0081]; Fig. 6; [0085]; fig. 16 presenting an inner loop of an algorithm to compute a matrix multiplication …two result tiles…to accumulate the intermediate results …adjusts the pointers for the C tiles; [0184]; [0225]; [0053] Matrix-matrix multiplication (a.k.a., GEMM or General Matrix Multiplication) is a compute-heavy operation on certain processors; [0055] Described herein are mechanisms to support matrix operations in computer hardware such as central processing units (CPUs), graphic processing units (GPUs), and accelerators. The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers; [0056], matrix (tile) multiplication…tile move, etc.; [0081], the matrix operations accelerator 307 is to perform a matrix multiply operation; [0068]; [0079]; [0091], multiplier circuit; [0131] execution circuitry; [0152], a matrix operations circuit…as a part of a processor core or as an external device; [0173] (e.g., including fused multiply accumulate (FMA) circuits and/or arithmetic logic unit (ALU) circuit); [0198]; [0203], ALU; [0052]; [0081]).
	Even though Espig does not explicitly state that performing matrix operations is based on one or more dependencies among the two or more portions indicated by the matrix operation, the matrix operation needs to be performed (or at least it would be obvious to perform it) in a specific order for data dependencies between the matrix operation portions to enforce correct and efficient execution. Nonetheless, Liu explicitly teaches that performing matrix operations is based, at least in part, on one or more dependencies among the two or more portions indicated by the matrix operation (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]). It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Liu’s dependency based matrix operations with Espig’s matrix operation in neural networks to modify Espig’s matrix operation system to incorporate dependency information, with a reasonable expectation of success, since Espig’s neural network matrix operations can be performed based at least in part on dependencies among the matrix operations and they are analogous art because they are from the same field of endeavor related to matrix operations. Combining Liu’s functionality with that of Espig results in a system that enables a dependency analysis for matrix operations. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to ensure correct execution without any dependency conflicts (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]).
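For illustration only, the dependency handling quoted from Liu [0036] and [0081] (an instruction waits in the instruction queue until its data dependency on uncompleted previous instructions no longer exists) can be sketched as follows. This is a simplified model under stated assumptions and is not Liu's implementation: the function name simulate_issue, the (name, reads, writes) tuple layout, the in-order issue policy, and the fixed latency are all hypothetical.

```python
from collections import deque


def simulate_issue(instructions, latency=2):
    """Return the cycle at which each instruction issues (illustrative sketch).

    instructions: list of (name, reads, writes) tuples in program order.
    Assumption: every instruction completes `latency` cycles after issue,
    and the head of the queue waits while any operand it reads or writes
    is still being produced by an uncompleted earlier instruction.
    """
    queue = deque(instructions)
    in_flight = []  # entries are [remaining_cycles, set_of_written_operands]
    issue_cycle = {}
    cycle = 0
    while queue or in_flight:
        # Advance time for instructions that are still executing.
        for entry in in_flight:
            entry[0] -= 1
        in_flight = [e for e in in_flight if e[0] > 0]
        if queue:
            name, reads, writes = queue[0]
            busy = set().union(*(e[1] for e in in_flight))
            # Issue only when no data dependency on in-flight work remains.
            if not busy & (set(reads) | set(writes)):
                queue.popleft()
                in_flight.append([latency, set(writes)])
                issue_cycle[name] = cycle
        cycle += 1
    return issue_cycle
```

In this model, a multiply that reads operands A and B stays in the queue until both loads have completed, while two independent loads issue on consecutive cycles.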

Per claim 27, it is the processor version of claim 6 and is rejected for the same reasons set forth in connection with the rejection of claim 6 above. 
 	28. A processor, comprising: one or more arithmetic logic units (ALUs) to use a neural network to inference, the neural network trained using at least one or more circuits to perform two or more portions of matrix operation (Espig, see at least [0150], The matrix operations accelerator performs a matrix multiply operation using tiles; [0081]; Fig. 6; [0085]; fig. 16 presenting an inner loop of an algorithm to compute a matrix multiplication …two result tiles…to accumulate the intermediate results …adjusts the pointers for the C tiles; [0184]; [0225]; [0053] Matrix-matrix multiplication (a.k.a., GEMM or General Matrix Multiplication) is a compute-heavy operation on certain processors; [0055] Described herein are mechanisms to support matrix operations in computer hardware such as central processing units (CPUs), graphic processing units (GPUs), and accelerators. The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers; [0056], matrix (tile) multiplication…tile move, etc.; [0081], the matrix operations accelerator 307 is to perform a matrix multiply operation; [0068]; [0079]; [0091], multiplier circuit; [0131] execution circuitry; [0152], a matrix operations circuit…as a part of a processor core or as an external device; [0173] (e.g., including fused multiply accumulate (FMA) circuits and/or arithmetic logic unit (ALU) circuit); [0198]; [0203], ALU; [0052]; [0081]).
	Even though Espig does not explicitly state that performing matrix operations is based on one or more dependencies among the two or more portions indicated by the matrix operation, the matrix operation needs to be performed (or at least it would be obvious to perform it) in a specific order for data dependencies between the matrix operation portions to enforce correct and efficient execution. Nonetheless, Liu explicitly teaches that performing matrix operations is based, at least in part, on one or more dependencies among the two or more portions indicated by the matrix operation (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]). It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Liu’s dependency based matrix operations with Espig’s matrix operation in neural networks to modify Espig’s matrix operation system to incorporate dependency information, with a reasonable expectation of success, since Espig’s neural network matrix operations can be performed based at least in part on dependencies among the matrix operations and they are analogous art because they are from the same field of endeavor related to matrix operations. Combining Liu’s functionality with that of Espig results in a system that enables a dependency analysis for matrix operations. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to ensure correct execution without any dependency conflicts (Liu, see at least [0036] analyzing, by the dependency processing unit, whether the decoded computational instructions have a dependency in term of data with previous instructions that have not been completed; if a dependency exists, the decoded computational instructions and the corresponding address information of the computation matrix may need to wait in an instruction queue memory until their dependency in term of data with the previous instructions that have not been completed no longer exists; [0079] the above-mentioned matrix computation control unit may further include a dependency processing unit 3-24 for determining whether the decoded matrix computational instructions and the address information of the computation matrix are conflict with a previous computation, if a conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be temporarily stored; if no conflict exists, the decoded matrix computational instructions and the address information of the computation matrix may be sent to the matrix determination unit; [0104]; [0060] a computation unit 3-3 for subjecting a computation matrix to a partitioning computation, a transpose computation, and a merging computation according to the partitioning information, to obtain a transposed matrix of the computation matrix; [0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]).
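For illustration only, the tiled inner loop that the cited Espig passages describe for fig. 16 (two result tiles of matrix C accumulate intermediate results, with the pointers adjusted between iterations) can be sketched as follows. This is a minimal sketch and not Espig's circuitry or instruction sequence: the function names tile_multiply_accumulate and inner_loop are hypothetical, tiles are modeled as plain lists of rows, and the reuse of a single A tile against two B tiles per iteration is an assumption about how the two result tiles are fed.

```python
def tile_multiply_accumulate(c, a, b):
    """Accumulate c += a @ b in place for small tiles stored as lists of rows."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            for k in range(len(b)):
                c[i][j] += a[i][k] * b[k][j]


def inner_loop(a_tiles, b_left_tiles, b_right_tiles, tile_size):
    """Illustrative inner loop with two accumulator tiles (names hypothetical).

    Each iteration loads one A tile and two B tiles. The same A tile is
    used for both products, and each result tile depends on the value it
    held after the previous iteration.
    """
    c0 = [[0] * tile_size for _ in range(tile_size)]
    c1 = [[0] * tile_size for _ in range(tile_size)]
    for a, b_left, b_right in zip(a_tiles, b_left_tiles, b_right_tiles):
        tile_multiply_accumulate(c0, a, b_left)   # first use of the A tile
        tile_multiply_accumulate(c1, a, b_right)  # same A tile reused
    return c0, c1
```

In this sketch, the only ordering constraint inside an iteration is that both products read the A tile after it is loaded; across iterations, each accumulator tile must see the previous partial sum before the next one is added.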
Per claim 35, it is the processor version of claim 27 and is rejected for the same reasons set forth in connection with the rejection of claim 27 above. 

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Espig in view of Liu and Redfern et al.  (US 20180253402, hereafter Redfern).
Per claim 13:
Espig does not explicitly teach that the matrix operation comprises at least one convolution operation. Redfern teaches that such a convolution operation is well-known in the art (see at least [0019], convolution as used in convolution neural networks; [0071]; [0072]). It would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Espig’s matrix operation in neural networks and Liu’s dependency based matrix operations with Redfern’s convolution operation to modify Espig’s matrix operation system to incorporate a convolutional layer, with a reasonable expectation of success, since Espig’s neural network can include a convolutional layer for deep learning and they are analogous art because they are from the same field of endeavor related to matrix operations. Combining Redfern’s functionality with that of Espig and Liu results in a system that enables a convolutional operation in the neural network. The modification would be obvious because one having ordinary skill in the art would be motivated to make this combination to provide the ability to automatically learn a large number of filters applied to inputs in parallel to detect specific features, as is known in the industry (see at least [0019], convolution as used in convolution neural networks; [0071]; [0072]).
   				Allowable Subject Matter
Claims 2-5, 9, 10, 17, 18, 23-26, and 29-34 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the 35 U.S.C. 101 rejection is overcome. 
The following is an examiner’s statement of reasons for allowance over the prior art: While Espig teaches fast Fourier transform configuration and computation instructions with a matrix operations circuit including a two-dimensional grid, and Redfern teaches a matrix multiplication accelerator in convolutional neural networks executing the configured fundamental computational primitive, the prior art of record, taken alone or in combination, does not teach the subject matter recited in the claims.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Examiner’s Note
 	The Examiner has pointed out particular references contained in the prior art of record within the body of this action for the convenience of the Applicant.  Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply.  Applicant, in preparing the response, should consider fully the entire reference as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Response to Arguments
Applicant's arguments filed on 5/24/2022 have been fully considered but they are not persuasive. 
 	The applicant states that claims 1-35 cannot be practically performed in the human mind or with pen and paper; performing portions of a single matrix operation based on dependencies among the portions, in which the dependencies are indicated by the single matrix operation, is specific to a computer environment and is not related to a human mind, performing something in the human mind, or using pen and paper. Further, Applicant submits that claim 1 recites a system to perform concrete steps that result in a specific technical effect rooted in computer technology because performing portions of a single matrix operation based on dependencies among the portions sets "an upper bound as to how far ahead to prefetch a data element before it is to be used and a lower bound is determined based on how long it takes to prefetch said data element to ensure that said data element is loaded and available for use when it is to be used for a computation, thereby avoiding a memory stall ... dependencies limit live ranges of registers which makes register-allocation and scoreboarding cleaner." Specification at [0069]. Applicant's claims are directed to a specific improvement in computer technology that "generates optimized machine-readable executable code based on structure information of GEMM or GEMM-like operation ... reduces memory usage, causes one or more processors to execute more efficiently, improves parallelization of computer programs, and combinations thereof ... allow[s] registers to be allocated more efficiently." Specification at [0052].  
Applicant submits that the pending rejection of claims 1-35 is moot at least because pending claims 1-35 recite an improvement to computer technology, as described in greater detail above, in connection with Applicant's remarks relating to Step 2A and/or below in connection with the remarks that claims 1-35 are non-obvious. 
 	In response, the single step of performing two or more portions of a matrix operation based on one or more dependencies does not indicate any specific improvement or any technique for achieving one. Merely claiming what the invention is, without reciting the specific or particular manner of achieving such an improvement, does not amount to the “concrete steps that result in a specific technical effect” that the applicant asserts for claim 1. In fact, there are no such “concrete steps” recited other than performing a matrix operation. Furthermore, it is noted that although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). The step of merely performing a matrix operation can be any calculation or mathematical operation, which can also be considered a mental process performed by a human (i.e., by hand or by merely thinking). A mere matrix multiplication is a step of performing a matrix operation, which is a mathematical calculation. Moreover, there is no recitation of what specific matrix operation is performed and how it is performed based on the claimed dependencies to achieve the specific improvement that the applicant asserts. Also, automating a mental process by using a generic computer does not make the abstract idea automatically patent eligible. It is noted that an improvement to the abstract idea, manipulation of particular data, or a detailed or specific implementation of the abstract idea (e.g., analysis, detection, identification, calculation) does not make an abstract concept any less abstract. Similarly, evaluation of data via a highly detailed implementation or mathematical calculation is still a judicial exception. See MPEP 2106.05(f) and 2106.05(h).  
It is noted that employing generic computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not add significantly more, similar to how limiting the abstract idea in Flook to petrochemical and oil-refining industries was insufficient.  
 	The applicant further states that, contrary to Espig and Liu, claim 1 recites "perform[ing] two or more portions" of a single "matrix operation" based on "one or more dependencies among the two or more portions." 
In response, Espig’s FFT configuration and computation method is not limited to multiple matrix operations. Espig clearly states that at least one matrix (tile) operation is performed using matrices/tiles where a number of rows and columns per tile is set, at least one matrix/tile is stored out to memory, and a context switch occurs ([0150]). The matrix operations accelerator performs a matrix multiply operation using tiles ([0081]; Fig. 6; [0085]; fig. 16 presenting an inner loop of an algorithm to compute a matrix multiplication …two result tiles…to accumulate the intermediate results …adjusts the pointers for the C tiles; [0184]; [0225]; [0053] Matrix-matrix multiplication (a.k.a., GEMM or General Matrix Multiplication) is a compute-heavy operation on certain processors; [0055] Described herein are mechanisms to support matrix operations in computer hardware such as central processing units (CPUs), graphic processing units (GPUs), and accelerators. The matrix operations utilize 2-dimensional (2-D) data structures representing one or more packed regions of memory such as registers; [0056], matrix (tile) multiplication…tile move, etc.; [0081], the matrix operations accelerator 307 is to perform a matrix multiply operation; [0068]; [0079]; [0091], multiplier circuit; [0131] execution circuitry; [0152], a matrix operations circuit…as a part of a processor core or as an external device). Furthermore, the claims merely recite that matrix operations are performed based on one or more dependencies without specifying how such operations are performed in a particular manner in view of dependencies.  
Liu clearly teaches determining data dependencies in matrix computation instructions and performing the matrix operation(s) based on those dependencies ([0036]; [0081], if the present instruction is detected to be dependent on data of the previous instructions, the present instruction may have to wait in the instruction queue memory until the dependency is eliminated; [0111]; [0112]). It is noted that the dependencies are indicated by matrix operations needing data for the computations. Therefore, applicant’s statement above is not persuasive.
  					Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to INSUN KANG whose telephone number is (571)272-3724.  The examiner can normally be reached on M-F 10 am-6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do can be reached on 571-272-3721.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/INSUN KANG/Primary Examiner, Art Unit 2193