DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1, 18, and 20 have been amended.
Claims 1-3, 5-8, 11-15, and 18-21 have been examined.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-8, 11-15, 18, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over US Publication No. 2019/0042242 by Das et al. (hereinafter referred to as “Das”) in view of US Publication No. 2021/0049230 by Fleischer et al. (hereinafter referred to as “Fleischer”) and further in view of US Publication No. 2014/0195783 by Karthikeyan et al. (as cited by Applicant and hereinafter referred to as “Karthikeyan”).
Regarding claim 1, Das discloses:
an apparatus comprising: an instruction decoder to decode program instructions (Das discloses, at ¶ [0045], a processor having a decode circuit.); 
processing circuitry to perform data processing in response to the program instructions decoded by the instruction decoder, the processing circuitry comprising matrix processing circuitry responsive to a single instruction to generate a...result of performing a matrix multiplication of a plurality of first data elements representing a first matrix...by a plurality of second elements representing a second matrix...(Das discloses, at ¶ [0046], the processor has execution circuitry. Das also discloses, at ¶¶ [0049]-[0050], the circuitry multiplies the elements of two vectors, i.e., matrices, in response to a single instruction, e.g., a VNNI instruction, to generate a result.); and 
a plurality of registers to store operands for processing by the processing circuitry (Das discloses, at ¶ [0047], the processor has registers to store operands.); in which: 
in response to a single mixed-element-size matrix multiplication instruction specifying a first operand comprising a plurality of first data elements, each having a first data element size, and a second operand comprising a plurality of second data elements, each having a second data element size smaller than the first data element size, the first operand and the second operand being stored in the registers, the instruction decoder is configured to control the matrix processing circuitry to perform a matrix multiplication operation on the plurality of first data elements from the first operand and the plurality of second data elements from the second operand to generate the result... (Das discloses, at ¶¶ [0049]-[0050], the circuitry performs matrix multiplication on the elements of two vectors, i.e., matrices, that are stored in registers where the elements of the second vector are smaller than those of the first vector, e.g., 4 bits and 8 bits respectively, in response to a single instruction, e.g., a VNNI instruction that specifies the operands, to generate a result.); and
in response to a single same-element-size matrix multiplication instruction specifying a third operand comprising a plurality of third data elements, each having a third data element size, and a fourth operand comprising a plurality of fourth data elements, each having a fourth data element size the same as the third data element size, the third operand and the fourth operand being stored in the registers, the instruction decoder is configured to control the matrix processing circuitry to perform a matrix multiplication operation on the plurality of third data elements from the third operand and the plurality of fourth data elements from the fourth operand to generate the result... (Das discloses, at ¶ [0043], an instruction for multiplying vectors that are both limited to having operands of the same size, e.g., 8 bits, where the operands are stored in registers.); 
wherein the processing circuitry is configured to output each element of the result...of the same-element-size matrix multiplication instruction to a register element of a result register (Das discloses, at ¶ [0074], outputting the results to destination registers.); and 
in response to the mixed-element-size matrix multiplication instruction, the processing circuitry is configured to output two elements of the result...of the mixed-element-size matrix multiplication instruction to be stored separately in each register element of the result register (Das discloses, at ¶ [0074], outputting the results, including k elements to destination registers, where each element is stored separately in register elements.); 
wherein the processing circuitry is configured to use the same result register having the same size for the same-element-size matrix multiplication instruction and the mixed-element-size matrix multiplication instruction (Das discloses, at Figures 4A-4F, storing results in destination registers. This discloses storing the results in the same register for the various instructions disclosed in the figures and corresponding text.).
Das does not explicitly disclose the aforementioned result is a two dimensional result matrix, the aforementioned first and second matrices comprise at least two rows and at least two columns, and the second matrix comprises a greater number of independent data values represented by the plurality of second data elements than a number of independent data values represented by the plurality of first data elements of the first matrix.
However, in the same field of endeavor (e.g., matrix multiplication) Fleischer discloses:
multiplying matrices that each have at least two rows and columns to generate a two dimensional result matrix (Fleischer discloses, at ¶ [0012], multiplying an n by m matrix with an m by p matrix to generate an n by p result matrix, where n, m, and p are all at least two.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instruction for multiplication of vectors, i.e., row matrices, to perform matrix multiplication, as disclosed by Fleischer, because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
Also in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
wherein the second matrix comprises a greater number of independent data values represented by the plurality of second data elements than a number of independent data values represented by the plurality of first data elements of the first matrix (Karthikeyan discloses, at ¶ [0047], a first source with four elements and a second source with eight elements.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
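For illustration only, the following minimal sketch shows what the combined limitations of claim 1 describe arithmetically: two operand registers of the same width, one holding 8-bit elements and the other holding 4-bit elements, so that the second matrix carries more independent values than the first. It is not taken from Das, Fleischer, Karthikeyan, or the application. The 64-bit register width, the 2x4 and 4x4 shapes, the lowest-bits-first packing order, and the helper names `pack`, `unpack`, `reshape`, and `matmul` are all assumptions chosen for the example.

```python
def pack(values, width):
    """Pack signed integers into one integer 'register', lowest bits first (assumed layout)."""
    mask = (1 << width) - 1
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & mask) << (i * width)
    return reg


def unpack(reg, width, count):
    """Split a 'register' into `count` signed elements of `width` bits each."""
    mask = (1 << width) - 1
    out = []
    for i in range(count):
        v = (reg >> (i * width)) & mask
        if v >= 1 << (width - 1):  # sign-extend
            v -= 1 << width
        out.append(v)
    return out


def reshape(flat, rows, cols):
    """View a flat element list as a row-major rows x cols matrix."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def matmul(a, b):
    """Plain n x m by m x p matrix multiplication."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


# Hypothetical 64-bit operand registers.
a_vals = [1, -2, 3, 4, 5, 6, -7, 8]                                # eight 8-bit values
b_vals = [1, 2, 3, 4, -1, -2, -3, -4, 0, 1, 0, 1, 7, -8, 7, -8]    # sixteen 4-bit values
a_reg = pack(a_vals, 8)
b_reg = pack(b_vals, 4)

A = reshape(unpack(a_reg, 8, 8), 2, 4)    # first matrix: 2 x 4 of 8-bit elements
B = reshape(unpack(b_reg, 4, 16), 4, 4)   # second matrix: 4 x 4 of 4-bit elements
C = matmul(A, B)                          # two-dimensional 2 x 4 result matrix
```

In this sketch both operands occupy 64 bits, yet the second operand represents sixteen independent values against eight for the first, which is the relationship the Karthikeyan citation is relied upon to show.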

Regarding claim 2, Das, as modified, discloses the elements of claim 1, as discussed above. Das also discloses:
in which the plurality of second data elements are packed in a contiguous portion of one or more second operand registers (Das discloses, at ¶ [0049], the elements in the second source are packed contiguously, i.e., eight 4-bit elements in a 32 bit register.).

Regarding claim 3, Das, as modified, discloses the elements of claim 2, as discussed above. Das also discloses:
in which the plurality of first data elements are packed into a contiguous portion of one or more first operand registers (Das discloses, at ¶ [0049], the elements in the first source are packed contiguously, i.e., eight 8-bit elements in a 64 bit register.).
Das does not explicitly disclose the one or more first operand registers and the one or more second operand registers have the same register size.
However, in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
the first operand and the second operand have the same size (Karthikeyan discloses, at ¶ [0066], both sources have the same size.). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Regarding claim 5, Das, as modified, discloses the elements of claim 1, as discussed above. Das also discloses:
the matrix multiplication operation comprises a plurality of multiplications, each multiplication multiplying one of the first data elements with one of the second data elements (Das discloses, at ¶ [0050], the instruction multiplies each element of the first source with corresponding elements of the second source.).
Das does not explicitly disclose the plurality of multiplications corresponding to different combinations of first and second data elements.
However, in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
the plurality of multiplications corresponding to different combinations of first and second data elements (Karthikeyan discloses, at ¶ [0068], multiplying the first source by a first subset of the second source and by a second subset of the second source.). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Regarding claim 6, Das, as modified, discloses the elements of claim 5, as discussed above. Das does not explicitly disclose in which at least two of the plurality of multiplications multiply different second data elements with the same first data element. 
However, in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
in which at least two of the plurality of multiplications multiply different second data elements with the same first data element (Karthikeyan discloses, at ¶ [0068], multiplying the first source by a first subset of the second source and by a second subset of the second source.). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Regarding claim 7, Das, as modified, discloses the elements of claim 5, as discussed above. Das also discloses:
the matrix multiplication operation comprises at least one addition based on one or more products generated in the plurality of multiplications (Das discloses, at ¶ [0050], the instruction accumulates (adds) the products.).

Regarding claim 8, Das, as modified, discloses the elements of claim 5, as discussed above. Das also discloses:
the matrix multiplication operation comprises performing one or more accumulation operations, each accumulation operation comprising adding one or more products generated in the plurality of multiplications to an accumulator value (Das discloses, at ¶ [0050], the instruction accumulates (adds) the products.).

Regarding claim 11, Das, as modified, discloses the elements of claim 1, as discussed above. Das also discloses:
the result data elements have a larger data element size than the first data elements (Das discloses, at ¶ [0074], the instruction multiplies k vectors of 8 bit inputs times k vectors of 2 bit weights to produce an array of n 32 bit outputs.).

Regarding claim 12, Das, as modified, discloses the elements of claim 11, as discussed above. Das does not explicitly disclose the first data elements have data element size N and the result data elements have data element size 2N.
However, in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
the first data elements have data element size N and the result data elements have data element size 2N (Karthikeyan discloses, at ¶ [0068], each of the result data elements includes twice as many bits as each of the data elements of the first source.). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Regarding claim 13, Das, as modified, discloses the elements of claim 1, as discussed above. Das also discloses:
the first data elements have data element size N, and the second data elements have data element size N/Z, where Z is a power of 2 (Das discloses, at ¶ [0049], elements of the first source are 8 bits and elements of the second source are 4 bits.).

Regarding claim 14, Das, as modified, discloses the elements of claim 13, as discussed above. Das also discloses:
N = 8 (Das discloses, at ¶ [0049], elements of the first source are 8 bits and elements of the second source are 4 bits.).

Regarding claim 15, Das, as modified, discloses the elements of claim 13, as discussed above. Das also discloses:
Z = 2 (Das discloses, at ¶ [0049], elements of the first source are 8 bits and elements of the second source are 4 bits.).

Regarding claim 18, Das discloses:
a data processing method comprising: decoding program instructions; performing data processing using processing circuitry in response to decoded program instructions (Das discloses, at ¶¶ [0045]-[0047], a processor having a decode circuit and an execution circuit.); 
generating, in response to a single instruction, a... result of performing a matrix multiplication of a plurality of first data elements representing a first matrix...by a plurality of second elements representing a second matrix... (Das discloses, at ¶¶ [0049]-[0050], the circuitry multiplies the elements of two vectors, i.e., matrices, in response to a single instruction, e.g., a VNNI instruction, to generate a result.); and 
in response to a single mixed-element-size matrix multiplication instruction specifying a first operand comprising the plurality of first data elements, each having a first data element size, and a second operand comprising the plurality of second data elements, each having a second data element size smaller than the first data element size, controlling the processing circuitry to perform a matrix multiplication operation on the plurality first data elements from the first operand and the plurality of second data elements from the second operand to generate the result... (Das discloses, at ¶¶ [0049]-[0050], the circuitry performs matrix multiplication on the elements of two vectors, i.e., matrices, that are stored in registers where the elements of the second vector are smaller than those of the first vector, e.g., 4 bits and 8 bits respectively, in response to a single instruction, e.g., a VNNI instruction that specifies the operands, to generate a result.);
in response to a single same-element-size matrix multiplication instruction specifying a third operand comprising a plurality of third data elements, each having a third data element size, and a fourth operand comprising a plurality of fourth data elements, each having a fourth data element size the same as the third data element size, the third operand and the fourth operand being stored in the registers, controlling the processing circuitry to perform a matrix multiplication operation on the plurality of third data elements from the third operand and the plurality of fourth data elements from the fourth operand to generate the result... (Das discloses, at ¶ [0043], an instruction for multiplying vectors that are both limited to having operands of the same size, e.g., 8 bits, where the operands are stored in registers.); 
wherein each element of the result...of the same-element-size matrix multiplication instruction is output to a register element of a result register (Das discloses, at ¶ [0074], outputting the results to destination registers.); and 
two elements of the result...of the mixed-element-size matrix multiplication instruction are output to be stored separately in each register element of the result register (Das discloses, at ¶ [0074], outputting the results, including k elements to destination registers, where each element is stored separately in register elements.);  
wherein the same result register having the same size is used for the same-element-size matrix multiplication instruction and the mixed-element-size matrix multiplication instruction (Das discloses, at Figures 4A-4F, storing results in destination registers. This discloses storing the results in the same register for the various instructions disclosed in the figures and corresponding text.).
Das does not explicitly disclose the aforementioned result is a two dimensional result matrix, the aforementioned first and second matrices comprise at least two rows and at least two columns, and the second matrix comprises a greater number of independent data values represented by the plurality of second data elements than a number of independent data values represented by the plurality of first data elements of the first matrix.
However, in the same field of endeavor (e.g., matrix multiplication) Fleischer discloses:
multiplying matrices that each have at least two rows and columns to generate a two dimensional result matrix (Fleischer discloses, at ¶ [0012], multiplying an n by m matrix with an m by p matrix to generate an n by p result matrix, where n, m, and p are all at least two.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instruction for multiplication of vectors, i.e., row matrices, to perform matrix multiplication, as disclosed by Fleischer, because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
Also in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
wherein the second matrix comprises a greater number of independent data values represented by the plurality of second data elements than a number of independent data values represented by the plurality of first data elements of the first matrix (Karthikeyan discloses, at ¶ [0047], a first source with four elements and a second source with eight elements.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Regarding claim 20, Das discloses:
an apparatus comprising: an instruction decoder to decode program instructions; processing circuitry to perform data processing in response to the program instructions decoded by the instruction decoder; and a plurality of registers to store operands for processing by the processing circuitry; in which: (Das discloses, at ¶¶ [0045]-[0047], a processor having a decode circuit, an execution circuit, and registers.);
in response to a mixed-element-size...instruction specifying a first operand comprising a plurality of first data elements representing a first number of independent data values, each first data element having a first data element size, and a second operand comprising a plurality of second data elements representing a second number of independent data values...and each second data element having a second data element size smaller than the first data element size, wherein the first operand and the second operand being stored in the registers, the instruction decoder is configured to control the processing circuitry to perform an...operation on a first vector formed of a plurality of first data elements of the first operand and a second vector formed of a plurality of second data elements of the second operand to generate a...[result] comprising a plurality of result data elements (Das discloses, at ¶¶ [0049]-[0050], the circuitry performs an operation on the elements of two vectors that are stored in registers where the elements of the second vector are smaller than those of the first vector, e.g., 4 bits and 8 bits respectively, in response to a single instruction, e.g., a VNNI instruction that specifies the operands, to generate a result.); and
in response to a same-element-size outer product instruction specifying a third operand comprising a plurality of third data elements representing a third number of independent data values, each third data element having a third data element size, and a fourth operand comprising a plurality of fourth data elements representing a fourth number of independent data values, the fourth number being the same as the third number and each fourth data element having a fourth data element size the same as the third data element size, wherein the third operand and the fourth operand being stored in the registers, the instruction decoder is configured to control the processing circuitry to perform an outer product operation on a third vector formed of a plurality of third data elements of the third operand and a fourth vector formed of a plurality of fourth data elements of the fourth operand to generate a two-dimensional result...comprising a plurality of result data elements (Das discloses, at ¶ [0043], an instruction for multiplying vectors that are both limited to having operands of the same size, e.g., 8 bits, where the operands are stored in registers.), 
wherein the processing circuitry is configured to output each element of the two-dimensional result...of the same-element-size outer product instruction to a register element of a result register (Das discloses, at ¶ [0074], outputting the results to destination registers.); and 
in response to the mixed-element-size outer product instruction, the processing circuitry is configured to output two elements of the two-dimensional result...of the mixed-element-size outer product instruction to be stored separately in each register element of the result register (Das discloses, at ¶ [0074], outputting the results, including k elements to destination registers, where each element is stored separately in register elements.);
wherein the processing circuitry is configured to use the same result register having the same size for the same-element-size outer product instruction and the mixed-element-size outer product instruction (Das discloses, at Figures 4A-4F, storing results in destination registers. This discloses storing the results in the same register for the various instructions disclosed in the figures and corresponding text.).
Das does not explicitly disclose the aforementioned instruction is an outer product instruction, the aforementioned second number being greater than the aforementioned first number, the aforementioned result is a two dimensional result matrix, and wherein at least two of the plurality of result data elements are based on combinations of the same set of first data elements with different sets of second data elements, and at least two of the plurality of result data elements are based on combinations of different sets of first data elements with the same set of second data elements.
However, in the same field of endeavor (e.g., matrix multiplication) Fleischer discloses:
calculating outer products to produce two dimensional result matrices (Fleischer discloses, at ¶ [0012], performing an outer product operation, which generates a matrix.); and
at least two of the plurality of result data elements are based on combinations of the same set of first data elements with different sets of second data elements, and at least two of the plurality of result data elements are based on combinations of different sets of first data elements with the same set of second data elements (Fleischer discloses, at ¶ [0012], performing an outer product operation, which generates a matrix. By definition, this involves multiplying each element of a first column with each element of a second row. For example, a first element of A is multiplied with each element of B, which discloses the same set of first data elements with different sets of second data elements. Similarly, the second element of A is multiplied with each element of B, which discloses different sets of first data elements with the same set of second data elements.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instruction to compute outer products, as disclosed by Fleischer, because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
Also in the same field of endeavor (e.g., packed data arithmetic) Karthikeyan discloses:
wherein the second operand comprises a greater number of independent data values than the first (Karthikeyan discloses, at ¶ [0047], a first source with four elements and a second source with eight elements.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instructions to include instruction features disclosed by Karthikeyan because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
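For illustration only, the definition of an outer product relied upon above can be sketched as follows. The sketch is not drawn from Fleischer or any other cited reference; the function name `outer` and the example vectors are assumptions. It shows that every result element in a given row shares the same first-vector element, and every result element in a given column shares the same second-vector element.

```python
def outer(a, b):
    """Outer product of vectors a and b: result[i][j] = a[i] * b[j]."""
    return [[x * y for y in b] for x in a]


a = [2, 3]        # hypothetical first vector (two elements)
b = [5, 7, 11]    # hypothetical second vector (three elements)
R = outer(a, b)   # two-dimensional result with len(a) rows and len(b) columns

# Row 0 combines the same first element a[0] with different second elements.
row0_factors = [R[0][j] // b[j] for j in range(len(b))]
# Column 0 combines different first elements with the same second element b[0].
col0_factors = [R[i][0] // a[i] for i in range(len(a))]
```

Here `row0_factors` recovers `a[0]` three times and `col0_factors` recovers `b[0]` twice, which corresponds to the "same set ... with different sets" and "different sets ... with the same set" language of the claim.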

Regarding claim 21, Das, as modified, discloses the elements of claim 20, as discussed above. Das does not explicitly disclose the first operand comprises X subsets of first data elements, the second operand comprises Y subsets of second data elements, and the outer product operation generates X*Y result data elements each corresponding to a result of performing one of the instances of the outer product operation on a different combination of one of the X subsets of first data elements and one of the Y subsets of second data elements.
However, in the same field of endeavor (e.g., matrix multiplication) Fleischer discloses:
the first operand comprises X subsets of first data elements, the second operand comprises Y subsets of second data elements, and the outer product operation generates X*Y result data elements each corresponding to a result of performing one of the instances of the outer product operation on a different combination of one of the X subsets of first data elements and one of the Y subsets of second data elements (Fleischer discloses, at ¶ [0012], operands are 8x8 matrices, which discloses having X and Y subsets of elements, i.e., columns and rows, and performing outer product operations on different combinations of rows and columns to produce 64 elements, which is eight times eight.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s instruction to compute outer products, as disclosed by Fleischer, because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.
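For illustration only, the X*Y counting in claim 21 can be sketched as below. This is one possible reading under stated assumptions and is not the implementation of Fleischer or of the application: each "instance" of the operation is assumed to be a dot product of one subset of first data elements with one subset of second data elements, and the name `pairwise_results` and the example values are hypothetical.

```python
def pairwise_results(first_subsets, second_subsets):
    """Return an X x Y grid with one result per (first subset, second subset) pair.

    Each result is assumed, for this sketch, to be the dot product of the pair.
    """
    return [[sum(p * q for p, q in zip(fs, ss)) for ss in second_subsets]
            for fs in first_subsets]


# Hypothetical operands: X = 2 subsets of first data elements,
# Y = 3 subsets of second data elements, four elements per subset.
first_subsets = [[1, 2, 3, 4], [5, 6, 7, 8]]
second_subsets = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 1]]

grid = pairwise_results(first_subsets, second_subsets)
count = sum(len(row) for row in grid)   # X * Y result data elements
```

With eight subsets on each side, as in the 8x8 operands cited from Fleischer, the same counting yields 64 result elements.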

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Das in view of Fleischer in view of Karthikeyan in view of US Publication No. 2015/018614295783 by Stephens et al. (hereinafter referred to as “Stephens”).
Regarding claim 19, Das discloses:
a non-transitory storage medium storing a computer program for controlling a host data processing apparatus to perform the method according to claim 18 (Das discloses, at ¶¶ [0188]-[0190], non-transitory media storing instructions to perform the disclosed method.).
Das does not explicitly disclose performing the aforementioned method within a virtual machine execution environment.
However, in the same field of endeavor (e.g., mixed size operations) Stephens discloses:
performing operations in a virtual machine environment (Stephens discloses, at ¶ [0047], performing operations in a virtual machine environment.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Das’s method to operate in a virtual machine environment, as disclosed by Stephens, because this modification merely entails a combination of prior art elements (cited above) according to known methods to yield predictable results, which is an exemplary rationale to support a conclusion of obviousness, as per MPEP § 2143.

Response to Arguments
On page 9 of the response filed September 15, 2022 (“response”), the Applicant argues “There are no examples in Das where a single register element is used for storing two separate output elements of a multiplication operation. In addition, Das does not teach reusing the same result register having the same size for same-element-instructions and mixed-element-size instructions.”
Though the argument has been fully considered, the Examiner respectfully disagrees. Regarding the first point, the Examiner maintains that the Applicant’s claims, when considered in light of the disclosure, are reasonably interpreted as reading on Das’s disclosure. Specifically, the Applicant argues that the claims recite storing two output elements in a single register element. As shown in Applicant’s Figure 8, an example of this is storing two 16 bit output elements in one 32 bit register element.
Das discloses instructions that generate multiple result elements. See, e.g., Figure 4F and related description, which describe instructions that generate n result values. The instructions specify destination locations, which are understood to be packed vector registers. While the disclosed embodiments are described in terms of 32 bit output values, 32 bits is an example value, and other values are possible. See, e.g., ¶ [0083] and ¶ [0088], which disclose that vector elements used in Das’s disclosure, and the registers that store those elements, can be of various lengths, such as 16 or 32 bit elements stored in registers of 512 bits, or other lengths. Furthermore, packing data elements of various sizes into vector registers of various lengths is fundamentally well known, constituting implementation details or design choices. Therefore, when taken as a whole, Das discloses storing two 16 bit data elements in one 32 bit portion of a register.
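For illustration only, the kind of packing discussed above, two 16 bit result elements held in one 32 bit register element, can be sketched as follows. The sketch is not taken from Das or from Applicant’s Figure 8; the placement of the first element in the low half, the signed interpretation, and the names `pack_pair` and `unpack_pair` are assumptions.

```python
def pack_pair(lo, hi):
    """Store two signed 16 bit values in one 32 bit register element.

    Assumed layout: `lo` in bits 0-15, `hi` in bits 16-31.
    """
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack_pair(elem):
    """Recover the two signed 16 bit values from a 32 bit register element."""
    def s16(v):
        return v - 0x10000 if v & 0x8000 else v
    return s16(elem & 0xFFFF), s16((elem >> 16) & 0xFFFF)


# A hypothetical result register with four 32 bit register elements can hold
# either four 32 bit results or eight 16 bit results in the same storage.
results_16 = [100, -200, 300, -400, 500, -600, 700, -800]
register = [pack_pair(results_16[i], results_16[i + 1])
            for i in range(0, len(results_16), 2)]
recovered = [v for elem in register for v in unpack_pair(elem)]
```

The two values remain separately recoverable from each register element, and the register itself is unchanged in size regardless of which element width is stored.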
Regarding the second point, the Examiner likewise maintains that the Applicant’s claims, when considered in light of the disclosure, are reasonably interpreted as reading on Das’s disclosure. Specifically, the Applicant argues that the claims recite using the same register for the results of instructions that have same-sized operands and for the results of instructions that have mixed-sized operands. The written description indicates at page 24 that the same registers are used for both types of instructions. This means that, rather than requiring separate register files, a single register file is used for both types of instructions.
Das discloses a number of instructions that have mixed-size elements. Das also discloses that allowing mixed-size operands can improve performance over instructions that require the operands to have the same size, as is traditional. All of the instructions described indicate a destination. Das does not indicate that the different sizes of operands would require special or distinct registers. To the contrary, it is evident that all of the instructions use the same storage, i.e., the vector register file described at, e.g., ¶ [0088] and Figure 10. Accordingly, the Applicant’s remarks are deemed unpersuasive.

On page 10 of the response the Applicant argues that the cited references do not disclose or suggest “(i) reusing a result register for a same-element-size instruction and a mixed-element-size instruction or (ii) outputting two result matrix elements of a mixed-element-size instruction to each element of a result register used for a same-element-size instruction.”
Though the argument has been fully considered, the Examiner respectfully disagrees. As indicated above, Das indicates that the various types of vector instructions use the same register file. The Examiner maintains that there is no basis in Das for suggesting that different types of vector instructions require different register files. Therefore, it is apparent that Das discloses using the same register file for results from the multiple disclosed vector instructions.
As also indicated above, Das discloses outputting multiple result elements of an instruction. As these elements are smaller in size than any disclosed registers, it is evident that the result elements are packed into registers, which discloses outputting two elements to a result register. Storing, for example, two 32-bit data elements in 64 bits of a vector register discloses storing the two data elements in a single register element. These sizes are merely examples of the various sizes that Das discloses. Accordingly, the Applicant’s remarks are deemed unpersuasive.

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAWN DOMAN whose telephone number is (571)270-5677.  The examiner can normally be reached on Monday through Friday 8:30am-6pm Eastern Time.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHAWN DOMAN/
Primary Examiner, Art Unit 2183