DETAILED ACTION
This is in response to communication filed on 1/14/2022.
Status of Claims
Claims 1 – 18, 20, and 22 are pending, of which claims 1, 8, 15, and 22 are in independent form.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 8/20/2021 and 1/21/2022 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.


Claim Rejections - 35 USC § 112
In light of applicant’s amendments to the claims, the examiner withdraws the previous rejections of the claims under 35 USC 112.

Claim Objections
Claim 20 is objected to because of the following informalities:  claim 20 still states “the identified multi-dimensional register matrix destination operand” after claim 15 was amended to state “multi-dimensional matrix destination operand.”  Appropriate correction is required.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 18, 20, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Yadavalli, U.S. Patent Application 2017/0337156 (hereinafter referred to as Yadavalli) in view of Moyer, U.S. Patent Application 2005/0053012 (hereinafter referred to as Moyer), further in view of Hughes et al., U.S. Patent Application 2015/0052333 (hereinafter referred to as Hughes).

Referring to claim 1, Yadavalli discloses “A processor comprising: decode circuitry to decode an instance of a single instruction” (Fig. 3(a) microprocessor 300 and Fig. 2(c) decode, [0027] and [0032] instruction decoder) “having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information” (Fig. 2(b) opcode, source operand, destination = address of a register, pointer to an array or matrix location.  Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Figs. 5(a) and 5(b) an instruction is decoded, the instruction causing loading of an array from memory), “wherein the opcode is to indicate execution circuitry is to load groups of” “data elements from memory into configured rows of the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301]); and “execution circuitry” (Fig. 3(a) microprocessor 300 with execution circuitry ALU, EXEC 351-358) “to execute the decoded instance of the single instruction according to the opcode to load groups of” “data elements from memory into configured rows of the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301] via a plurality of ports [320], [321], [326], [327] shown in FIG. 3(b)).
Yadavalli does not appear to explicitly disclose loading groups of strided data elements from memory into a multi-dimensional matrix destination operand.
	However, Moyer discloses “to load groups of strided data elements from memory into configured rows” of a “destination register” ([0044]-[0045] lstrmvex instruction is to load a stream of vector elements from memory into a destination register using a stride).
	It would have been obvious to one of ordinary skill in the art at the time of Applicant’s invention to combine Moyer’s teachings with Yadavalli so that an instruction causes groups of strided data to be loaded into a multi-dimensional matrix destination operand.
	Using a stride is helpful when working with rows of data.  
Yadavalli and Moyer are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli and Moyer before him or her, to modify the teachings of Yadavalli to include the teachings of Moyer so that strides are used to move the data and a multi-dimensional matrix destination operand is used to receive the data.
The motivation for doing so would have been to increase the speed of access of the storage element that is to hold the multi-dimensional operand.
Neither Yadavalli nor Moyer appears to explicitly disclose “wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction.”
However, Hughes discloses another method for loading multi-dimensional data from memory “wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction” (Fig. 4 and [0049] gather stride instruction with destination operand, source address operands (base, displacement, index, and/or scale). Fig. 9 and [0084] scatter stride instruction with destination address operands (base, displacement, index, and/or scale). Figs. 14A-B vector friendly instruction format with [0143] scale field 1460 for scaling of the index field's content for memory address generation).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.
The motivation for doing so would have been to utilize a known method of addressing that is known to be useful in accessing two-dimensional arrays (as described in ‘Addressing Modes’ by Dandamudi on page 12).
Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.
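
For illustration only, the combined operation discussed above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Yadavalli, Moyer, or Hughes: the function names, the byte-addressed memory model, and the treatment of the scale value as a left-shift amount are all hypothetical.

```python
def stride_from_index_scale(index, scale):
    """Return a stride in bytes by shifting the index left by the scale.

    Assumption: the scale is a shift amount (log2 of a multiplier).
    """
    return index << scale


def load_strided_rows(memory, base, index, scale, rows, row_bytes):
    """Load `rows` groups of `row_bytes` bytes, each group one stride apart.

    Each group becomes one configured row of a hypothetical matrix destination.
    """
    stride = stride_from_index_scale(index, scale)
    tile = []
    for r in range(rows):
        start = base + r * stride          # address of the r-th group
        tile.append(memory[start:start + row_bytes])
    return tile


# Example: index 4 shifted by scale 2 gives a 16-byte stride.
memory = bytes(range(64))
tile = load_strided_rows(memory, base=4, index=4, scale=2, rows=3, row_bytes=4)
```

In this sketch, consecutive rows of the destination come from memory locations one stride apart, which is the arrangement that arises when a sub-block of a larger row-major array is loaded.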

	As per claims 2 – 4, as above, Yadavalli discloses “each data element of the destination multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301] via a plurality of ports [320], [321], [326], [327] shown in FIG. 3(b)).
	Also, as above, Moyer discloses loading groups of strided data elements from memory into configured rows of a “destination register” ([0044]-[0045] lstrmvex instruction is to load a stream of vector elements from memory into a destination register using a stride).
([0026] ds destination element size, [0025] vector element may be a word (32 bits). Note that Applicant's description states that a word is 16-bit, doubleword is 32-bit at Applicant’s [0089]).
Yadavalli and Moyer are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli and Moyer before him or her, to modify the teachings of Yadavalli to include the teachings of Moyer so that strides are used to move the data and a destination register multi-dimensional matrix operand is used to receive the data.
The motivation for doing so would have been to increase the speed of access of the storage element that is to hold the multi-dimensional operand.
As above, Yadavalli, Moyer, and Hughes are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.

Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.

	As per claim 5, as above, Yadavalli discloses “the execution circuitry is to store each configured row into the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301] via a plurality of ports [320], [321], [326], [327] shown in FIG. 3(b)).
	As above, it would have been obvious to one of ordinary skill in the art at the time of Applicant’s invention to combine Moyer’s teachings with Yadavalli so that an instruction is “to store each configured row into the identified multi-dimensional matrix destination operand.”
	Further, Yadavalli does not appear to explicitly disclose “update a counter value as each row is stored.”
	However, Moyer discloses “update a counter value as each row is stored” ([0040], [0042], [0045] counter used to keep track of ‘cnt’ and ‘rcnt’, and [0118] loads each row of matrix 102 in turn).
Yadavalli and Moyer are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.

The motivation for doing so would have been to provide a simple means for monitoring the transfer of each row of data (as described by Moyer at [0042]).
As above, Yadavalli, Moyer, and Hughes are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.
The motivation for doing so would have been to utilize a known method of addressing that is known to be useful in accessing two-dimensional arrays (as described in ‘Addressing Modes’ by Dandamudi on page 12).
Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.
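
For illustration only, updating a counter as each row is stored may be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Moyer's ‘cnt’ and ‘rcnt’ mechanism: the function name, the list-based memory model, and the variable `rows_stored` are hypothetical.

```python
def load_rows_with_counter(memory, base, stride, rows, row_len):
    """Store each strided row into a destination tile, counting rows stored."""
    tile = []
    rows_stored = 0                        # hypothetical progress counter
    for r in range(rows):
        start = base + r * stride
        tile.append(list(memory[start:start + row_len]))
        rows_stored += 1                   # updated as each row is stored
    return tile, rows_stored


memory = list(range(32))
tile, rows_stored = load_rows_with_counter(memory, base=0, stride=8, rows=3, row_len=2)
```

A counter of this kind records how many rows have been transferred, so the progress of a multi-row load can be monitored.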

	As per claim 6, as above, Yadavalli discloses “the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301] via a plurality of ports [320], [321], [326], [327] shown in FIG. 3(b)).
	Also, Moyer discloses loading groups of strided data elements from memory into configured rows of a “destination register” ([0044]-[0045] lstrmvex instruction is to load a stream of vector elements from memory into a destination register using a stride).
	Further, Moyer discloses the “matrix destination operand is a plurality of registers configured to represent a matrix” (Figs. 12 - 26 and [0079] register file represents a matrix).
Yadavalli and Moyer are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli and Moyer before him or her, to modify the teachings of Yadavalli to include the teachings of Moyer so that registers create a destination register multi-dimensional matrix to receive the data.
The motivation for doing so would have been to increase the speed of access of the storage element that is to hold the multi-dimensional operand.
As above, Yadavalli, Moyer, and Hughes are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
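Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.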
The motivation for doing so would have been to utilize a known method of addressing that is known to be useful in accessing two-dimensional arrays (as described in ‘Addressing Modes’ by Dandamudi on page 12).
Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.

	As per claim 7, Yadavalli discloses “source memory information” (Fig. 5(b) compute effective address of system memory location).
	Yadavalli does not appear to explicitly disclose “the source memory information includes a scale, an index, a base, and a displacement.”
	However, Moyer discloses “the source memory information includes a scale, an index, a base, and a displacement” (Fig. 7 and [0038] – [0040] cnt, rcnt, stride, skip, skip cnt).
As above, Yadavalli, Moyer, and Hughes are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.

Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.
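
For illustration only, address generation from a scale, an index, a base, and a displacement may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the function name is hypothetical, and the scale is assumed to be a left-shift amount applied to the index.

```python
def effective_address(base, index, scale, displacement):
    """Compute base + (index << scale) + displacement.

    Assumption: the scale is a shift amount, so the index is multiplied
    by 2**scale before being added to the base and displacement.
    """
    return base + (index << scale) + displacement


# Example: base 0x1000, index 3 scaled by 4 (shift of 2), displacement 8.
addr = effective_address(0x1000, 3, 2, 8)
```

With a scale of 2, each increment of the index advances the address by four bytes, which corresponds to stepping through an array of 32-bit elements.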

Referring to claim 8, claim 1 recites the corresponding limitations as that of claim 8. Therefore, the rejection of claim 1 applies to claim 8.

Note, claim 9 recites the corresponding limitations of claim 2. Therefore, the rejection of claim 2 applies to claim 9.

Note, claim 10 recites the corresponding limitations of claim 3. Therefore, the rejection of claim 3 applies to claim 10.

Note, claim 11 recites the corresponding limitations of claim 4. Therefore, the rejection of claim 4 applies to claim 11.

Note, claim 12 recites the corresponding limitations of claim 5. Therefore, the rejection of claim 5 applies to claim 12.

Note, claim 13 recites the corresponding limitations of claim 6. Therefore, the rejection of claim 6 applies to claim 13.

Note, claim 14 recites the corresponding limitations of claim 7. Therefore, the rejection of claim 7 applies to claim 14.

Referring to claim 15, claim 1 recites the corresponding limitations as that of claim 15. Therefore, the rejection of claim 1 applies to claim 15.
	Also, Moyer discloses “A non-transitory machine-readable medium storing an instruction which causes a processor to perform a method” of claim 1 ([0022] instruction unit 30 fetches instructions from a memory, such as memory 12).

Note, claim 16 recites the corresponding limitations of claim 2. Therefore, the rejection of claim 2 applies to claim 16.

Note, claim 17 recites the corresponding limitations of claim 3. Therefore, the rejection of claim 3 applies to claim 17.

Note, claim 18 recites the corresponding limitations of claim 4. Therefore, the rejection of claim 4 applies to claim 18.

Note, claim 20 recites the corresponding limitations of claim 6. Therefore, the rejection of claim 6 applies to claim 20.

Referring to claim 22, Yadavalli discloses “a system comprising: a processor” (Fig. 3(a) microprocessor 300 in a system), the processor including “decode circuitry to decode an instance of a single instruction” (Fig. 3(a) microprocessor 300 and Fig. 2(c) decode, [0027] and [0032] instruction decoder) “having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information” (Fig. 2(b) opcode, source operand, destination = address of a register, pointer to an array or matrix location.  Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Figs. 5(a) and 5(b) an instruction is decoded, the instruction causing loading of an array from memory); “wherein the opcode is to indicate execution circuitry is to load groups of” “data elements from memory into configured rows of the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301]); and “execution circuitry” (Fig. 3(a) microprocessor 300 with execution circuitry ALU, EXEC 351-358) “to execute the decoded instance of the single instruction according to the opcode to load groups of” “data elements from memory into configured rows of the identified multi-dimensional matrix destination operand” (Fig. 4(b) LOADMX instruction loads matrix from SysMem location.  Also [0060] The contents of the data buffer [360] are read and transferred in plurality of chunks representing rows or columns or both of Matrix A into their location [310] in Matrix Space [301] via a plurality of ports [320], [321], [326], [327] shown in FIG. 3(b)).
Yadavalli does not appear to explicitly disclose loading groups of strided data elements from memory into a multi-dimensional matrix destination operand.
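	However, Moyer discloses “to load groups of strided data elements from memory into configured rows” of a “destination register” ([0044]-[0045] lstrmvex instruction is to load a stream of vector elements from memory into a destination register using a stride).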
	It would have been obvious to one of ordinary skill in the art at the time of Applicant’s invention to combine Moyer’s teachings with Yadavalli so that an instruction causes groups of strided data to be loaded into a multi-dimensional matrix destination operand.
	Using a stride is helpful when working with rows of data.  
Yadavalli and Moyer are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli and Moyer before him or her, to modify the teachings of Yadavalli to include the teachings of Moyer so that strides are used to move the data and a multi-dimensional matrix destination operand is used to receive the data.
The motivation for doing so would have been to increase the speed of access of the storage element that is to hold the multi-dimensional operand.
Neither Yadavalli nor Moyer appears to explicitly disclose “wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction.”
However, Hughes discloses another method for loading multi-dimensional data from memory “wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction” (Fig. 4 and [0049] gather stride instruction with destination operand, source address operands (base, displacement, index, and/or scale). Fig. 9 and [0084] scatter stride instruction with destination address operands (base, displacement, index, and/or scale). Figs. 14A-B vector friendly instruction format with [0143] scale field 1460 for scaling of the index field's content for memory address generation).

Also, neither Yadavalli nor Moyer appears to explicitly disclose another coprocessor or accelerator.  As such, Yadavalli/Moyer does not appear to explicitly disclose a system comprising “an accelerator coupled to the processor, the accelerator including” the features of claim 1.
However, Hughes discloses “A system comprising: a processor; and an accelerator coupled to the processor, the accelerator including” the features of claim 1 (Fig. 19 and [0252] – [0257] processors, additional processors, and graphics accelerators).
Yadavalli, Moyer, and Hughes are analogous art because they are from the same field of endeavor, which is loading multi-dimensional data from memory.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Yadavalli, Moyer, and Hughes before him or her, to modify the teachings of Yadavalli and Moyer to include the teachings of Hughes so that the process of loading strided data elements is accomplished by computations involving an index and a scale value.

Therefore, it would have been obvious to combine Hughes with Yadavalli and Moyer to obtain the invention as specified in the instant claim.

	Response to Arguments
Applicant argues, on page 9, that Moyer's lstrmvex has nothing to do with matrices.
The examiner disagrees.  First, it is understood that a vector is a matrix consisting of one row or one column.  ‘Brief Introduction to Vectors and Matrices’ archived on 12/8/2017 from University of North Florida unf.edu is cited as an evidentiary reference (see page 2).  Also, ‘Scalars and Vectors (... and Matrices)’ from Math Is Fun is cited as an evidentiary reference (see pages 6 – 7).  Thus, even if Moyer’s lstrmvex is solely dealing with vectors, a vector is a matrix.
Second, Moyer at [0075] - [0088] is using load and store instructions described above for matrix operations (lmvex).  Moyer at [0114] - [0123] is using load and store instructions described above for matrix operations (lstrmvex).

Applicant argues, on page 9, that Moyer's lstrmvex requires a lot more than a stride to work. Taking out the single aspect of lstrmvex is impermissibly picking-and-choosing.


Applicant’s arguments with respect to the newly added limitation regarding determining a stride have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
	Evidentiary references cited above:
‘CHAPTER 1 - Brief Introduction to Vectors and Matrices’ from the University of North Florida, archived at unf.edu on December 8, 2017.
‘Scalars and Vectors (... and Matrices)’ from Math Is Fun, copyright 2017.
‘Addressing Modes – Chapter 5’ by Dandamudi, 1998.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEVEN G SNYDER whose telephone number is (571)270-1971.  The examiner can normally be reached on M-F 8:00am-4:30pm (flexible).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henry Tsai can be reached on 571-272-4176.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/STEVEN G SNYDER/Primary Examiner, Art Unit 2184