Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1 and 3-24 are presented for examination.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 4/20/22 has been entered.

Claim Objections
Claim 5 is objected to because it is dependent from canceled claim 2.  Appropriate correction is required.








Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
	Claim 23 uses means-plus-function language and is therefore interpreted according to 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The following identifies the structures from the specification that correspond to the claimed means-plus-function limitations:
	“means for storing” corresponds to registers 20 of fig. 1
	“means for decoding” corresponds to decode unit 18 of fig. 1
	“means for performing” corresponds to processing circuitry 12 of fig. 1



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-11, 13, 15, 18-19, 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Mahurin, US Patent Application Publication 2017/0046153 (hereinafter Mahurin) in view of Caprioli et al., US Patent Application Publication 2014/0095842 (hereinafter Caprioli).
	Regarding claim 1, Mahurin teaches:
A data processing apparatus comprising: register storage circuitry having a plurality of registers to store data elements; decoder circuitry responsive to a data processing instruction to generate control signals, the data processing instruction specifying in the plurality of registers: a first source register (see e.g. fig. 3, para. [0036]), a second source register and an output register, wherein the first source register comprises a plurality of first source data elements and the second source register comprises a plurality of respective second source data elements and each of the first source data elements corresponds to a respective second source data element of the plurality of respective second source data elements (see e.g. fig. 3, b[3] corresponds to Rt.b[3] etc.), and wherein the plurality of first source data elements and the respective plurality of respective second source data elements are grouped; and processing circuitry responsive to the control signals to perform, independently for each group of elements, a dot product operation comprising: extracting at least a first data element and a second data element from the plurality of first source data elements grouped (see e.g. fig. 3, para. [0036-40]); extracting at least a corresponding first data element and a corresponding second data element from the respective plurality of second source data elements grouped (see e.g. fig. 3, para. [0036-40]); performing a first multiply operation of multiplying together the first data element and the corresponding first data element (see e.g. fig. 3, para. [0036-40]); performing a second multiply operations of multiplying together the second data element and the corresponding second data element (see e.g. fig. 3, para. [0036-40]); summing results of the first multiply operation and the second multiply operation (see e.g. fig. 3, para. [0036-40]); and applying a result of the summing to an output element of the output register, wherein the output element is grouped into the output register (see e.g. fig. 3, para. [0036-40]).
	While Mahurin teaches lanes, such as a first source divided into lanes of an upper half and a lower half (see para. [0036], Vuu source vector with lower half Vu), Mahurin fails to explicitly teach that each of the first source register, the second source register, and the output register is divided into a plurality of independent intra-register lanes, and that the elements are grouped into the plurality of independent intra-register lanes.
	Caprioli teaches dividing source and destination registers into independent lanes and grouping elements into the lanes (see e.g. para. [0042-3]).
	Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin and Caprioli such that each of the first source register, the second source register, and the output register is divided into a plurality of independent intra-register lanes, and the elements are grouped into the plurality of independent intra-register lanes. This would have provided the clearly predictable result of performing the exact same operations on the same data as part of a lane. This would have also provided the advantage of additional boundaries that give more control and flexibility over which elements are operated upon in a given operation.
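For illustration only, the lane-wise dot product recited in claim 1 can be sketched as follows. This is a minimal sketch and is not taken from Mahurin, Caprioli, or the present application: the function name `lane_dot_product`, the use of flat Python lists to stand in for registers, and the `elems_per_lane` parameter are all assumptions made for the example.

```python
def lane_dot_product(src1, src2, elems_per_lane):
    """Return one dot-product result per lane.

    src1 and src2 are flat lists standing in for the two source registers
    (an assumption for this sketch). Each consecutive run of elems_per_lane
    elements is treated as one independent intra-register lane.
    """
    if len(src1) != len(src2) or len(src1) % elems_per_lane != 0:
        raise ValueError("registers must have equal length and whole lanes")
    out = []
    for start in range(0, len(src1), elems_per_lane):
        # Extract the elements grouped into this lane from each source.
        a = src1[start:start + elems_per_lane]
        b = src2[start:start + elems_per_lane]
        # Multiply corresponding elements and sum the products.
        out.append(sum(x * y for x, y in zip(a, b)))
    return out


# Two lanes of two elements each: lane 0 = 1*5 + 2*6, lane 1 = 3*7 + 4*8.
print(lane_dot_product([1, 2, 3, 4], [5, 6, 7, 8], elems_per_lane=2))  # [17, 53]
```

Each lane is processed without reference to any other lane, which is the sense in which the lanes are independent in this sketch.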

Regarding claim 3, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the output register is an accumulator register and the dot product operation is a dot product and accumulate operation which further comprises loading an accumulator value from the accumulator register and summing the results of the multiply operations with the accumulator value (see e.g. Mahurin fig. 3, para. [0036-40]). 
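As a hedged illustration of the dot product and accumulate operation recited in claim 3, the sketch below adds each lane's sum of products to a value already held in the corresponding accumulator lane. The function name `lane_dot_accumulate` and the list-based register model are assumptions for the example and do not reproduce any cited reference.

```python
def lane_dot_accumulate(acc, src1, src2, elems_per_lane):
    """Return the accumulator lanes updated by a per-lane dot product.

    acc holds one accumulator value per lane; src1 and src2 hold
    elems_per_lane elements per lane (list model assumed for this sketch).
    """
    lanes = len(acc)
    if len(src1) != lanes * elems_per_lane or len(src2) != lanes * elems_per_lane:
        raise ValueError("source registers must hold elems_per_lane elements per lane")
    new_acc = []
    for lane in range(lanes):
        start = lane * elems_per_lane
        products = [src1[start + i] * src2[start + i] for i in range(elems_per_lane)]
        # Load the accumulator value and sum it with the multiply results.
        new_acc.append(acc[lane] + sum(products))
    return new_acc


print(lane_dot_accumulate([10, 20], [1, 2, 3, 4], [5, 6, 7, 8], 2))  # [27, 73]
```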
Regarding claim 4, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the decoder circuitry is responsive to a further data processing instruction to generate further control signals, the data processing instruction specifying in the plurality of registers the output register and an accumulator register, and the processing circuitry is responsive to the further control signals to perform an accumulate operation comprising: loading an accumulator value from the accumulator register and a summation value from the intra-register lane of the output register; summing the accumulator value and the summation value; and storing a result of the summing in the accumulator register (see e.g. Mahurin fig. 3, para. [0036-40]).
Regarding claim 5, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 2, wherein widths of the first source register, of the second source register, and of the output register are equal (see e.g. Mahurin para. [0036]). 
Regarding claim 6, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 3, wherein widths of the first source register, of the second source register, of the output register, and of the accumulator register are equal (see e.g. Mahurin fig. 3, para. [0036-40]).
Regarding claim 7, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein a width of the first source register is equal to a combined size of all data elements extracted from the first source register in the dot product operation (see e.g. Caprioli para. [0043], full width used). 
Regarding claim 8, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein a width of the second source register is equal to a combined size of all data elements extracted from the second source register in the dot product operation (see e.g. Caprioli para. [0043], full width used). 
Regarding claim 9, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the dot product operation further comprises extracting at least a third data element and a fourth data element from the plurality of first source data elements grouped into the intra-register lane, extracting at least a corresponding third data element and a corresponding fourth data element from the respective plurality of second source data elements grouped into the intra-register lane, performing further multiply operations of multiplying together the third data element and the corresponding third data element, and multiplying together the fourth data element and the corresponding fourth data element, and summing results of the further multiply operations with the results of the first multiply operation and the second multiply operation (see e.g. Mahurin fig. 3, para. [0036-40]). 
Regarding claim 10, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein a size of each intra-register lane is equal to a combined size of all data elements extracted from the intra-register lane of the first source register in the dot product operation (see e.g. Mahurin fig. 3, para. [0036-40], Caprioli para. [0042-3]). 
Regarding claim 11, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein a size of each intra-register lane is equal to a combined size of all data elements extracted from the intra-register lane of the second source register in the dot product operation (see e.g. Mahurin fig. 3, para. [0036-40], Caprioli para. [0042-3]). 
Regarding claim 13, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the plurality of intra-register lanes have a 32-bit width and the extracting comprises extracting four 8-bit data elements from each intra-register lane of the first source register and the second source register (see e.g. Mahurin fig. 3, para. [0036-40]).
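For illustration, extracting four 8-bit data elements from a 32-bit lane can be sketched with shifts and masks as shown below. The element ordering (least significant byte first), the unsigned interpretation, and the names `extract_elements` and `packed_lane_dot` are assumptions for this sketch and are not drawn from the cited references.

```python
LANE_BITS = 32  # lane width assumed for this sketch
ELEM_BITS = 8   # element width assumed for this sketch


def extract_elements(lane_value):
    """Split one 32-bit lane into four unsigned 8-bit elements (LSB first)."""
    mask = (1 << ELEM_BITS) - 1
    return [(lane_value >> (ELEM_BITS * i)) & mask
            for i in range(LANE_BITS // ELEM_BITS)]


def packed_lane_dot(lane_a, lane_b):
    """Dot product of the four 8-bit elements held in two 32-bit lanes."""
    return sum(x * y for x, y in zip(extract_elements(lane_a),
                                     extract_elements(lane_b)))


print(extract_elements(0x04030201))             # [1, 2, 3, 4]
print(packed_lane_dot(0x04030201, 0x01010101))  # 10
```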
Regarding claim 15, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the multiply operations and adding are integer operations (see e.g. Caprioli para. [0039]). 
Regarding claim 18, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the multiply operations and adding are floating-point operations (see e.g. Caprioli para. [0028]). 
Regarding claim 19, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 18, wherein the plurality of intra-register lanes have a 32-bit width (see e.g. Mahurin fig. 3, para. [0036-40]) and the extracting comprises extracting two 16-bit floating point data elements from each intra-register lane of the first source register and the second source register (see e.g. Caprioli para. [0028]). 
	Claim 21 is rejected for reasons corresponding to those given above for claim 1.
Claim 22 is rejected for reasons corresponding to those given above for claim 1.
Claim 23 is rejected for reasons corresponding to those given above for claim 1.


Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Mahurin in view of Caprioli, further in view of Ginzburg et al., US Patent Application Publication 2011/0153707 (hereinafter Ginzburg).
Regarding claim 12, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1.
Mahurin in view of Caprioli fails to explicitly teach wherein the data processing instruction specifies a repeated intra-register lane and a selected source register of the first source register and the second source register, and the processing circuitry is responsive to the control signals to reuse content of the repeated intra-register lane for all lanes of the selected source register.
Ginzburg teaches performing a vector multiply and accumulate operation while repeating a lane of data to reuse content for the vector operation (see e.g. fig. 6, para. [0039-41], element A(1,1) is used repeatedly to multiply with B(1,1), B(1,2), B(1,3), and B(1,4)).
Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin, Caprioli, and Ginzburg such that the data processing instruction specifies a repeated intra-register lane and a selected source register of the first source register and the second source register, and the processing circuitry is responsive to the control signals to reuse content of the repeated intra-register lane for all lanes of the selected source register. This would have simplified processing when one matrix value needs to be multiplied by a plurality of other matrix values. This would have also allowed for reducing the number of instructions/register accesses necessary to implement matrix multiplication when one matrix value needs to be multiplied by a plurality of other matrix values.
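The reuse of a repeated lane discussed above can be illustrated with the following minimal sketch, in which the content of one chosen lane of the selected source register is used for every lane of the operation. Treating the second source as the selected register, the list-based register model, and the names `broadcast_lane` and `dot_with_repeated_lane` are assumptions for the example; the sketch does not reproduce Ginzburg's implementation.

```python
def broadcast_lane(reg, elems_per_lane, lane_index):
    """Return a register image in which every lane repeats the chosen lane."""
    lanes = len(reg) // elems_per_lane
    if not 0 <= lane_index < lanes:
        raise ValueError("lane_index out of range")
    start = lane_index * elems_per_lane
    chosen = reg[start:start + elems_per_lane]
    return chosen * lanes  # list repetition copies the lane into all lanes


def dot_with_repeated_lane(src1, src2, elems_per_lane, lane_index):
    """Per-lane dot product where src2 (assumed to be the selected register)
    contributes the same repeated lane to every lane of the operation."""
    if len(src1) != len(src2) or len(src1) % elems_per_lane != 0:
        raise ValueError("registers must have equal length and whole lanes")
    repeated = broadcast_lane(src2, elems_per_lane, lane_index)
    return [sum(src1[s + i] * repeated[s + i] for i in range(elems_per_lane))
            for s in range(0, len(src1), elems_per_lane)]


# Lane 1 of src2 is [7, 8]; it is reused for both lanes of src1.
print(dot_with_repeated_lane([1, 2, 3, 4], [5, 6, 7, 8], 2, lane_index=1))  # [23, 53]
```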

Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Mahurin in view of Caprioli, further in view of Zohar et al., US Patent Application Publication 2014/0032881 (hereinafter Zohar).
Regarding claim 16, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1.
Mahurin in view of Caprioli fails to explicitly teach wherein values held in the first source register and in the second source register are signed values. 
Zohar teaches extracting signed values from a source vector register (see e.g. para. [0015], [0066]).
Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin, Caprioli, and Zohar such that values held in the first source register and in the second source register are signed values. This would have provided the advantage of being able to operate on additional instructions, improving the flexibility and capability of the system, for example in certain multimedia applications as discussed by Zohar.
Regarding claim 17, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1.
Mahurin in view of Caprioli fails to explicitly teach wherein values held in the first source register and in the second source register are unsigned values.
Zohar teaches extracting unsigned values from a source vector register (see e.g. para. [0015], [0066]).
Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin, Caprioli, and Zohar such that values held in the first source register and in the second source register are unsigned values. This would have provided the advantage of being able to operate on additional instructions, improving the flexibility and capability of the system, for example in certain multimedia applications as discussed by Zohar.
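To illustrate why the signed and unsigned cases of claims 16 and 17 are distinct, the following sketch interprets the same 8-bit patterns both ways before forming the dot product. The 8-bit element width, two's-complement encoding, and the helper names `as_unsigned`, `as_signed`, and `byte_dot` are assumptions for the example and are not taken from Zohar.

```python
def as_unsigned(byte):
    """Interpret the low 8 bits as an unsigned value in 0..255."""
    return byte & 0xFF


def as_signed(byte):
    """Interpret the low 8 bits as a two's-complement value in -128..127."""
    b = byte & 0xFF
    return b - 0x100 if b & 0x80 else b


def byte_dot(bytes_a, bytes_b, signed):
    """Dot product of two byte sequences under the chosen interpretation."""
    conv = as_signed if signed else as_unsigned
    return sum(conv(x) * conv(y) for x, y in zip(bytes_a, bytes_b))


a, b = [0xFF, 0x02], [0x03, 0x04]
print(byte_dot(a, b, signed=False))  # 773  (255*3 + 2*4)
print(byte_dot(a, b, signed=True))   # 5    (-1*3 + 2*4)
```

The stored bits are identical in both calls; only the interpretation applied at extraction differs.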


Claims 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Mahurin in view of Caprioli, further in view of In re Rinehart, 531 F.2d 1048, 189 USPQ 143 (CCPA 1976) (see MPEP 2144.04(IV)).
Regarding claim 14, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 1, wherein the extracting comprises extracting four 16-bit data elements from each intra-register lane of the first source register and the second source register (see e.g. Caprioli para. [0028]).
Mahurin in view of Caprioli fails to explicitly teach wherein the plurality of intra-register lanes have a 64-bit width. 
In re Rinehart describes that "mere scaling up of a prior art process capable of being scaled up, if such were the case, would not establish patentability in a claim to an old process so scaled." 531 F.2d at 1053, 189 USPQ at 148.
Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin, Caprioli, and In re Rinehart such that the plurality of intra-register lanes have a 64-bit width. This would have provided the clearly predictable result of performing the exact same functionality on a different sized register/element.
Regarding claim 20, Mahurin in view of Caprioli teaches or suggests:
The data processing apparatus as claimed in claim 18, wherein the extracting comprises extracting two 32-bit floating point data elements from each intra-register lane of the first source register and the second source register (see e.g. Caprioli para. [0028]).
Mahurin in view of Caprioli fails to explicitly teach wherein the plurality of intra-register lanes have a 64-bit width.
In re Rinehart describes that "mere scaling up of a prior art process capable of being scaled up, if such were the case, would not establish patentability in a claim to an old process so scaled." 531 F.2d at 1053, 189 USPQ at 148.
Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the teachings of Mahurin, Caprioli, and In re Rinehart such that the plurality of intra-register lanes have a 64-bit width. This would have provided the clearly predictable result of performing the exact same functionality on a different sized register/element.
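The point that the same functionality applies to a differently sized lane can be illustrated by a sketch in which lane width and element width are parameters. The routine below is a hypothetical example only (the name `lane_dot`, the unsigned integer elements, and least-significant-first ordering are assumptions); it handles integer elements and does not model the floating point case of claim 20.

```python
def lane_dot(lane_a, lane_b, lane_bits, elem_bits):
    """Dot product of the elements packed in two lanes of configurable width."""
    if lane_bits % elem_bits != 0:
        raise ValueError("lane width must be a multiple of element width")
    mask = (1 << elem_bits) - 1
    total = 0
    for i in range(lane_bits // elem_bits):
        shift = i * elem_bits
        # Extract corresponding elements from each lane and multiply them.
        total += ((lane_a >> shift) & mask) * ((lane_b >> shift) & mask)
    return total


# 32-bit lane holding four 8-bit elements.
print(lane_dot(0x04030201, 0x01010101, lane_bits=32, elem_bits=8))  # 10
# 64-bit lane holding four 16-bit elements; only the widths change.
print(lane_dot(0x0004000300020001, 0x0001000100010001, lane_bits=64, elem_bits=16))  # 10
```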

Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Mahurin in view of Caprioli, further in view of Examiner’s taking of Official Notice.
Regarding claim 24, Mahurin in view of Caprioli teaches or suggests:
An instruction execution environment corresponding to the data processing apparatus of claim 1.
Mahurin in view of Caprioli fails to explicitly teach a virtual machine provided by a computer program executing upon a data processing apparatus, said virtual machine providing an instruction execution environment corresponding to the data processing apparatus of claim 1. However, a virtual machine program stored on a non-transitory computer-readable medium is known in the art. When executed, it allows one to emulate a particular environment within another environment. This allows, for instance, a single physical computer to be used in place of separate physical computers for each desired environment, which could save cost and space. As a result, before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to modify Mahurin in view of Caprioli to include a non-transitory medium storing a virtual machine provided by a computer program executing upon a data processing apparatus, said virtual machine providing an instruction execution environment.




Response to Arguments
Applicant’s arguments with respect to claims 1, 3-24 have been considered but are moot because the new ground(s) of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN M LINDLOF whose telephone number is (571)270-1024. The examiner can normally be reached M-F 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at (571)272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JOHN M LINDLOF/Primary Examiner, Art Unit 2183