DETAILED ACTION
Status of Claims 
Claims 1-27 have been considered. It is hereby acknowledged that the following papers have been received and placed of record in the file:
Abstract - Receipt Date 02/10/2020
Application Data Sheet - Receipt Date 02/10/2020
Claims - Receipt Date 02/10/2020
Drawings (only black and white line drawings) - Receipt Date 02/10/2020
Information Disclosure Statement (IDS) - Receipt Date 11/13/2020
Specification - Receipt Date 02/10/2020

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/12/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claims 1, 7-8, 17, 19, and 26 are objected to because of the following informalities:  
Claims 1, 8, and 19: “a vector” should be “a respective vector” to be consistent with the usage of the term “the respective vector”
Claims 7, 17, and 26- “the vector” should be “the respective vector”
Claim 8- “wherein the execution” should be “wherein the executing” to be consistent with claim 1
Claim 8- “the one or more lanes of the each” should be “the one or more lanes of each”
Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-5, 19-20, and 23-25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by van Hook et al. US 6,266,758 (hereinafter, Hook).
Regarding claim 1, Hook teaches:
1. A method of storing values of registers, the method comprising: 
executing an interleaving store instruction (col 8 lines 55-57: shuffle instruction) on a processor comprising at least two registers with each of the at least two registers being configured to store a vector with a plurality of data elements (col 8 lines 55-57: the shuffle instruction selects data elements from two vector registers that each store a plurality of data elements, see also col 9 lines 14-19), wherein the executing of the interleaving store instruction includes: 
retrieving at least two data elements from the plurality of data elements of the respective vector of each of the at least two registers (col 8 lines 55-57: data elements are selected from the plurality of data elements of each of two vector registers); and 
storing the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers in a storage structure of the processor (col 9 lines 14-19: the data elements retrieved from the two registers are stored in a destination register, i.e. a storage structure of the processor), 
wherein each of the at least two data elements of a first register of the at least two registers are stored to interleave with each of the at least two data elements of a second register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements selected from first register vs are stored to interleave with the data elements of the second register vt).
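For illustration only, the interleaving recited in the final limitation of claim 1 can be sketched as follows. The function name `interleave_store`, the `stride` parameter, and the use of Python lists in place of vector registers are assumptions made for this sketch; it does not reproduce the shuffle modes of Hook or applicant's disclosed implementation.

```python
def interleave_store(reg_a, reg_b, stride=2):
    """Select every `stride`-th element of each register and interleave them.

    Hypothetical helper: lists stand in for vector registers and the
    returned list stands in for the storage structure.
    """
    picked_a = reg_a[::stride]  # non-consecutive elements of the first register
    picked_b = reg_b[::stride]  # non-consecutive elements of the second register
    if len(picked_a) != len(picked_b):
        raise ValueError("registers must hold the same number of elements")
    storage = []
    for a, b in zip(picked_a, picked_b):
        storage.extend((a, b))  # element of first register, then of second
    return storage


vs = [0, 1, 2, 3, 4, 5, 6, 7]
vt = [10, 11, 12, 13, 14, 15, 16, 17]
vd = interleave_store(vs, vt)
```

With `stride=2` the sketch selects the even-indexed elements of each source (compare claim 3); with `stride=4` it selects every fourth element (compare claim 4).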

	Regarding claim 2, Hook teaches:
2. The method of claim 1, wherein the at least two data elements are non-consecutive data elements (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from vs and vt are non-consecutive in their source registers).

	Regarding claim 3, Hook teaches:
3. The method of claim 1, wherein the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprise even data elements of a respective register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise even data elements of register vs, i.e. a respective register of the two registers).


	Regarding claim 4, Hook teaches:
4. The method of claim 1, wherein the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprise every fourth data element of a respective register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise every fourth data element of vt, i.e. a respective register of the two registers).

	Regarding claim 5, Hook teaches: 
5. The method of claim 1, wherein a data element of the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprises a data element of a Byte, 2 Bytes, 4 Bytes, or 8 Bytes (col 12 lines 40-55: the data elements are 16-bit elements which is 2 Bytes).

	Regarding claim 19, Hook teaches:
19. A processor comprising, 
at least two registers (col 8 lines 55-57: the shuffle instruction selects data elements from two vector registers); and 
a storage structure coupled to the at least two registers (col 9 lines 14-19: the data elements retrieved from the two registers are stored in a destination register, i.e. a storage structure coupled to the two vector registers), 
wherein each of the at least two registers comprises a vector with a plurality of data elements (col 9 lines 14-19: the two registers each comprise a vector with a plurality of data elements), 
wherein the processor is configured to retrieve at least two data elements from the plurality of data elements of the respective vector of each of the at least two registers (col 8 lines 55-57: data elements are selected from the plurality of data elements of each of two vector registers), and store the retrieved data elements in the storage structure while executing an interleaving store instruction (col 9 lines 14-19: the data elements retrieved from the two registers are stored in a destination register, i.e. a storage structure of the processor, while executing a shuffle/interleaving store instruction), and 
wherein each of the at least two data elements of a first register of the at least two registers is configured to interleave with each of the at least two data elements of a second register of the at least two registers when stored in the storage structure (col 12 lines 40-55 and Fig. 10E: the data elements selected from first register vs are stored to interleave with the data elements of the second register vt when stored in the storage structure vd).

	Regarding claim 20, Hook teaches:
20. The system of claim 19, wherein the at least two data elements are non-consecutive data elements (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from vs and vt are non-consecutive in their source registers).

	Regarding claim 23, Hook teaches:
23. The processor of claim 19, wherein the at least two data elements retrieved from the plurality of data elements of each of the at least two registers comprise even data elements of a respective register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise even data elements of register vs, i.e. a respective register of the two registers).


	Regarding claim 24, Hook teaches:
24. The processor of claim 19, wherein the at least two data elements retrieved from the plurality of data elements of each of the at least two registers comprise every fourth data elements of a respective register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise every fourth data element of vt, i.e. a respective register of the two registers).

	Regarding claim 25, Hook teaches:
25. The processor of claim 19, wherein a data element of the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprises a data element of a Byte, 2 Bytes, 4 Bytes, or 8 Bytes (col 12 lines 40-55: the data elements are 16-bit elements which is 2 Bytes).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over van Hook et al. US 6,266,758 (hereinafter, Hook) in view of Gove et al. US 2012/0216011 (hereinafter, Gove ‘12).
Regarding claim 6, Hook teaches:
6. The method of claim 1, wherein the storage structure includes a memory (col 9 lines 14-19: the destination register vd is a memory of the processor) or a higher order cache of the processor, 
	Hook does not teach:
the method further comprises: 
generating a mask instruction; and 
executing the mask instruction to block a lane of the memory or the higher order cache, 
wherein the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers are stored in a non-blocked lane of the memory or the higher order cache.
	However, Gove ‘12 teaches:
generating a mask instruction ([0038] and [0050]: an instruction which specifies a mask, i.e. a mask instruction, is generated at 620); and 
executing the mask instruction to block a lane of the memory or the higher order cache ([0051]: the bit values of the mask determines which elements to perform an operation on, i.e. the mask blocks element lanes of the register, see also [0032]), 
wherein at least two data elements are stored in a non-blocked lane of the memory ([0041]: at least two elements are not blocked by the mask, i.e. are in a non-blocked lane, in the example of Fig. 5) or the higher order cache.
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Hook to include support for mask instructions as taught by Gove ‘12 such that at least two data elements that are shuffled into a register in Hook are selected to be operated on by a mask instruction while the other elements are blocked by the mask instruction. One of ordinary skill in the art would have been motivated to make this modification to allow for selective vector operations on partially filled registers which would allow for reduced power consumption (Gove ‘12 [0012] and [0037]).

Regarding claim 21, Hook teaches: 
21. The processor of claim 19, 
	Hook does not teach:
wherein the processor is configured to block a lane in the storage structure upon an execution of a mask instruction, and 
wherein the at least two data elements are configured to be stored in a non-blocked lane of the storage structure.
	However, Gove ‘12 teaches:
wherein the processor is configured to block a lane in the storage structure upon an execution of a mask instruction ([0038] and [0050]-[0051]: a lane of a register is blocked upon executing an instruction which specifies a mask, i.e. upon executing a mask instruction), and 
wherein at least two data elements are configured to be stored in a non-blocked lane of the storage structure ([0041]: at least two elements are not blocked by the mask, i.e. are in a non-blocked lane, in the example of Fig. 5).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Hook to include support for mask instructions as taught by Gove ‘12 such that at least two data elements that are shuffled into a register in Hook are selected to be operated on by a mask instruction while the other elements are blocked by the mask instruction. One of ordinary skill in the art would have been motivated to make this modification to allow for selective vector operations on partially filled registers which would allow for reduced power consumption (Gove ‘12 [0012] and [0037]).
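As a non-authoritative illustration of the masking concept relied upon above, the following sketch shows a mask whose bit values determine which lanes of a storage structure receive data. The name `masked_store` and the convention that a mask bit of 1 marks a non-blocked lane are assumptions of this sketch and are not taken from Gove ‘12 or from the application.

```python
def masked_store(storage, values, mask):
    """Write values[i] into lane i only where mask[i] is 1.

    Hypothetical helper: blocked lanes (mask bit 0) keep their prior contents.
    """
    if not (len(storage) == len(values) == len(mask)):
        raise ValueError("storage, values, and mask must have the same lane count")
    return [new if bit else old for old, new, bit in zip(storage, values, mask)]


lanes = masked_store([0, 0, 0, 0], [1, 2, 3, 4], [1, 0, 1, 0])
```

In this example lanes 0 and 2 are non-blocked and receive data, while lanes 1 and 3 are blocked and remain unchanged.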

Claims 9-16 are rejected under 35 U.S.C. 103 as being unpatentable over van Hook et al. US 6,266,758 (hereinafter, Hook) in view of Andrews et al. US 5,768,564 (hereinafter, Andrews).


	Regarding claim 9, Hook teaches:
9. A system comprising, 
a processor of a second type (col 9 lines 1-6: the processor on which the shuffle instruction executes is a processor of a second type for executing that instruction),
an interleaving store instruction (col 8 lines 55-57: shuffle instruction),
wherein a second type processor comprises at least two registers (col 8 lines 55-57: the shuffle instruction selects data elements from two vector registers, see also col 9 lines 14-19); 
wherein each of the at least two registers is configured to store a vector with a plurality of data elements (col 9 lines 14-19: the two vector registers each stores a vector with a plurality of elements); 
wherein the second type processor is configured to retrieve (col 8 lines 55-57: data elements are selected from the plurality of data elements of each of two vector registers) and store at least two data elements from the plurality of data elements of the respective vector of each of the at least two registers when executing the source file for the second type processor (col 9 lines 14-19: the data elements retrieved from the two registers are stored in a destination register, i.e. a storage structure of the processor); and 
wherein each of the at least two data elements of a first register of the at least two registers are stored to interleave with each of the at least two data elements of a second register of the at least two registers (col 12 lines 40-55 and Fig. 10E: the data elements selected from first register vs are stored to interleave with the data elements of the second register vt).
	Hook does not teach:
a source file for a processor of a first type; 
a translator configured to translate the source file for the first type processor into a source file for a processor of a second type; and 
a compiler configured to generate an execution file for the second type processor based on the translated source file; 
wherein the first type processor source file comprises an interleaving store instruction; 
	However, Andrews teaches:
a source file for a processor of a first type (col 1 lines 51-67: a program written in the original source language, i.e. a source file, is for a computer/processor of a first type for which a compiler is available for that language); 
a translator configured to translate the source file for the first type processor into a source file for a processor of a second type (col 1 lines 51-67: a translator converts the program/source file into intermediate code in a target language, i.e. into a source file for a target processor of a second type); and 
a compiler configured to generate an execution file for the second type processor based on the translated source file (col 1 lines 51-67: a compiler for the target processor converts the intermediate code into machine code, i.e. generates an execution file based on the translated source file); 
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Hook to include a translator and compiler to support source files written for processors of different types as taught by Andrews such that a shuffle instruction written in a language not supported by the processor of Hook would be translated and compiled for the 

	Regarding claim 10, Hook in view of Andrews teaches:
10. The system of claim 9, wherein the at least two data elements are non-consecutive data elements (Hook col 12 lines 40-55 and Fig. 10E: the data elements retrieved from vs and vt are non-consecutive in their source registers).

	Regarding claim 11, Hook in view of Andrews teaches:
11. The system of claim 9, 
wherein the data elements stored by the second type processor are configured to be same as data elements to be stored by the first type processor when the source file for the first type processor is executed by the first type processor (Andrews col 1 lines 51-67: the source file for the first type processor is translated and compiled to execute on the processor of Hook, thus data elements which are stored by the shuffle instruction in Hook after the translation are the same as data elements stored in the original source language).

	Regarding claim 12, Hook in view of Andrews teaches:
12. The system of claim 9 further comprising, a storage structure of the second type processor configured to store the interleaved data elements (Hook col 9 lines 14-19: the data elements retrieved from the two registers are stored in a destination register, i.e. a storage structure of the second type processor in Hook).

	Regarding claim 13, Hook in view of Andrews teaches:
13. The system of claim 12, wherein the storage structure includes a memory (Hook col 9 lines 14-19: the destination register vd is a memory of the processor) or a higher order cache of the second type processor.

	Regarding claim 14, Hook in view of Andrews teaches: 
14. The system of claim 9, wherein the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprise even data elements of a respective register of the at least two registers (Hook col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise even data elements of register vs, i.e. a respective register of the two registers).

	Regarding claim 15, Hook in view of Andrews teaches:
15. The system of claim 9, wherein the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprise every fourth data elements of a respective register of the at least two registers (Hook col 12 lines 40-55 and Fig. 10E: the data elements retrieved from the source registers comprise every fourth data element of vt, i.e. a respective register of the two registers).

	Regarding claim 16, Hook in view of Andrews teaches:
16. The system of claim 9, wherein a data element of the at least two data elements retrieved from the plurality of data elements of the respective vector of each of the at least two registers comprises a data element of a Byte, 2 Bytes, 4 Bytes, or 8 Bytes (Hook col 12 lines 40-55: the data elements are 16-bit elements which is 2 Bytes).

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over van Hook et al. US 6,266,758 (hereinafter, Hook) in view of Gove US 2013/0024647 (hereinafter, Gove ‘13).


	Regarding claim 22, Hook teaches:
22. The processor of claim 19, 
	Although Hook teaches that its register file is not limited to vector register files (col 4 lines 52-56), Hook does not teach:
wherein the storage structure comprises a cache configured to store the interleaved data elements.
	However, Gove ’13 teaches:
a storage structure comprises a cache configured to store data elements ([0089] and [0101]: vector registers are stored in L1 cache).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the vector register file of Hook to be a cache as taught by Gove ’13, such that the interleaved data elements stored in the destination vector register vd in Hook are stored in an L1 cache. One of ordinary skill in the art would have been motivated to make this modification to reduce the amount of space required to support vector registers (Gove ’13 [0105]).




Allowable Subject Matter
Claims 7-8, 17-18, and 26-27 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
The known prior art of record, taken alone or in combination, was not found to teach, in combination with other limitations in the claims, a size of each of two registers of a processor of a second type being larger than a size of a register of a first type processor, wherein a size of the vector of each of the two registers is the same as a size of a vector storable in the register of the first type processor, and wherein one or more lanes of each of the at least two registers are blocked to match the size of its respective vector, as described in claims 7, 17, and 26. In particular, while the prior art was found to generally teach converting a source file for a first type processor to a source file for a second type processor (Andrews) and masking lanes of a vector register (Gove ‘12), the prior art was not found to teach blocking lanes of a larger register of a second type processor to match a vector size that is storable in a smaller register of a first type processor. 
The following prior art was found to be of closest relevance:
“Dynamic Translation of Structured Loads/Stores and Register Mapping for Architectures with SIMD Extensions”- maps multiple smaller guest registers to a host’s larger registers instead of blocking lanes of the host’s larger registers to match the size of the smaller guest register or to match the size of a vector storable in a smaller guest register (section 3.3 Register Mapping)
Claims 8, 18, and 27 would also be allowable based on their dependence from claims 7, 17, and 26, respectively, which include allowable subject matter. 
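For illustration only, the distinguishing feature identified above (blocking lanes of a larger register of the second type processor so that it matches the size of a vector storable in a smaller register of the first type processor) can be sketched as follows. The names `lane_mask` and `load_guest_vector`, the lane counts, and the choice to block the upper lanes are assumptions of this sketch and are not drawn from the claims or the specification.

```python
def lane_mask(host_lanes, guest_lanes):
    """Return a mask with 1 for usable lanes and 0 for blocked lanes.

    Hypothetical helper: the upper (host_lanes - guest_lanes) lanes are blocked.
    """
    if guest_lanes > host_lanes:
        raise ValueError("guest vector does not fit in the host register")
    return [1] * guest_lanes + [0] * (host_lanes - guest_lanes)


def load_guest_vector(guest_vector, host_lanes):
    """Place a smaller guest vector into a larger host register.

    Blocked lanes are left as None to show that they hold no guest data.
    """
    mask = lane_mask(host_lanes, len(guest_vector))
    register = [None] * host_lanes
    for lane, value in enumerate(guest_vector):
        register[lane] = value
    return register, mask


register, mask = load_guest_vector([1, 2, 3, 4], host_lanes=8)
```

Here an 8-lane register holds a 4-element vector, and the four unused lanes are blocked so that the effective register size matches the vector size.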

Conclusion

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at 571-272-4169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KASIM ALLI/Examiner, Art Unit 2183