DETAILED ACTION
Claims 1 and 5-23 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1 and 20-21 are objected to because of the following informalities:
Either delete “each of” in the 2nd to last line OR replace “consist” with --consists-- in the last line.
Claim 17 is objected to because of the following informalities:
In line 7, the examiner asserts that it is inaccurate to state the result vector register is generated.  The register is fixed hardware that is not generated.  The examiner recommends replacing “the result vector register” with --a result vector--.  Or, applicant may claim generating a result vector to be stored in the result vector register.
Claim 18 is objected to because of the following informalities:
In line 7, the examiner again asserts that it is inaccurate to state the result vector register is generated.  The register is fixed hardware that is not generated.  The examiner recommends deleting --register-- (or making a compatible amendment based on the alternative amendment to claim 17 proposed above).
Claim 19 is objected to because of the following informalities:
In line 5, the examiner again asserts that it is inaccurate to state the result vector register is generated.  The register is fixed hardware that is not generated.  The examiner recommends replacing “the result vector register” with --a result vector--.  Or, applicant may claim generating a result vector to be stored in the result vector register.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 

(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 

(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.  Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

Claim Rejections - 35 USC § 102/103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-8, 13-14, 16-18, and 20-21 are rejected under 35 U.S.C. 102(a)(1) as anticipated by Corbal et al., U.S. Patent Application Publication No. 2012/0254589 A1 (as cited by applicant and herein referred to as Corbal), or, in the alternative, under 35 U.S.C. 103 as obvious over Corbal in view of the examiner’s taking of Official Notice.
Referring to claim 1, Corbal has taught an apparatus, comprising:
a) a set of vector registers (see FIG.10, registers 1010);
b) one or more control registers (see FIG.10, registers 1015.  Also see FIG.2, and note the mask register, which is one of registers 1015 that controls the operation); and
c) processing circuitry to execute a sequence of instructions including a splice instruction (see the VALIGN instruction of FIG.2) identifying at least a first vector register (FIG.2, source 1) and at least one control register (FIG.2, mask register), the first vector register storing a first vector of data elements (FIG.2, elements A through P) having a vector length that is dependent on a number of data elements in the first vector and a size of the data elements (from paragraphs [0030] and [0033] (first sentence), there are sixteen 32-bit elements in source 1, giving the vector a vector length of 512 bits, which would appear in a single zmm register), and the at least one control register storing control data identifying, independently of the size of the data elements, one or more data elements (see FIG.2 and note that at least the mask register identifies one or more elements.  Based on paragraphs [0029]-[0030], the identification is independent of the element size because the mask is element-based, not length-based.  So, if there are sixteen 32-bit elements in a 512-bit zmm register, sixteen mask bits are used, one for each element.  Similarly, if there are sixteen 16-bit elements in a 256-bit ymm register (or sixteen 8-bit elements in a 128-bit xmm register), the same sixteen mask bits are used.  Thus, because any single mask bit could correspond to different-sized elements, the control register identifies data independent of element size);
d) the one or more data elements comprising one of: one data element within the first vector of data elements, or a plurality of data elements occupying sequential data element positions within the first vector of data elements (see FIG.2 and note that the mask value dictates the number and sequentiality of the data elements of the first vector.  With a mask of all ones, i.e. 0xFFFF (possible via paragraph [0150]), multiple sequential elements within the first vector are selected),
e) Regarding the limitation wherein the control data comprises location data identifying a location of a given data element within the first vector register and length data, different from the location data, identifying a number of data elements, including the given data element, to extract from the first vector of data elements, this is not patentable for multiple reasons:
e1) Under a first interpretation of Corbal, from the example in FIG.2, for a mask of all ones, for instance, the rightmost ‘1’ may be the location data (of the first element), and the remaining ones indicate the length of data to extract;
e2) Under a second interpretation of Corbal, the offset in paragraph [0029] is location data that generates a starting location (e.g. the location of element D in the example of FIG.2).  Further, the mask of FIG.2 is length data that indicates how much data in the first vector (source 1) is to be extracted.  A mask of all ones in FIG.2 indicates a length of 13, i.e., extraction of 13 elements from source 1 (elements D through P).  Under this second interpretation, Corbal has not taught that the offset is in a control register.  Instead, Corbal has taught that the offset is provided within the instruction as an immediate value.  However, it is known in the art to provide an operand in a register as opposed to an immediate (just as the sources and mask are provided in registers and not as immediates).  This is one of a limited number of finite options for providing operands that would have yielded predictable results (i.e., functionally equivalently providing the offset information) and would have been obvious to try.  By placing the offset in a register, bits can be freed up in an instruction for other information or future expansion, and the offset can be made bigger (since a register will have more bits than are available in an instruction).  A bigger offset could allow for this operation to be used for smaller element sizes, or for bigger registers, where the offset size would need to be larger.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Corbal such that the offset (location data) is in a control register.
f) Corbal, alone or as modified, has further taught the processing circuitry being responsive to execution of the splice instruction to extract from the first vector each data element identified by the control data in the at least one control register and to output the extracted data elements within a result vector register of data elements that also contains data elements from a second vector (see FIG.2 and note that in addition to the elements from source 1, the result vector register (“DESTINATION”) includes elements from a second vector.  Note that the control data in this example may be just the ones corresponding to elements ED.  Alternatively, note that the mask can take on any N-bit value (the value in FIG.2 is merely an example).  There may be a mask of all ones.  In such a case, the control data may be the entire mask);
g) Corbal, alone or as modified, has further taught wherein:
g1) the processing circuitry is arranged to output each extracted data element within sequential data element positions of the result vector register starting from a first end of the result vector register.  In any example where only the rightmost N bits are ‘1’ in the mask (N > 1), each extracted element is sequentially stored to the result starting from the right end.  For instance, if the mask in FIG.2 were instead 0000000000011111, then HGFED would be extracted and stored as a sequence from the right end of the destination.  When the mask is all ones, extracted elements are similarly stored.
g2) the splice instruction further identifies a second vector register storing the second vector of data elements (see FIG.2, source 2), and the processing circuitry is responsive to execution of the splice instruction to include, at each data element position in the result vector register unoccupied by each extracted data element, a data element from the second vector of data elements (when the mask is all ones, for instance, the result vector will include only first vector elements, and second vector elements where first vector elements do not exist.  Taking the source values shown in FIG.2, but with a mask of all ones (again, any mask is possible, as it is dependent on program conditions), the result would be SRQPONMLKJIHGFED.  Thus, PONMLKJIHGFED are extracted from the first vector to occupy 13 of the 16 result vector positions.  The remaining 3 are filled with SRQ, which are elements from the second vector);
g3) the processing circuitry is arranged to include within the result vector register sequential data elements starting from a first end of the second vector of data elements (from the explanation above, when the mask is all ones, the result vector register includes elements QRS, which start from a right end of the second vector); and
g4) each of the first source vector register, the second source vector register and the result vector register consist of a same number of data elements (see FIG.2, which shows 16 elements in each of these registers).
	Under the first interpretation (e1), the claim is rejected under 35 USC 102.
	Under the second interpretation (e2), the claim is rejected under 35 USC 103.
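The all-ones-mask reading relied on in items e2) through g3) can be checked with a small element-level model.  The following is only an illustrative sketch in Python, not Corbal's implementation: the function names `valign_model` and `as_drawn`, the use of `None` for masked-off destination positions, and the contents of source 2 beyond Q, R and S are assumptions introduced here.  Element 0 is taken to be the rightmost element as drawn in FIG.2.

```python
def valign_model(src1, src2, offset, mask):
    """Element-level model: concatenate src2:src1, shift right by `offset`
    elements, and write destination position i only where mask[i] is 1."""
    n = len(src1)
    concat = src1 + src2                  # index 0 is the rightmost element of src1
    shifted = concat[offset:offset + n]
    # Assumption: masked-off positions are modelled as None (left unspecified).
    return [shifted[i] if mask[i] else None for i in range(n)]


def as_drawn(vec):
    """Render a vector with element 0 on the right, as in the figure."""
    return "".join("." if e is None else e for e in reversed(vec))


src1 = list("ABCDEFGHIJKLMNOP")           # elements A through P
src2 = list("QRS") + ["x"] * 13           # only Q, R, S matter; the rest are placeholders

all_ones = [1] * 16
result = valign_model(src1, src2, 3, all_ones)
print(as_drawn(result))                   # SRQPONMLKJIHGFED

from_src1 = [e for e in result if e in src1]
print(len(from_src1))                     # 13 (elements D through P)

low_five = [1] * 5 + [0] * 11             # mask 0000000000011111
print(as_drawn(valign_model(src1, src2, 3, low_five)))   # ...........HGFED
```

Because the model treats each element as an opaque value, the same mask and offset apply whatever the element width, which is consistent with the element-based reading of the mask given in item c).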
Referring to claim 5, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, wherein: said one or more control registers comprise at least one predicate register (see FIG.2, mask), each predicate register used to store predicate data for each data element position within a vector of data elements (there is one bit per element within a vector); and the at least one control register identified in the splice instruction comprises one of said at least one predicate register, the processing circuitry being responsive to execution of the splice instruction to determine from the predicate data each data element to be extracted from the first vector (see FIG.2).
Referring to claim 6, Corbal, alone or as modified, has taught an apparatus as claimed in claim 5, wherein the predicate data provides the location data and the length data used to determine the one or more data elements to be extracted from the first vector of data elements (see FIG.2.  The length data (mask or selected mask bits) and location data (offset or selected mask bits) are for each element (each element is shifted) and are used to determine what to extract).
Referring to claim 7, Corbal, alone or as modified, has taught an apparatus as claimed in claim 6, wherein the predicate data identifies a first extraction data element position and a last extraction data element position, and the processing circuitry determines, as the data elements to be extracted from the first vector of data elements, a sequence of data elements between the first extraction data element position and the last extraction data element position (see FIG.2).
Referring to claim 8, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, wherein: said one or more control registers comprises one or more scalar registers for storing data values (mask is a scalar register, and the offset is stored in a scalar register); the at least one control register identified in the splice instruction comprises at least one scalar register, the processing circuitry being responsive to execution of the splice instruction to use the data value in each of the at least one scalar register when determining each data element to be extracted from the first vector (see FIG.2).
Referring to claim 13, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, wherein the first vector register is a predicate register used to store predicate data for each data element position within a vector of data elements (source 1 may be considered a predicate register because the extraction of any element therein is predicated on the configuration of mask and offset.  Thus, source 1 is related to a predicated operation and thus contains predicate data for each position within its vector).
Referring to claim 14, Corbal, alone or as modified, has taught an apparatus as claimed in claim 13, wherein each data element within the first vector register comprises a single bit (this is inherent as the smallest unit of data is a bit.  Thus, any element must have at least one bit.  In Corbal, the elements comprise a single bit (say bit 0), plus some other number of bits to bring the element size to 8, 16, 32, 64, etc.).  The examiner notes that if applicant wants to limit this claim to setting forth only 1-bit elements, applicant should either replace “comprises” with --consists of-- or insert --only-- or --no more than-- after “comprises”.
Referring to claim 16, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, wherein the processing circuitry comprises vector permute circuitry (The inherent circuitry carrying out the operation of FIG.2 can create many rearrangements and orders of data in source 1, source 2, and destination.  Thus, it is vector permute circuitry.  Alternatively, this is a non-limiting label for the inherent circuitry in Corbal).
Referring to claim 17, Corbal, alone or as modified, has taught an apparatus as claimed in claim 16, wherein the vector permute circuitry comprises:
a) first shift circuitry to perform a first shift operation on the first vector of data elements (see FIG.2, the first vector in source 1 is shifted right by 3) and second shift circuitry to perform a second shift operation on the second vector of data elements (see FIG.2, the second vector in source 2 is shifted right by 3);
b) combination circuitry to generate the result vector register from the vectors output by the first and second shift circuitry (see FIG.2); and
c) analysis circuitry to analyse the control data in the at least one control register in order to determine said one or more data elements to be extracted from the first vector of data elements, and to issue control signals to control the operation of the first and second shift circuitry in dependence on said analysis (see FIG.2).
d) the processing circuitry is responsive to execution of the splice instruction to include, at each data element position in the result vector register unoccupied by the extracted data elements, a data element from the second vector of data elements (again, the mask could take on any value between 00…0 and 11…1 (FIG.2 merely shows one example).  When the mask is all 1s, for instance, then any result element not filled with an element of the first source would be filled with an element from the second source).
Referring to claim 18, Corbal, alone or as modified, has taught an apparatus as claimed in claim 17, wherein the vector permute circuitry further comprises: first mask circuitry to perform a first mask operation on the vector output by the first shift circuitry in order to produce a first masked vector, and second mask circuitry to perform a second mask operation on the vector output by the second shift circuitry in order to produce a second masked vector; the combination circuitry being arranged to generate the result vector by combining the first and second masked vectors (see FIG.2.  The rightmost half of the mask bits and corresponding circuitry mask the first vector and the leftmost half mask the second vector, the results of which are combined).
Claim 20 is rejected for similar reasons as claim 1.
Claim 21 is rejected for similar reasons as claim 1.  The examiner notes that the processing means for executing a sequence of instructions including a splice instruction, and for extracting and outputting, is interpreted under 35 U.S.C. 112(f) as at least permute unit 80, which performs the splice (see page 11, lines 24-28).  An execution unit such as this inherently includes circuitry, as is known in the art.  For instance, one example permute unit circuit is shown in FIG.5.  Another example is shown in FIG.7.  As the specific circuits of FIGs.5 and 7 (for shifting and masking, and a crossbar) are not required to carry out the splice of FIG.2, for instance, the permute unit is not limited to the structures of FIGs.5 and 7.  Instead, the permute unit is interpreted to be a broad circuit that performs the claimed functions such as those shown in FIG.2, and any equivalents thereof.  As discussed above, Corbal has taught such a circuit.

Claims 9-12, 15, 19, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Corbal in view of the examiner’s taking of Official Notice.
Referring to claim 9, Corbal, as modified, has taught an apparatus as claimed in claim 8, wherein the splice instruction identifies a first scalar register whose stored data value provides the location data and the length data used to determine the one or more data elements to be extracted from the first vector of data elements.  Again, as modified, Corbal has taught both the location data and length data each stored in a scalar register, as they are not vector values.
Referring to claim 10, Corbal, as modified, has taught an apparatus as claimed in claim 9, wherein the stored data values in the first and second scalar registers identify a first extraction data element position and a last extraction data element position, and the processing circuitry determines, as the data elements to be extracted, a sequence of data elements between the first extraction data element position and the last extraction data element position (see FIG.2.  Again, given an offset and a mask of all ones, this information identifies a first element, a last element, and extraction of elements therebetween).
Claim 11 is rejected for similar reasons as claim 9.  Note that any register may be called a predicate register or scalar register.
Referring to claim 12, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, but has not taught wherein the first vector register and the second vector register are the same vector register.  However, either operand may identify any register and by identifying the same register, one could realize a rotation or permutation of elements within a single register.  Rotation and permutation are very well-known operations in the art, and Corbal’s instruction would serve dual purpose by allowing it to specify the same register as both vector operands.  That is, not only would it perform the extraction/merge of two different vectors as shown in FIG.2, it would also allow for rotation and/or permutation of a single vector, which again are known and useful operations.  Having this capability would effectively render specific rotation and/or permutation instructions redundant, and, thus, they could be removed to open up space for other unique instructions in the instruction set.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Corbal such that the first vector register and the second vector register are the same vector register.
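The rotation rationale for claim 12 can be illustrated with a minimal sketch.  It assumes an all-ones mask and models the operation at the element level; `splice_model` is a hypothetical helper introduced here for illustration and is not an instruction or function taken from Corbal.

```python
def splice_model(src1, src2, offset):
    """All-ones-mask case: take len(src1) sequential elements of the
    concatenation src2:src1, starting `offset` elements in."""
    n = len(src1)
    return (src1 + src2)[offset:offset + n]


v = list("ABCDEFGH")
rotated = splice_model(v, v, 3)     # the same register named as both sources
print("".join(rotated))             # DEFGHABC
```

With the same register supplied for both operands, every element of the original vector survives and only its position changes, which is the behaviour of a rotate.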
Referring to claim 15, Corbal, alone or as modified, has taught an apparatus as claimed in claim 1, but has not taught wherein the processing circuitry is arranged to execute the splice instruction in each of a plurality of iterations, and, in each iteration, control data in the at least one control register identified by the splice instruction identifies one or more data elements to be extracted from the first vector of data elements that differ from the one or more data elements identified for extraction during a preceding iteration.  However, loop execution using the same instructions on different data is known in the art.  This would allow large amounts of data to be extracted/merged in different ways without having to repeat the code and lengthen the program.  A loop is a concise form of programming to allow for repetition while keeping the program smaller.  Whether this instruction appears in a loop is a choice of the programmer and depends entirely on application.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Corbal such that the processing circuitry is arranged to execute the splice instruction in each of a plurality of iterations, and, in each iteration, control data in the at least one control register identified by the splice instruction identifies one or more data elements to be extracted from the first vector of data elements that differ from the one or more data elements identified for extraction during a preceding iteration.
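The loop usage discussed for claim 15 can be sketched as follows.  This is an assumed illustration only: `extract`, the eight-element vector, and the step of two elements per iteration are hypothetical choices made here, not details from Corbal.

```python
def extract(first_vector, start, count):
    """Return `count` sequential elements beginning at position `start`."""
    return first_vector[start:start + count]


first_vector = list("ABCDEFGH")
extracted = []
# The same operation is issued in every iteration; only the control data
# (here, the start position) changes from one iteration to the next.
for start in range(0, len(first_vector), 2):
    extracted.append("".join(extract(first_vector, start, 2)))
print(extracted)                    # ['AB', 'CD', 'EF', 'GH']
```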
Referring to claim 19, Corbal, as modified, has taught an apparatus as claimed in claim 16, wherein the processing circuitry is responsive to execution of the splice instruction to include, at each data element position in the result vector unoccupied by the extracted data elements, a data element from the second vector of data elements (see the rejection of claim 1.  Again, the mask could take on any value between 00…0 and 11…1.  When the mask is all 1s, for instance, then any result element not filled with an element of the first source would be filled with an element from the second source).
Corbal has not taught wherein the vector permute circuitry comprises: programmable crossbar circuitry to generate the result vector from the first vector of data elements and the second vector of data elements; and analysis circuitry to analyse the at least one control register in order to determine said one or more data elements to be extracted from the first vector of data elements, and to issue control signals to control the operation of the programmable crossbar circuitry in dependence on said analysis.  However, a programmable crossbar is well known in the art and allows for data to be quickly sent to any location the crossbar is connected to, thereby allowing for moving data to any other possible location.  This is particularly useful in Corbal where a given element in the first vector could be moved to any element location in the destination.  A crossbar being a known and fast implementation to rearrange data, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Corbal such that the vector permute circuitry comprises: programmable crossbar circuitry to generate the result vector from the first vector of data elements and the second vector of data elements; and analysis circuitry to analyse the at least one control register in order to determine said one or more data elements to be extracted from the first vector of data elements, and to issue control signals to control the operation of the programmable crossbar circuitry in dependence on said analysis.
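The crossbar arrangement discussed for claim 19 can be pictured as a table of per-output select indices computed by an analysis step.  The sketch below is a hypothetical software model under the all-ones-mask assumption; the names `analyse` and `crossbar` and the index encoding are introduced here for illustration and are not drawn from Corbal.

```python
def analyse(n, offset):
    """Derive one select index per output position.  Assumed encoding:
    indices 0..n-1 refer to the first vector, n..2n-1 to the second."""
    return [i + offset for i in range(n)]


def crossbar(inputs, selects):
    """Each output position may be connected to any input position."""
    return [inputs[s] for s in selects]


first = list("ABCDEFGH")
second = list("QRSTUVWX")
selects = analyse(len(first), 3)
result = crossbar(first + second, selects)
print(selects)                      # [3, 4, 5, 6, 7, 8, 9, 10]
print("".join(result))              # DEFGHQRS
```

Because any output can be driven from any input, the same structure would also serve other rearrangements simply by changing the select values.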
Referring to claim 22, Corbal, alone or as modified, has taught the apparatus of claim 1, but has not taught a computer program stored on a non-transitory computer readable storage medium that, when executed by a data processing apparatus, provides a virtual machine which provides an instruction execution environment corresponding to the apparatus of claim 1.  However, a virtual machine program is known in the art.  When executed, it allows one to emulate a physical environment, for instance, in software, which relieves the need to actually have the physical environment in hardware.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Corbal to include a computer program stored on a non-transitory computer readable storage medium that, when executed by a data processing apparatus, provides a virtual machine which provides an instruction execution environment corresponding to the apparatus of claim 1.

---------------------------------------------------------------------------------------------------------------------

Claim Rejections - 35 USC § 102/103
Claims 1, 20-21, and 23 are rejected under 35 U.S.C. 102(a)(1) as anticipated by Tanaka et al., U.S. Patent Application Publication No. 2004/0068642 A1 (herein referred to as Tanaka), or, in the alternative, under 35 U.S.C. 103 as obvious over Tanaka in view of Corbal.
Referring to claim 1, Tanaka has taught an apparatus, comprising:
a) a set of vector registers (see FIG.78, vector registers Ra-Rc and Rx, which form at least part of a set);
b) one or more control registers (see FIG.78, Rx); and
c) processing circuitry to execute a sequence of instructions including a splice instruction (from paragraph [0103], FIG.78 shows execution of a bytesel instruction, which is a splice instruction that performs a permutation and would be within a sequence of executed instructions) identifying at least a first vector register (FIG.78, Ra) and at least one control register (FIG.78, Rx), the first vector register storing a first vector of data elements (FIG.78, elements a1 to a4) having a vector length that is dependent on a number of data elements in the first vector and a size of the data elements (the total vector Ra is 32 bits because there are four 8-bit elements),
d) Regarding the limitation the at least one control register storing control data identifying, independently of the size of the data elements, one or more data elements, this is not patentable for multiple reasons:
d1) From FIG.78, control register Rx includes control data that identifies one or more data elements.  The control data can be said to identify elements dependent on the number of elements in Ra and Rb, and not on the size of the elements.  That is, because there are eight elements to choose from among Ra and Rb, the control data includes 3-bit fields to select one of eight elements.  The size can be said to be inconsequential.  Under this interpretation, the claim is rejected under 35 U.S.C. 102.
d2) Alternatively, if one were to somehow interpret the Rx data as being vector size dependent, the examiner notes that this limitation is not taught by Tanaka.  However, this is only because Tanaka uses 32-bit registers.  Corbal, on the other hand, teaches different size registers in FIG.10 (128-bit xmm registers, 256-bit ymm registers, and 512-bit zmm registers) and the size of the data therein can be indicated by the instruction (paragraph [0029]).  To increase flexibility of the system and instruction in Tanaka, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tanaka such that the instruction of FIG.78 is a select instruction that can indicate the size of the elements in ones of registers of varying size.  In other words, instead of just selecting among eight 8-bit elements in 32-bit registers in FIG.78 based on 3-bit fields in Rx, Tanaka could configurably select among eight 16-bit elements in 64-bit registers; eight 32-bit elements in 128-bit registers; eight 64-bit elements in 256-bit registers, and so on, all based on the same 3-bit fields in Rx.  Thus, here it can be seen that the control data in Rx is not size dependent because the same size fields in Rx can select among different sized elements.  Under this interpretation, the claim is rejected under 35 U.S.C. 103.
e) Tanaka has further taught the one or more data elements comprising one of: one data element within the first vector of data elements, or a plurality of data elements occupying sequential data element positions within the first vector of data elements (see FIG.78.  The one or more data elements identified could include just a4, or at least sequential elements a4 and a3);
f) Tanaka has further taught wherein the control data comprises location data identifying a location of a given data element within the first vector register (the rightmost 2 bits of Rx select the location) and length data, different from the location data, identifying a number of data elements, including the given data element, to extract from the first vector of data elements (every 3rd bit starting from the right in Rx is the length data.  In the example of FIGs.78-79B, assume the 12 rightmost bits in Rx are 110 111 010 011.  This sequence would form a result of b3-b4-a3-a4.  Starting from the right end of the 12-bit sequence, the 3rd bits are 1100.  The 00 identifies two elements from Ra, including the element at the identified location);
g) Tanaka has further taught the processing circuitry being responsive to execution of the splice instruction to extract from the first vector each data element identified by the control data in the at least one control register and to output the extracted data elements within a result vector register of data elements that also contains data elements from a second vector (see FIG.78 and the example given above.  Elements are extracted from first vector Ra and put into result vector register Rc, which may also include elements from second vector Rb); wherein:
g1) the processing circuitry is arranged to output each extracted data element within sequential data element positions of the result vector register starting from a first end of the result vector register (see FIG.78.  For some values in Rx, the extracted elements from Ra may start from the right end of Rc.  For instance, a4 may be the rightmost element in Rc and a3 may be the second rightmost element in Rc);
g2) the splice instruction further identifies a second vector register storing the second vector of data elements (FIG.78, Rb), and the processing circuitry is responsive to execution of the splice instruction to include, at each data element position in the result vector register unoccupied by each extracted data element, a data element from the second vector of data elements (again, for certain masks, where an element from Ra is not extracted, an element from Rb will be extracted.  From the example above, the rightmost element of Rc may be a4, the next may be a3, and if those are the only elements from Ra, the two leftmost elements of Rc will be from Rb);
g3) the processing circuitry is arranged to include within the result vector register sequential data elements starting from a first end of the second vector of data elements (again, any values may be in Rx, including those that select elements from Rb starting from a first end of Rb); and
g4) each of the first source vector register, the second source vector register and the result vector register consist of a same number of data elements (see FIG.78).
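For illustration only, the field decoding relied upon in parts (d2) and (f) above can be sketched as follows.  This is a minimal sketch and not Tanaka's circuit.  It assumes, based solely on the worked example (110 111 010 011 forming b3-b4-a3-a4), that the upper bit of each 3-bit field chooses between Ra and Rb and that the lower two bits choose the element position.  The function names (unpack, select_elements, third_bits) and the register values are hypothetical.

```python
# Illustrative sketch only. The encoding is inferred from the worked example:
# the upper bit of each 3-bit field picks Ra (0) or Rb (1), and the lower two
# bits pick the element position within that register.

def unpack(reg, width, count=4):
    """Split an integer register into `count` elements, leftmost first."""
    mask = (1 << width) - 1
    return [(reg >> (width * (count - 1 - i))) & mask for i in range(count)]

def select_elements(rx, ra, rb, width):
    """Build the four-element result from the 12 rightmost bits of rx."""
    pool = unpack(ra, width) + unpack(rb, width)  # indices 0-3: Ra, 4-7: Rb
    fields = [(rx >> shift) & 0b111 for shift in (9, 6, 3, 0)]
    return [pool[f] for f in fields]

def third_bits(rx):
    """Upper bit of each 3-bit field, leftmost field first."""
    return [(rx >> shift) & 1 for shift in (11, 8, 5, 2)]

RX = 0b110_111_010_011

# 8-bit elements in 32-bit registers (a1..a4 = 0xA1..0xA4, b1..b4 = 0xB1..0xB4).
rc8 = select_elements(RX, 0xA1A2A3A4, 0xB1B2B3B4, width=8)

# 16-bit elements in 64-bit registers, using the same control value.
rc16 = select_elements(RX, 0x00A1_00A2_00A3_00A4, 0x00B1_00B2_00B3_00B4, width=16)

print([hex(e) for e in rc8])
print([hex(e) for e in rc16])
print(third_bits(RX))
```

Under these assumptions, both calls pick the elements standing in for b3, b4, a3 and a4, in that order, from the same control value, and the third bits read 1100.  Only the width argument changes between the two calls, which is the sense in which the control fields do not depend on element size.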
Claim 20 is rejected for similar reasons as claim 1.
Claim 21 is rejected for similar reasons as claim 1.  The examiner notes that the processing means for executing a sequence of instructions including a splice instruction, and for extracting and outputting, is interpreted under 35 U.S.C. 112(f) as at least permute unit 80, which performs the splice (see page 11, lines 24-28).  An execution unit such as this inherently includes circuitry, as is known in the art.  For instance, one example permute unit circuit is shown in FIG.5.  Another example is shown in FIG.7.  As the specific circuits of FIGs.5 and 7 (for shifting and masking, and a crossbar) are not required to carry out the splice of FIG.2, for instance, the permute unit is not limited to the structures of FIGs.5 and 7.  Instead, the permute unit is interpreted to be a broad circuit that performs the claimed functions such as those shown in FIG.2, and any equivalents thereof.  As discussed above, Tanaka has taught such a circuit.
Referring to claim 23, Tanaka, alone or as modified, has taught an apparatus as claimed in claim 1, wherein:
a) the first vector of data elements has a first end of the first vector of data elements corresponding to the first end of the second vector of data elements (see FIG.78, both Ra and Rb have a right end (having elements a4/b4)) and a second end of the first vector of data elements at an opposite end to the first end of the first vector of data elements (see FIG.78, Ra has a left end which is opposite to right end (having element a1)); and
b) the processing circuitry is responsive to the control data identifying the one or more data elements as excluding at least one data element at the second end of the first vector of data elements (the control data in Rx may be set such that at least element a1 at the left end is excluded from extraction), to include, at each data element position in the result vector register unoccupied by the one or more data elements, the sequential data elements extracted from the first end of the second vector of data elements (again, the control data in Rx may be set so that positions in Rc not occupied with elements from the first vector are filled with sequential elements from the second vector starting from the right end.  An example result would be Rc = b3-b4-a3-a4).

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Tanaka, alone or in view of Corbal, and in view of the examiner’s taking of Official Notice.
Referring to claim 22, Tanaka, alone or as modified, has taught the apparatus of claim 1, but has not taught a computer program stored on a non-transitory computer readable storage medium that, when executed by a data processing apparatus, provides a virtual machine which provides an instruction execution environment corresponding to the apparatus of claim 1.  However, a virtual machine program is known in the art.  When executed, it allows one to emulate a physical environment, for instance, in software, which relieves the need to actually have the physical environment in hardware.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tanaka to include a computer program stored on a non-transitory computer readable storage medium that, when executed by a data processing apparatus, provides a virtual machine which provides an instruction execution environment corresponding to the apparatus of claim 1.

Examiner Note
Due to time constraints, the examiner has not fully addressed all dependent claims with respect to Tanaka.  Instead, only the independent claims and at least the new dependent claim have been examined with respect to Tanaka.  However, this is not an indication that the remaining dependent claims are allowable over Tanaka.  Should applicant amend to overcome Corbal (in a manner other than simply writing a dependent claim into independent form), the examiner may reject additional dependent claims with respect to Tanaka at that time, where possible.  Any dependent claim rejections based on Tanaka will be necessitated by applicant's amendment to overcome Corbal.

Response to Arguments
On page 10 of applicant’s response, applicant requests consideration of the 476-page Intel document.
The examiner directs applicant’s attention to paragraph 4 of the final rejection mailed on April 1, 2021.  Single page 7-29 (chapter 7, page 29) of this Intel reference has already been considered and was cited by the examiner on May 1, 2019.  Thus, the citation by applicant of this document is a duplicate.  If applicant wants pages other than page 7-29 considered, applicant will need to submit another IDS with those other pages cited.

On pages 11-12 of applicant’s response, applicant’s argument with respect to the second case is persuasive.  Consequently, the portion of the rejection based on the second case has been withdrawn.

On page 12 of applicant’s response, applicant argues that when the mask is all 1s (e.g. 0xFFFF), the mask is disabled and the result would be the example of FIG.1, not FIG.2.
The examiner notes that applicant is incorrectly conflating a disabled mask with not using a mask.  These are not the same thing.  The instruction of FIG.2 will always identify a mask, and that mask could happen to be all ones, in which case, the mask is still used, but nothing is masked out, i.e., masking is effectively disabled. 
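The distinction can be illustrated with a generic sketch.  This is not taken from Corbal or from the application.  It assumes a merging form of masking in which bit i of the mask controls element i, and the name masked_merge and the sample values are hypothetical.

```python
# Generic sketch of a masked operation (assumed merging semantics):
# bit i of the mask decides whether element i comes from src or stays as dst.

def masked_merge(src, dst, mask):
    """Per element: take src where the mask bit is 1, keep dst where it is 0."""
    return [s if (mask >> i) & 1 else d
            for i, (s, d) in enumerate(zip(src, dst))]

src = list(range(16))   # sixteen source elements
dst = [-1] * 16         # prior destination contents

all_ones = masked_merge(src, dst, 0xFFFF)  # mask is consulted, nothing masked out
partial = masked_merge(src, dst, 0x00FF)   # upper eight elements masked out

print(all_ones == src)
print(partial)
```

In this sketch the all-ones case still reads and applies the mask for every element.  The result merely happens to equal the unmasked source, which is different from an operation that specifies no mask at all.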

On page 12 of applicant’s response, applicant argues that if the examiner is only able to rely on a single case (mask of all ones), then the mask cannot be reasonably construed as providing length data or position data as it imparts no data.
The examiner respectfully disagrees.  A mask of all ones indicates length/position as set forth in parts e1 and e2 in the claim 1 rejection.  Again, as long as a mask is used, it imparts information, even if that information means no masking.  For a mask to not be length/position information, it would have to not even be specified by the instruction (such as in FIG.1, which is not being relied upon by the examiner).

On page 13 of applicant’s response, applicant argues that the Office Action does not describe how the Official Notice is being applied.
The examiner disagrees.  In part (e2) of the claim 1 rejection, the examiner takes Official Notice by asserting that it is well known in the art to provide an operand in a register versus as an immediate value.  The examiner also notes that additional prior art for this has not yet been cited because applicant has not adequately traversed the Official Notice (as explained in the previous action).  Applicant has also not stated why the noticed features are not well known in the art, which is part of a proper traversal (MPEP 2144.03(C)).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/Primary Examiner, Art Unit 2183