DETAILED ACTION
Claims 1, 3-9, and 11-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 29, 2021, has been entered.
 
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors.  Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
The disclosure submitted on March 29, 2021, is objected to for the following:
In paragraph [00128], applicant now states that cache 1313 is a shared L2 cache unit.  Given that FIG.13 shows a single core 1390, this amendment now newly sets forth that a shared L2 cache unit is within a core.  This does not appear to be supported by the original disclosure.  35 U.S.C. 132(a) states that no amendment 
--Core 1390 includes cache 1313.--.
Paragraph [00136], and its use of number 1413, is objected to for similar reasons.
In paragraph [00253], 2nd to last line, insert --factor-- where “scale” used to be.
Applicant has amended original paragraph [00253] instead of amended paragraph [00253] submitted on October 20, 2020.
In paragraph [00257], applicant replaced 2550 with 2554.  Why is 2554 used when the same paragraph previously uses 2558 to identify what is presumably the same field?  Number 2554 is associated with a BETA field.
In paragraph [00270], applicant replaced 2550 with 2554.  Why is 2554 used when the same paragraph previously uses 2559A to identify what is presumably the same field?  Number 2554 is associated with a BETA field.
Paragraph [00297] does not reflect changes made to the same paragraph on October 2, 2020.  Some language was reverted to the original language without appropriate markings (37 CFR 1.121).
In paragraph [00297], the examiner is confused as to what scale field is being referred to.  Applicant now associates the scale field of the SIB byte with field 2560.  Isn’t this paragraph discussing the scale (SS) field 2652 of the SIB byte?
Paragraph [0322] does not reflect changes made to the same paragraph on October 2, 2020.  Some language was reverted to the original language without appropriate markings (37 CFR 1.121).
Paragraphs [00365]-[00504] include similar language found in the claims, and the claims include a number of issues that have required correction, as set forth in a number of Office Actions.  Thus, these paragraphs must be corrected for similar reasons as the claims (e.g. to fix grammar, improve clarity, etc.).
Appropriate correction is required.
A substitute specification excluding the claims is required pursuant to 37 CFR 1.125(a) because there have been a significant number of amendments (more of which are needed), deletion of at least one paragraph, and amendments that are not properly indicated with underline/strike-through, among other issues.  Thus, to reduce the chance of confusion at printing, a substitute specification is appropriate.
A substitute specification must not contain new matter.  The substitute specification must be submitted with markings showing all the changes relative to the immediate prior version of the specification of record.  The text of any added subject matter must be shown by underlining the added text. The text of any deleted matter must be shown by strike-through except that double brackets placed before and after the deleted characters may be used to show deletion of five or fewer consecutive characters.  The text of any deleted subject matter must be shown by being placed within double brackets if strike-through cannot be easily perceived.  An accompanying clean version (without markings) and a statement that the substitute specification contains no new matter must also be supplied.  Numbering the paragraphs of the specification of record is not considered a change that must be shown.

Drawings
Replacement FIG.6 is objected to because of the following minor informalities:
In FIG.6, why is there a colon before each “N” in the last column of the matrix 611?  It appears the colons should be deleted.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 2652.
The replacement FIGs are objected to under 37 CFR 1.84(u)(1) because they do not appear in order.  That is, FIGs.27-35 appear at the end (after FIG.42) instead of in their numbered order.  It would appear that applicant needs to submit a full set of replacement drawings in the proper order.
All replacement FIGs submitted on October 2, 2020, and March 29, 2021, are objected to for failing to comply with 37 CFR 1.84(a)(1) and 37 CFR 1.84(l), which requires the drawings be in black, and that all drawings be made by a process which will give them satisfactory reproduction characteristics.  Every line, number, and letter must be durable, clean, solid black (except for color drawings), sufficiently dense and dark, and uniformly thick and well-defined.  The weight of all lines and letters must be heavy enough to permit adequate reproduction.  This requirement applies to all lines however fine, to shading, and to lines representing cut surfaces in sectional views.  The drawings are pixelated because applicant did not use black (RGB = 000), despite appearing black to the naked eye.  This has been confirmed by the examiner through use of an Adobe color inspection tool on applicant’s submitted pdf file.  When black is not used, the dithering used to convert applicant's grayscale image to black and white will add white pixels to try to estimate applicant's "gray" color, and the final drawings may not print properly.  Therefore, applicant must be sure to use only black and white.
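The color condition described above can be checked mechanically.  The following is a minimal sketch only, and is not the Adobe inspection tool used by the examiner.  It assumes the drawing has already been decoded into 8-bit RGB tuples, and the function name non_bilevel_pixels is hypothetical.  It reports every pixel that is neither pure black (0, 0, 0) nor pure white (255, 255, 255).

```python
# Illustrative sketch: flag pixels that are neither pure black nor pure white.
# Assumption: pixels are 8-bit (R, G, B) tuples already decoded from the file.

PURE_BLACK = (0, 0, 0)
PURE_WHITE = (255, 255, 255)


def non_bilevel_pixels(pixels):
    """Return the pixels that would need dithering to print in black and white."""
    allowed = {PURE_BLACK, PURE_WHITE}
    return [p for p in pixels if p not in allowed]


# A near-black value such as (35, 31, 32) looks black to the naked eye
# but is reported here because it is not RGB = 000.
sample = [(0, 0, 0), (255, 255, 255), (35, 31, 32)]
offenders = non_bilevel_pixels(sample)
```

An empty result from such a check would indicate that only black and white are present in the sampled pixels.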
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing 

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 16 and 18-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gueron et al., U.S. Patent Application Publication No. 2015/0277912 A1 (herein referred to as Gueron).
Referring to claim 16, Gueron has taught a processor (FIG.1, processor 100) comprising:
a) decode circuitry (FIG.1, decoder 104) to decode an instruction (FIG.1, instruction 102, which may be the “single source sort indexes” instruction of FIG.3 (and paragraph [0048]), or the “single source sort indexes and data elements” instruction of FIG.4 (and paragraph [0056])), the instruction to include a first field to identify a location of a source vector (see FIG.1 and FIG.3 (or FIG.4), field 310 (or 410), and paragraphs [0042] and [0048] (or [0057]); the instruction would include a first field to identify a location 110 of source vector 310 (or 410).  Location 110 may be a register (as shown in FIG.1), memory location, or other storage location), a second field to identify a location of a destination vector (see FIG.1 and FIG.3 (or FIG.4), and paragraphs [0042] and [0050] (or [0058]); the instruction would include a second field to identify a location 114 of destination vector 314 (or 414).  Location 114 may be a register (as shown in FIG.1), memory location, or other storage location), and an opcode to indicate to execution circuitry to execute the decoded instruction (the instruction of FIG.3 (or FIG.4) has an inherent opcode to control the execution circuitry in the desired manner) to index values of the source vector and store a result of the indexing in the destination vector by generating, per each element of the source vector, an index value using one or more comparisons (see FIG.3 (or FIG.4).  An index is generated for each source element and stored in the destination.  From paragraph [0044], this is done by sort circuitry/logic such as a compare and swap chain.
To sort values, inherent comparisons must be performed to determine the order of any two given values), wherein a type of comparison to be performed for the generation of an index value is set by one of the opcode and an immediate (as discussed above, the instruction of FIG.3 (or FIG.4) causes an inherent comparison to create sorted indices/elements (under a first interpretation, this is the type of comparison is that designed to be performed in response to this ; and
b) the execution circuitry to execute the decoded instruction as indicated by the opcode (see FIG.1, execution unit 106).
Claims 18-19 are respectively rejected for similar reasons as claims 4-5 (see below).
Referring to claim 20, Gueron has taught the processor of claim 16, wherein the type of at least one of the one or more comparisons is one of equal to, greater than, greater than or equal to, less than, and less than or equal to (this is inherent.  Any system that sorts must have logic to perform at least one of the comparison types.  The system cannot sort without determining values relative to one another).
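To illustrate why sorting necessarily involves comparisons, the following is a minimal software sketch of a compare-and-swap chain that produces sorted indexes.  It is an illustration under stated assumptions and not Gueron's circuitry: the function name sort_indexes, the ascending order, and the choice of a greater-than comparison are all assumed for the example.

```python
# Illustrative sketch of a compare-and-swap chain that sorts indexes.
# Assumptions (not taken from Gueron): ascending order, greater-than compare.

def sort_indexes(src):
    """Return source positions ordered so that src[idx[k]] is non-decreasing."""
    idx = list(range(len(src)))
    n = len(idx)
    for done in range(n - 1):
        for k in range(n - 1 - done):
            # The comparison below is the step that cannot be avoided:
            # the order of two values is only known after comparing them.
            if src[idx[k]] > src[idx[k + 1]]:
                idx[k], idx[k + 1] = idx[k + 1], idx[k]
    return idx


example_source = [18, -11, 21, 18, 3]
example_indexes = sort_indexes(example_source)
```

Replacing the greater-than comparison with a less-than comparison in the same chain would yield the opposite order, which is one way to see that a sorter always embodies at least one of the recited comparison types.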

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-9, and 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gueron, in view of Mohamed, U.S. Patent Application Publication No. 2008/0104374 A1, and the examiner’s taking of Official Notice.
Referring to claim 1, Gueron has taught a processor (FIG.1, processor 100) comprising:
a) decode circuitry (FIG.1, decoder 104) to decode an instruction (FIG.1, instruction 102, which may be the “single source sort indexes and data elements” instruction of FIG.4 and paragraph [0056]), the instruction to include a first field to identify a location of a source vector (see FIGs.1 and 4, and paragraphs [0042] and [0057]; the instruction would include a first field to identify a location 110 of source vector 410.  Location 110 may be a register (as shown in FIG.1), memory location, or other storage location), a second field to identify a location of a destination vector (see FIGs.1 and 4, and paragraphs [0042] and [0059]; the instruction would include a second field to identify a location 116 of destination vector 416.  Location 116 may be a register (as shown in FIG.1), memory location, or other storage location), and an opcode to indicate to execution circuitry to execute the decoded instruction (the instruction of FIG.4 has an inherent “single source sort indexes and data elements” opcode) to sort values of the source vector and store a result of the sort in the destination vector (see FIG.4 and note the values from source 410 are sorted and stored as destination vector 416); and
b) the execution circuitry to execute the decoded instruction as indicated by the opcode (see FIG.1, execution unit 106).
c) Gueron has also taught generating, per each element of the source vector, an index value (see FIG.4; note generation of index values 414, one for each element of the source vector 410).
d) Gueron has not explicitly taught (is silent with respect to) the generating using one or more comparisons of the element to itself and to other data elements of the source vector.  However, Mohamed, in FIG.14, has taught a hardware sorter that generates the same type of index values 1410, one for each of multiple source elements 102, using multiple comparisons of an element to itself and to other elements (for instance, at the top left, 18 is compared to itself.  Below that, 18 is compared to remaining elements -11, 21, 18, 3, etc.  This occurs for each element in a matrix of comparators).  Since Gueron and Mohamed generate index values of the same type, Mohamed’s sorter is usable in Gueron to arrive at the expected result.  As explained in paragraph [0002], this sorter provides for fast hardware sorting that is useful in a variety of using one or more comparisons of the element to itself and to other data elements of the source vector.
e) Gueron is also silent with respect to wherein a type of comparison to be performed for the generation of an index value is set by one of the opcode and an immediate.  However, note that Mohamed’s sorter can be used with different types of comparisons.  See FIG.14, and note the top operation, which is a greater-than comparison.  FIG.13 also shows it can do a less-than operation.  The greater-than and less-than operations allow for sorting in increasing and decreasing order (see paragraph [0041]).  For instance, sorting in the order opposite that shown in FIG.14 would be performed by indicating a less-than operation, and counting the number of 1s below the diagonal in output 1402.  The examiner notes that such a sorter would provide Gueron the ability to perform multiple comparison/sorting operations with tie-break functionality.  In addition, it is known in the art to distinguish operations (including types of comparisons) via opcode or immediate fields of an instruction.  This would allow Gueron’s instruction to directly control the sorter as opposed to having to look in a separate mode register, or to retrieve data from a general purpose register.  Consequently, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Gueron such that a type of comparison to be performed for the generation of an index value is set by one of the opcode and an immediate.  This would allow, for instance, targeted comparison/sorting operations.  For instance, one could perform a greater-than operation (using a greater-than opcode/immediate) to sort in one direction, and a less-than operation (using a less-than opcode/immediate) to sort in the opposite direction.
f) Gueron has also not explicitly taught permuting the values of the elements of the source vector based upon the index values for the elements.  Instead, with respect to the operation of FIG.4, Gueron discusses storing the index values, and permuting the source values, but not necessarily permuting based upon the index values for the elements.  However, in paragraph [0053], Gueron sets forth that when a different instruction (that in FIG.3) stores only index values (and not sorted source values), a subsequent permutation instruction uses those index values to store sorted source values.  Thus, Gueron teaches permuting based on the index values.  One would expect the permutation to work the same regardless of whether it is performed on its own (following the instruction of FIG.3), or as part of a combined operation (performed by the instruction of FIG.4).  Even if this were not strongly implied by Gueron, the examiner notes that this is a trivial implementation that would have been obvious to try to realize the desired result.  It is a predictable solution with an expectation of success (because paragraph [0053] indicates the solution works).  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gueron for permuting the values of the elements of the source vector based upon the index values for the elements when executing the instruction of FIG.4.
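The combination discussed in the claim 1 rejection (index generation by comparing each element to itself and to the other elements, a comparison type chosen by the instruction, a tie-break, and permutation based on the resulting index values) can be sketched in software as follows.  This is an illustrative sketch only and is not the circuitry of Gueron or Mohamed.  The names rank_indices, permute_by_index, and COMPARE_BY_IMM, and the immediate encoding (0 for greater-than, 1 for less-than), are hypothetical and chosen solely for the example.

```python
import operator

# Hypothetical encoding: an immediate value selects the comparison type.
COMPARE_BY_IMM = {0: operator.gt, 1: operator.lt}


def rank_indices(src, imm=0):
    """Generate one index value per element using an all-pairs comparison."""
    compare = COMPARE_BY_IMM[imm]
    n = len(src)
    ranks = []
    for i in range(n):
        # Compare element i against every element, including itself.
        primary = sum(1 for j in range(n) if compare(src[j], src[i]))
        # Tie-break: count equal elements that sit at earlier positions.
        ties = sum(1 for j in range(i) if src[j] == src[i])
        ranks.append(primary + ties)
    return ranks


def permute_by_index(src, ranks):
    """Place each source value at the destination position given by its index."""
    dst = [None] * len(src)
    for value, position in zip(src, ranks):
        dst[position] = value
    return dst


source = [18, -11, 21, 18, 3]
descending = permute_by_index(source, rank_indices(source, imm=0))
ascending = permute_by_index(source, rank_indices(source, imm=1))
```

In this sketch, selecting the greater-than comparison sorts in one direction and selecting the less-than comparison sorts in the opposite direction, while the equal-to tie-break keeps the two 18s from receiving the same index value.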
Referring to claim 3, Gueron, as modified, has taught the processor of claim 1, but has not taught wherein the processor is a graphics processing unit (GPU) that supports ray tracing.  However, Gueron has taught that the processor could be a graphics processor (see at least paragraph [0037]).  Ray tracing is a known concept in the art for rendering graphics by taking into account paths of light and how they interact with other objects.  It allows for a high degree of virtual realism compared to other rendering techniques.  In addition, ray tracing is known to use sorting to improve data locality and performance.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gueron such that the processor is a graphics processing unit (GPU) that supports ray tracing.
Referring to claim 4, Gueron, as modified, has taught the processor of claim 1, wherein the locations of the destination vector and the source vector are vector registers (see FIG.1, which shows the vectors in packed registers, and paragraph [0042]).
Referring to claim 5, Gueron, as modified, has taught the processor of claim 1, wherein the location of the destination vector is a vector register (see FIG.1, which shows the destination vector 116 in a packed/vector register) and the location of the source vector is at least one location in memory (see paragraph [0042].  Instead of a register, as shown in FIG.1, source 110 could be stored in a memory location).
Referring to claim 6, Gueron, as modified, has taught the processor of claim 1, wherein the type of comparison is one of equal to, greater than (see FIG.14 of Mohamed, and note that comparison 1404 involves a greater-than (XJ > XI) comparison), greater than or equal to, less than, and less than or equal to.
Referring to claim 7, Gueron, as modified, has taught the processor of claim 1, wherein to break ties between comparison results, the execution circuitry is to perform a first comparison between elements and a second comparison between elements (see FIG.14 of Mohamed, and note that a greater-than comparison (1404) is first performed between elements, and because there are ties in the comparison results 1406 (e.g. there are three 4s and two 1s), a second comparison (equal-to) 1402 is performed to break the tie and realize the final indexes 1408).
Referring to claim 8, Gueron, as modified, has taught the processor of claim 1, wherein the execution circuitry comprises matrix operations circuitry (see Mohamed, FIG.14, which makes use of a comparator matrix to perform the comparisons (see paragraph [0038])).
Claims 9 and 11-13 are respectively rejected for similar reasons as claims 1 and 3-5.
Claim 14 is rejected for similar reasons found in claims 1 and 7.
Claim 15 is rejected for similar reasons found in claims 1 and 8.
Claim 16 is alternatively rejected for similar reasons as claim 1.  Note that claim 16 does not actually require the permutation based upon the index values.  Thus, part (f) of the claim 1 rejection is not applicable here.
Claims 17-20 are alternatively and respectively rejected for similar reasons as claims 3-6.

Claims 9 and 12-14 are alternatively rejected under 35 U.S.C. 103 as being unpatentable over Gueron.
Claim 9 is rejected for similar reasons found in the rejection of claim 16 and claim 1 (part (f)).  Note that, since claim 9 does not require the comparisons of claim 1, Mohamed is unnecessary to reject this claim.
Claims 12-14 are respectively rejected for similar reasons as claims 4-5 and 20.

Claims 11, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Gueron in view of the examiner’s taking of Official Notice.
Claim 11 is rejected for similar reasons as claim 3.
Referring to claim 15, Gueron, as modified, has taught the processor of claim 9, but, based on the interpretation taken for the alternative rejection of claim 9, has not taught wherein the execution circuitry comprises matrix operations circuitry, when interpreted to perform operations on matrices.  However, such circuitry is known in the art to carry out well known matrix math operations such as matrix multiplication, matrix addition, etc.  As a result, in order to make Gueron useful for matrix mathematics, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gueron such that the execution circuitry comprises matrix operations circuitry for performing operations on matrices.
Claim 17 is rejected for similar reasons as claim 3.

Response to Arguments
On page 15 of applicant’s response, applicant notes that no changes have been made to paragraphs [00365]-[00504].
The examiner’s objection is maintained.  As previously stated, there are other issues in these paragraphs that require correction.  For example, in paragraphs [00368], [00377], [00388], and many others, “source vectors” needs to be replaced with --source vector--.  Please review all objections/112(b) rejections (throughout prosecution) for other issues.

On page 15 of applicant’s response, applicant notes that no changes have been made to FIG.6.
The examiner’s objection is maintained.

On page 15 of applicant’s response, applicant argues that there is nothing in 37 CFR 1.84(u) that requires the sheets be ordered.
The examiner respectfully disagrees.  37 CFR 1.84(u)(1) states “The different views must be numbered in consecutive Arabic numerals, starting with 1, independent of the numbering of the sheets and, if possible, in the order in which they appear on the drawing sheet(s).”  Applicant’s original drawings were in order.  This is proof that consecutively numbering the drawings in the order they appear is possible.

On page 16 of applicant’s response, applicant argues the 102 rejection, stating that a type of comparison to be performed being set by one of the opcode and immediate has not been asserted or described.
The examiner respectfully disagrees.  See the final rejection mailed on November 27, 2020, paragraph 17, part (a), near the end.

On pages 17-18 of applicant’s response, applicant argues the 103 rejection, stating that it is unclear as to what relevance Mohamed’s comparisons have to dictating the comparison type using an immediate or opcode.  Applicant states that Mohamed does not use an opcode or immediate to dictate the type of comparison.
As stated, it would have been obvious to use Mohamed’s sorter in Gueron to generate the index values.  Mohamed’s sorter is flexible and allows for different types of comparisons to be performed.  As such, the sorter inherently requires input to control which type of comparison to perform.  When Gueron’s sort instruction uses Mohamed’s sorter to generate the desired output, the instruction, at least via opcode, will tell the sorter how to operate to generate the result.  Also, the examiner notes that an opcode and/or immediate value is known to provide control.  An example is the IA-64 compare instruction, which has various bits within the instruction that 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/
Primary Examiner, Art Unit 2183