DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

RESPONSE TO ARGUMENTS
Applicant's arguments filed 9/9/2022 have been fully considered but they are not persuasive.

In response to applicant’s arguments with regard to independent claim 1, rejected under 35 U.S.C. 103(a), that the combination of the references does not teach/suggest the claimed feature “… a plurality of arithmetic logic units … a circuit coupled from the first vector register and the second vector register to a first arithmetic logic unit of the plurality of arithmetic logic units … the circuit further couples the first vector register and the second vector register to a second arithmetic logic unit of the plurality of arithmetic logic units …” because Mimar does not cure BURGESS’ defect of having a single execution stage 12, as Mimar does not show any detail about its multiple ALUs or how the circuitry might be connected to those ALUs: applicant's arguments have been fully considered, but are not found to be persuasive.
The examiner respectfully disagrees. To further clarify, Mimar’s plurality of processing elements with corresponding ALUs is part of a SIMD unit (Mimar, Fig. 1; and [0025]), wherein it is well-known to one of ordinary skill in the art that the SIMD unit’s processing elements carry out parallel processing/operations. In addition, High-Performance embedded computing by João M.P. Cardoso, José Gabriel F. Coutinho, and Pedro C. Diniz shows in Figure 2.10 how a SIMD unit’s plurality of processing elements (e.g. associated with the plurality of Op 1) may be connected to the vectors (e.g. associated with Vector A and Vector B). Additionally, Wikipedia: Single instruction, multiple data discloses that Single Instruction, Multiple Data (SIMD) includes multiple processing elements that perform the same operation on multiple data points simultaneously (on page 1).
As applicant appears to be applying the above arguments for independent claim 1 towards independent claims 15 and 18, the examiner will also apply the above response for independent claim 1 towards independent claims 15 and 18.
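The parallel operation of SIMD processing elements relied upon in the above response can be illustrated with a minimal sketch. It assumes, for illustration only, that each processing element applies the same two-input operation to one pair of elements taken from Vector A and Vector B; the function name `simd_apply` and the sample values are hypothetical and are not taken from Mimar or from the cited references.

```python
from typing import Callable, List


def simd_apply(op: Callable[[int, int], int],
               vector_a: List[int],
               vector_b: List[int]) -> List[int]:
    """Apply one operation to every lane of two equal-length vectors."""
    if len(vector_a) != len(vector_b):
        raise ValueError("vectors must have the same number of lanes")
    # Each (a, b) pair stands for one processing element. Hardware would
    # evaluate all lanes at once; this comprehension models them in sequence.
    return [op(a, b) for a, b in zip(vector_a, vector_b)]


# Four hypothetical lanes, all performing the same addition.
result = simd_apply(lambda a, b: a + b, [1, 2, 3, 4], [10, 20, 30, 40])
```

In this sketch a single operation is issued once and produces one result per lane, which is the sense in which the processing elements are said to operate in parallel.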

I. DOUBLE PATENTING
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claim 15 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 8-9 of U.S. Patent No. 10,877,925. Although the claims at issue are not identical, they are not patentably distinct from each other because U.S. Patent No. 10,877,925 teaches/suggests an apparatus, comprising:
a plurality of vector registers configured to store a plurality of vectors of data elements respectively (e.g. vector registers with data to be communicated to corresponding 2:1 multiplexer) (claim 8); 
a register configured to store an index (e.g. VFR value from VFR) (claims 8-9); 
a plurality of processing units configured to operate in parallel (e.g. ALUs) (claim 8); and 
a circuit (e.g. 2:1 multiplexers) controlled by the register and coupled between the plurality of vector registers and the plurality of processing units, the circuit configured to provide a plurality of input vectors to the plurality of processing units respectively, wherein the circuit is configured to generate each respective input vector in the plurality of input vectors using: a first portion selected according to the index from a first vector register in the plurality of vector registers; and a second portion selected according to the index from a second vector register in the plurality of vector registers (claims 8-9). (Please note that, as both the instant application and the patent claim similar subject matter, the examiner is selecting one independent claim from each for the instant double patenting rejection.)

II. REJECTIONS BASED ON PRIOR ART
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over BURGESS et al. (US Pub.: 2019/0213009) in view of Mimar (US Pub.: 2013/0212354).

As per claim 1, BURGESS teaches/suggests an apparatus, comprising: a first vector register configured to store a first vector having first data elements of a predetermined number (e.g. associated with one of the registers 22 in Fig. 3: [0061]); a second vector register configured to store a second vector having second data elements of the predetermined number (e.g. associated with another one of the registers 22 in Fig. 3: [0061]); a third register configured to store an index (e.g. associated with DLR 46 in Fig. 3: [0061]); an arithmetic logic unit (e.g. associated with execution stage 12 in Fig. 1: [0051]-[0053]; and [0061]); and a circuit (e.g. associated with Fig. 3, ref. 44) coupled from the first vector register and the second vector register (e.g. associated with registers 22 in Fig. 3) to the arithmetic logic unit (e.g. associated with execution stage 12 in Fig. 1), the circuit configured to provide a third vector, the third vector including data elements selected according to the index from the first vector and the second vector (e.g. associated with multiplexer 44 outputting to the execution stage 12 based on the bit in the DLR 46), the circuit further couples the first vector register and the second vector register accordingly (Fig. 1; Fig. 3; [0051]-[0053] and [0061]).
BURGESS does not expressly teach the apparatus comprising: 
a plurality of arithmetic logic units; and
coupled to a first arithmetic logic unit of the plurality of arithmetic logic units, to provide to the first arithmetic logic unit to generate an output vector, further couples to a second arithmetic logic unit of the plurality of arithmetic logic units.
Mimar teaches/suggests an apparatus, comprising: a plurality of arithmetic logic units (e.g. associated with a plurality of processing elements of the SIMD unit for parallel operations); and coupled to a first arithmetic logic unit of the plurality of arithmetic logic units (e.g. associated with being coupled to a first processing element of the plurality of processing elements), to provide to the first arithmetic logic unit to generate an output vector (e.g. associated with output of vector operation unit 180 in Fig. 1), further couples to a second arithmetic logic unit of the plurality of arithmetic logic units (e.g. associated with a second processing element of the plurality of processing elements) (Fig. 1; and [0025]-[0029], wherein it is well-known to one of ordinary skill in the art that the SIMD unit’s processing elements carry out parallel processing/operations: please see High-Performance embedded computing, Section 2.3.1; Figure 2.10; and Wikipedia: Single instruction, multiple data, page 1).
It would have been obvious for one of ordinary skill in this art, before the effective filing date of the claimed invention, to include Mimar’s processing architecture including the vector operation unit into BURGESS’ apparatus for the benefit of implementing an efficient sorting of data array elements (Mimar, [0006]-[0007]) to obtain the invention as specified in claim 1.

As per claim 2, BURGESS and Mimar teach/suggest all the claimed features of claim 1 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the third vector has the predetermined number of third data elements, including a first portion selected from the first vector and a second portion selected from the second vector (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 3, BURGESS and Mimar teach/suggest all the claimed features of claim 2 above, where BURGESS and Mimar further teach/suggest the apparatus wherein in the third vector the second portion follows the first portion as a vector input to the arithmetic logic unit (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 4, BURGESS and Mimar teach/suggest all the claimed features of claim 3 above, where BURGESS and Mimar further teach/suggest the apparatus wherein in the first vector register the first portion starts at the index; and in the second vector register the second portion ends before the index (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.
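The index-based selection recited in claims 2-4 can be illustrated with a minimal sketch. It assumes one reading of the claim language: the first portion runs from the index to the end of the first vector register, the second portion runs from element 0 up to (but not including) the index in the second vector register, and the second portion follows the first. The function name `select_third_vector` and the sample values are hypothetical and are not drawn from BURGESS or Mimar.

```python
from typing import List, Sequence


def select_third_vector(first: Sequence[int],
                        second: Sequence[int],
                        index: int) -> List[int]:
    """Form the third vector from two equal-length vector registers."""
    n = len(first)
    if len(second) != n:
        raise ValueError("registers must hold the same number of elements")
    if not 0 <= index <= n:
        raise ValueError("index must lie within the register")
    first_portion = list(first[index:])    # starts at the index (assumed to run to the end)
    second_portion = list(second[:index])  # ends before the index (assumed to start at 0)
    # The second portion follows the first, so the result again has n elements.
    return first_portion + second_portion


third = select_third_vector([0, 1, 2, 3], [4, 5, 6, 7], index=1)
```

Under these assumptions the third vector always has the same predetermined number of elements as each source register, whatever value the index takes.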

As per claim 5, BURGESS and Mimar teach/suggest all the claimed features of claim 4 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the circuit includes a multiplexer configured to select the first portion and the second portion from the first vector register and the second vector register to form the third vector as the vector input to the arithmetic logic unit (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 6, BURGESS and Mimar teach/suggest all the claimed features of claim 5 above, where BURGESS and Mimar further teach/suggest the apparatus, wherein the apparatus further comprises: a fourth vector register configured to store a fourth vector having fourth data elements of the predetermined number; and wherein the circuit is further configured to couple the second vector register and the fourth vector register to the second arithmetic logic unit, the circuit including a multiplexer configured to select data elements according to the index from the second vector register and the fourth vector register to provide a fifth vector as a vector input to the second arithmetic logic unit (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 7, BURGESS and Mimar teach/suggest all the claimed features of claim 6 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the fifth vector has the predetermined number of fifth data elements, including a third portion selected from the second vector register and a fourth portion selected from the fourth vector register (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 8, BURGESS and Mimar teach/suggest all the claimed features of claim 7 above, where BURGESS and Mimar further teach/suggest the apparatus wherein in the fifth vector the fourth portion follows the third portion as the vector input to the second arithmetic logic unit; in the second vector register the third portion starts at the index; and in the fourth vector register the fourth portion ends before the index; the second portion and the third portion are non-overlapping portions in the second vector register; and the second portion and the third portion provide the second data elements stored in the second vector register (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.
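The relationship among the portions recited in claims 6-8 can be sketched as follows. The sketch assumes that each portion that "starts at the index" runs to the end of its register and that each portion that "ends before the index" starts at element 0; under that reading the second and third portions split the second vector register without overlap. The names `build_alu_inputs`, `first`, `second`, and `fourth` are hypothetical labels for the claimed registers and are not taken from either reference.

```python
from typing import List, Tuple


def build_alu_inputs(first: List[int],
                     second: List[int],
                     fourth: List[int],
                     index: int) -> Tuple[List[int], List[int]]:
    """Return (third vector, fifth vector) for the first and second ALUs."""
    n = len(first)
    if len(second) != n or len(fourth) != n:
        raise ValueError("registers must hold the same number of elements")
    if not 0 <= index <= n:
        raise ValueError("index must lie within the register")
    # Input to the first ALU: first portion, then second portion.
    third = first[index:] + second[:index]
    # Input to the second ALU: third portion, then fourth portion.
    fifth = second[index:] + fourth[:index]
    return third, fifth


second_register = [4, 5, 6, 7]
third, fifth = build_alu_inputs([0, 1, 2, 3], second_register, [8, 9, 10, 11], index=1)

# The second portion (before the index) and third portion (from the index)
# together reproduce the second register, with no element used twice.
second_portion = second_register[:1]
third_portion = second_register[1:]
```

In this sketch every element of the second vector register reaches exactly one of the two arithmetic logic units, which is one way to read the "non-overlapping portions" limitation.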

As per claim 9, BURGESS and Mimar teach/suggest all the claimed features of claim 8 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the first arithmetic logic unit and the second arithmetic logic unit process the third vector and the fourth vector respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 10, BURGESS and Mimar teach/suggest all the claimed features of claim 5 above, where BURGESS and Mimar further teach/suggest the apparatus, wherein the apparatus further comprises: a fourth vector register configured to store a fourth vector having fourth data elements of the predetermined number; and wherein the circuit is further configured to couple the fourth vector register and the first vector register to the second arithmetic logic unit, the circuit including a multiplexer configured to select data elements according to the index from the fourth vector register and the first vector register to provide a fifth vector as a vector input to the second arithmetic logic unit (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 11, BURGESS and Mimar teach/suggest all the claimed features of claim 10 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the fifth vector has the predetermined number of fifth data elements, including a third portion selected from the fourth vector register and a fourth portion selected from the first vector register (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 12, BURGESS and Mimar teach/suggest all the claimed features of claim 11 above, where BURGESS and Mimar further teach/suggest the apparatus wherein in the fifth vector the fourth portion follows the third portion as the vector input to the second arithmetic logic unit; in the fourth vector register the third portion starts at the index; and in the first vector register the fourth portion ends before the index (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 13, BURGESS and Mimar teach/suggest all the claimed features of claim 12 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the first portion and the fourth portion are non-overlapping portions in the first vector register; the first portion and the fourth portion provide the first data elements stored in the first vector register; and the first arithmetic logic unit and the second arithmetic logic unit process the third vector and the fifth vector respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 14, BURGESS and Mimar teach/suggest all the claimed features of claim 5 above, where BURGESS and Mimar further teach/suggest the apparatus, wherein the apparatus further comprises: a second arithmetic logic unit, wherein the circuit is further configured to couple the second vector register and the first vector register to the second arithmetic logic unit, the circuit including a multiplexer configured to select data elements according to the index from the second vector register and the first vector register to provide a fourth vector as a vector input to the second arithmetic logic unit, the first arithmetic logic unit and the second arithmetic logic unit configured to process the third vector and the fourth vector respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 15, BURGESS teaches/suggests an apparatus, comprising: a plurality of vector registers configured to store a plurality of vectors of data elements respectively (e.g. associated with registers 22 in Fig. 3: [0061]); a register configured to store an index (e.g. associated with DLR 46 in Fig. 3: [0061]); a processing unit (e.g. associated with execution stage 12 in Fig. 1: [0051]-[0053]; and [0061]); and a circuit (e.g. associated with Fig. 3, ref. 44) controlled by the register (e.g. associated with DLR 46 in Fig. 3: [0061]) and coupled between the plurality of vector registers (e.g. associated with registers 22 in Fig. 3: [0061]) and the processing unit, wherein the circuit is configured to generate an input vector using: a first portion selected according to the index from a first vector register in the plurality of vector registers; and a second portion selected according to the index from a second vector register in the plurality of vector registers (e.g. associated with multiplexer 44 outputting to the execution stage 12 based on the bit in the DLR 46) (Fig. 1; Fig. 3; [0051]-[0053] and [0061]).
BURGESS does not teach the apparatus, comprising: 
a plurality of processing units configured to operate in parallel; and
the circuit configured to provide a plurality of input vectors to the plurality of processing units respectively, wherein the circuit is configured to generate each respective input vector in the plurality of input vectors.
Mimar teaches/suggests an apparatus, comprising: a plurality of processing units configured to operate in parallel (e.g. associated with SIMD unit’s vector operation unit 180 in Fig. 1, wherein it is well-known to one of ordinary skill in the art that the SIMD unit’s processing elements carry out parallel processing/operations: please see High-Performance embedded computing, Section 2.3.1; Figure 2.10; and Wikipedia: Single instruction, multiple data, page 1); and the circuit configured to provide a plurality of input vectors (e.g. associated with output from select logic #1 150 and select logic #2 160 in Fig. 1) to the plurality of processing units (e.g. associated with vector operation unit 180 in Fig. 1) respectively, wherein the circuit is configured to generate each respective input vector in the plurality of input vectors accordingly (Fig. 1; and [0025]-[0029]).
It would have been obvious for one of ordinary skill in this art, before the effective filing date of the claimed invention, to include Mimar’s processing architecture including the vector operation unit and select logic into BURGESS’ apparatus for the benefit of implementing an efficient sorting of data array elements (Mimar, [0006]-[0007]) to obtain the invention as specified in claim 15.

As per claim 16, BURGESS and Mimar teach/suggest all the claimed features of claim 15 above, where BURGESS and Mimar further teach/suggest the apparatus wherein each vector of data elements stored in a respective vector register in the plurality of vector registers has two portions identified by the index; and the circuit is configured to provide the two portions respectively in two different input vectors in the plurality of input vectors (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 17, BURGESS and Mimar teach/suggest all the claimed features of claim 16 above, where BURGESS and Mimar further teach/suggest the apparatus wherein the circuit includes a plurality of multiplexers configured to provide the plurality of input vectors to the plurality of processing units respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.
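The arrangement recited in claims 15-17, in which a plurality of multiplexers generates one input vector per processing unit, can be sketched in general form. The sketch assumes, purely for illustration, that processing unit i receives the portion from the index onward of register i followed by the portion before the index of the next register, wrapping around at the last register, and that each processing unit simply sums its input. The wrap-around pairing, the summing operation, and the names `generate_input_vectors` and `run_units` are hypothetical and are not taken from BURGESS, Mimar, or the claims.

```python
from typing import List


def generate_input_vectors(registers: List[List[int]], index: int) -> List[List[int]]:
    """Build one input vector per processing unit from the vector registers."""
    n = len(registers)
    if n == 0:
        raise ValueError("at least one vector register is required")
    width = len(registers[0])
    if any(len(r) != width for r in registers):
        raise ValueError("registers must hold the same number of elements")
    if not 0 <= index <= width:
        raise ValueError("index must lie within the register")
    # One hypothetical multiplexer per unit: it takes the portion from the
    # index onward of register i, then the portion before the index of the
    # next register (wrap-around pairing is an assumption of this sketch).
    return [registers[i][index:] + registers[(i + 1) % n][:index]
            for i in range(n)]


def run_units(input_vectors: List[List[int]]) -> List[int]:
    """Model each processing unit as a sum; hardware would run them at once."""
    return [sum(v) for v in input_vectors]


inputs = generate_input_vectors([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], index=2)
outputs = run_units(inputs)
```

With this pairing, the two portions of each register identified by the index end up in two different input vectors, which corresponds to the limitation discussed for claim 16.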

As per claim 18, claim 18 is rejected in accordance with the same rationale and reasoning as the above rejection of claims 1 and 15.

As per claim 19, BURGESS and Mimar teach/suggest all the claimed features of claim 18 above, where BURGESS and Mimar further teach/suggest the method wherein the generating comprises: selecting, using a plurality of multiplexers in the circuit and according to the index, data elements from the plurality of vector registers to generate the plurality of input vectors respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.

As per claim 20, BURGESS and Mimar teach/suggest all the claimed features of claim 18 above, where BURGESS and Mimar further teach/suggest the method wherein the generating comprises: retrieving a plurality of data elements from the plurality of vector registers respectively in parallel; and directing, using a plurality of multiplexers in the circuit and according to the index, the plurality of data elements to the plurality of processing units respectively in parallel (BURGESS, Fig. 1; Fig. 3; [0051]-[0053]; [0061]; and Mimar, Fig. 1; [0025]-[0029]), wherein it would have been obvious to one of ordinary skill in the art to further implement the above claimed features.



III. PERTINENT RELATED PRIOR ART
High-Performance embedded computing: discloses Single Instruction, Multiple Data (SIMD) units that perform the same operation on multiple data concurrently, wherein the SIMD unit may execute four operations in parallel (Section 2.3.1 and Figure 2.10, on page 1).
Wikipedia: Single instruction, multiple data: discloses that Single Instruction, Multiple Data (SIMD) includes multiple processing elements that perform the same operation on multiple data points simultaneously (on page 1).

IV. CLOSING COMMENTS

CONCLUSION
STATUS OF CLAIMS IN THE APPLICATION
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. § 707.07(i):
CLAIMS REJECTED IN THE APPLICATION
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
    
DIRECTION OF FUTURE CORRESPONDENCES
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUN KUAN LEE whose telephone number is (571)272-0671. The examiner can normally be reached on Monday-Friday.
IMPORTANT NOTE
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye, can be reached on (571) 270-1023. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHUN KUAN LEE/
Primary Examiner, Art Unit 2181
October 26, 2022