DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

CONTINUED EXAMINATION UNDER 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/12/2021 has been entered.

RESPONSE TO ARGUMENTS
Applicant’s arguments with respect to claims 1-2, 4-8, 10-15, 17, and 19-23 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

I. OBJECTIONS TO THE CLAIMS
Claim 12 is objected to because of the following informalities:  
in claim 12, line 48, “… conveying a processing operation …” should be replaced with -… conveying the processing operation …-.
Appropriate correction is required.

II. REJECTIONS BASED ON PRIOR ART
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-8, 10-15, 17, and 19-23 are rejected under 35 U.S.C. 103 as being unpatentable over Applicant’s Admitted Prior Art (AAPA) in view of Greg et al. (US Pub.: 2008/0155135), Mizrahi (US Patent 7,334,091), Qui et al. (US Pub.: 2013/0145124), and Park et al. (US Patent 5,799,163).

As per claim 1, AAPA teaches/suggests a processor comprising: an execution unit operable to execute programs to perform processing operations (Drawings, Fig. 1, ref. 4); and one or more slave accelerators each operable to perform respective processing operations (Drawings, Fig. 1, ref. 6-8); wherein: the execution unit is operable to communicate with the slave accelerators to cause the slave accelerators to 
AAPA does not teach the processor comprising:
local storage operable to store data for including in a communication pending the inclusion of those data;
providing data into the local storage; 
to retrieve the data from the local storage;
to fetch: in a first data fetching mode, a single operand value for each of a plurality of threads; and in a second data fetching mode, plural operand values for each thread of a subset of the threads; and
to send in a different order to the order in which data are fetched into the local storage;
having a processing operation, and the processing operation that is to be performed.
Greg teaches/suggests a processor comprising: local storage (e.g. associated with Fig. 2, ref. 2160-2168) operable to store data for including in a communication 
Mizrahi teaches/suggests a system comprising: to send in a different order to the order in which data are fetched (e.g. equates to performing an out-of-order read operation with the queue) (Fig. 1-3; Fig. 8; col. 1, ll. 37-50; and col. 2, l. 3 to col. 3, l. 34).
Qui teaches/suggests a system comprising: to fetch: in a first data fetching mode, a single operand value for each of a plurality of threads (e.g. corresponds to a thread that accesses one operand to be retrieved, as each thread may execute an instruction that accesses one or more operands to be retrieved); and in a second data fetching mode, plural operand values for each thread of a subset of the threads (e.g. corresponds to a thread that accesses plural operands to be retrieved, as each thread may execute an instruction that accesses one or more operands to be retrieved) ([0004]-[0005]; and [0075]-[0079]).
Park teaches/suggests a system comprising: having a processing operation (e.g. association with instruction with corresponding addresses for source operands and opcode), and the processing operation that is to be performed (e.g. equate to instruction that is to be carried out by processor) (col. 1, ll. 23-40).
It would have been obvious for one of ordinary skill in this art, before the effective filing date of the claimed invention, to include Greg’s queuing architecture between the processors, Mizrahi’s queue/buffer operations, Qui’s fetching operations, and Park’s instruction with opcode and operands into AAPA’s processor communicating architecture for the benefit of facilitating design flexibility of the architecture (Greg, [0019]), implementing a robust FIFO queue that supports first-in-first-out read and out-of-order read (Mizrahi, col. 1, ll. 38-50), retrieving multiple operands in a single register access operation without resource conflict (Qui, [0010]), and allowing the processor to properly carry out the instruction (Park, col. 1, ll. 23-40) to obtain the invention as specified in claim 1.

As per claim 2, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the message generation circuit comprises: a data request interface via which the data fetching circuit can send requests for operand data from the storage of the execution unit where that data is stored; and a response data interface that interfaces with the local storage of the message generation circuit, via which the requested data values are returned and stored in the local storage of the message generation circuit (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), which functionally equates to the proper transferring of data via the corresponding buffering architecture.

As per claim 4, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the message generation circuit is operable to fetch in a single fetch cycle: a single operand value for each thread of a thread group that the execution unit is configured to handle (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. Qui, [0004]-[0005]; [0075]-[0079]), functionally equate to the proper transferring of data via the corresponding buffering architecture.

As per claim 5, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein each message that is sent to a slave accelerator has the capacity to convey the value of a particular operand for only a subset of the threads of a thread group that the execution unit is configured to handle (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), wherein it would have been obvious that the resulting combination of the references would further teach/suggest the above claimed features.

As per claim 6, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the local storage of the message generation circuit is in the form of a lookup table having a plurality of entries, with each entry able to store one or more data values, and having associated with it one or more of: an index value identifying the entry in the lookup table; an indication of whether the entry in question is valid; and an indication of whether the entry is free to receive new data (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), which functionally equates to the proper transferring of data via the corresponding buffering architecture.
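
For illustration only, the lookup-table form of local storage recited in claim 6 (entries carrying an index, a valid indication, and a free indication) may be pictured with the following minimal sketch. The sketch is not taken from AAPA, Greg, Mizrahi, Qui, or Park; the names Entry, LookupTable, allocate, fill, and release are hypothetical and chosen only for readability.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """One lookup-table entry (hypothetical layout, for illustration)."""
    index: int                 # index value identifying the entry
    valid: bool = False        # True once data values have been stored
    free: bool = True          # True while the entry may receive new data
    values: List[int] = field(default_factory=list)


class LookupTable:
    """Fixed-size table of entries, each holding one or more data values."""

    def __init__(self, size: int) -> None:
        self.entries = [Entry(index=i) for i in range(size)]

    def allocate(self) -> Optional[int]:
        """Claim the first free entry and return its index, or None if full."""
        for entry in self.entries:
            if entry.free:
                entry.free = False
                return entry.index
        return None

    def fill(self, index: int, values: List[int]) -> None:
        """Store data values in an allocated entry and mark it valid."""
        entry = self.entries[index]
        entry.values = list(values)
        entry.valid = True

    def release(self, index: int) -> List[int]:
        """Return the stored values and mark the entry free again."""
        entry = self.entries[index]
        values = entry.values
        entry.values = []
        entry.valid = False
        entry.free = True
        return values
```

In this sketch an entry is "free" until allocated and "valid" only after data has been written, so the two indications can differ while a fetch is outstanding.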

As per claim 7, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein there is a fixed mapping between the relative operand position for a message instruction and the entry in the local storage of the message generation circuit where the values for that operand will be stored (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), which functionally equates to the proper transferring of data via the corresponding buffering architecture.

As per claim 8, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the mapping between the relative operand positions for a message instruction and the entries in the local storage of the message generation circuit where the values for the operands will be stored are set in use, when operand values are to be stored in the local storage (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-

As per claim 10, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 21 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the data fetching circuit conveys the identity of the local storage entry that has been allocated to store the operand values to the message sending circuit by writing the identity of the allocated local storage entry into a queue from which the message sending circuit will then retrieve the local storage entry identity when it is sending a message that requires the operand values to a slave accelerator to thereby identify the local storage entry where the operand values are stored (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), which functionally equates to the proper transferring of data via the corresponding buffering architecture.

As per claim 11, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg and Qui further teach/suggest the processor comprising wherein the processor is a graphics processor (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; and Qui, [0004]-[0005]; [0024]-[0025]; [0028]; [0075]-[0079]).

As per claim 12, AAPA, Greg, Mizrahi, Qui and Park further teach/suggest the method comprising wherein in a first data fetching mode, a single operand value for each of a plurality of threads of a thread group that the execution unit is configured to handle is fetched in a single fetch cycle; and in a second data fetching mode, plural operand values for each thread of a subset of the threads of a thread group that the execution unit is configured to handle is fetched in a single fetch cycle; wherein in one message sending mode, the order in which data values are sent in messages to the slave accelerator is different to the order in which those data values are fetched into the local storage of the message generation circuit (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; Qui, [0004]-[0005]; [0075]-[0079]; and Park, col. 1, ll. 23-40).
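
For illustration only, the two data fetching modes and the reordered message sending mode recited above may be pictured with the following minimal sketch. It is not taken from any cited reference or from the application; the register layout regs[operand][thread] and the names fetch_single_operand, fetch_plural_operands, and build_messages are hypothetical assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# (thread id, operand name, value) as held in the local storage
Item = Tuple[int, str, int]


def fetch_single_operand(regs: Dict[str, List[int]], operand: str,
                         threads: Iterable[int]) -> List[Item]:
    """First mode: one operand value for each of a plurality of threads."""
    return [(t, operand, regs[operand][t]) for t in threads]


def fetch_plural_operands(regs: Dict[str, List[int]], operands: List[str],
                          thread_subset: Iterable[int]) -> List[Item]:
    """Second mode: plural operand values for each thread of a subset."""
    return [(t, op, regs[op][t]) for t in thread_subset for op in operands]


def build_messages(local: List[Item],
                   threads_per_message: int) -> List[List[Item]]:
    """Group stored values into messages covering a subset of threads each.

    Values fetched operand-by-operand leave the storage thread-group by
    thread-group, so the send order differs from the fetch order.
    """
    thread_ids = sorted({t for t, _, _ in local})
    messages = []
    for start in range(0, len(thread_ids), threads_per_message):
        group = set(thread_ids[start:start + threads_per_message])
        messages.append([item for item in local if item[0] in group])
    return messages
```

With four threads and two operands fetched in the first mode, one operand at a time, build_messages with two threads per message emits both operands for threads 0-1 before any value for threads 2-3, which is a different order from the one in which the values were fetched.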

As per claims 13-15, claims 13-15 are rejected in accordance with the same rationale and reasoning as the above rejection of claims 6-8.

As per claim 17, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 22 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the method comprising: sending with a request for operand values to be fetched, the identity of the local storage entry that has been allocated to store the operand values; and using the identity of the local storage entry that has been allocated to store the operand values sent with the fetch request to identify which local storage AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), wherein it would have been obvious for the resulting combination of the references to further teach/suggest the above claimed features by functionally equating to the proper transferring of data via the corresponding buffering architecture.

As per claim 19, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 22 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the method comprising: conveying the identity of the local storage entry that has been allocated to store the operand values to the message sending process by writing the identity of the allocated local storage entry into a queue from which it can be retrieved when sending a message that requires the operand values to a slave accelerator to thereby identify the local storage entry where the operand values are stored (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]), wherein it would have been obvious for the resulting combination of the references to further teach/suggest the above claimed features by functionally equating to the proper transferring of data via the corresponding buffering architecture.

As per claim 21, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 1 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the processor comprising wherein the data fetching circuit is configured to, when the values of an operand are to be fetched into the local storage, first obtain an entry in the local storage in which the operand values will be stored (Mizrahi, associated with obtaining address information associated with write pointer (220)); and the data fetching circuit is configured to convey the identity of the local storage entry that has been allocated to store the operand values to the message sending circuit, such that the message sending circuit can then use that identity of the local storage entry to retrieve the operand values from the local storage (Mizrahi, associated with reading data value from the queue (200) with address information associated with read pointer (210) that is being conveyed to multiplexer (800) in Figure 8) (AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]).
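
For illustration only, the claimed sequence of first obtaining a local storage entry, storing the operand values in it, and conveying the entry identity to the sending side through a queue (claims 10 and 21) may be pictured with the following minimal sketch. It does not reproduce Mizrahi's read/write pointer circuitry or any other cited reference; the names LocalStorage, fetch, and send are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class LocalStorage:
    """Small pool of entries addressed by an entry identity (hypothetical)."""

    def __init__(self, size: int) -> None:
        self.slots: List[Optional[List[int]]] = [None] * size
        self.free_ids: List[int] = list(range(size))

    def allocate(self) -> int:
        """Obtain an entry before any data arrives (IndexError if none free)."""
        return self.free_ids.pop(0)

    def write(self, entry_id: int, values: List[int]) -> None:
        self.slots[entry_id] = list(values)

    def read_and_free(self, entry_id: int) -> List[int]:
        values = self.slots[entry_id]
        self.slots[entry_id] = None
        self.free_ids.append(entry_id)
        return values if values is not None else []


def fetch(storage: LocalStorage, id_queue: Deque[int],
          operand_values: List[int]) -> int:
    """Fetch side: allocate first, store values, then convey the identity."""
    entry_id = storage.allocate()
    storage.write(entry_id, operand_values)
    id_queue.append(entry_id)
    return entry_id


def send(storage: LocalStorage, id_queue: Deque[int]) -> List[int]:
    """Send side: take the next identity from the queue and read that entry."""
    entry_id = id_queue.popleft()
    return storage.read_and_free(entry_id)
```

Because the sending side locates data by the conveyed identity rather than by position, an entry can be reused as soon as it is read without disturbing the order of pending messages.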

As per claim 22, AAPA, Greg, Mizrahi, Qui and Park teach/suggest all the claimed features of claim 12 above, where AAPA, Greg, Mizrahi and Qui further teach/suggest the method comprising, when the values of an operand are to be fetched into the local storage, first allocating (Mizrahi, associated with the pointer pointing to AAPA, Drawings, Figure 1; Specification, [0001]-[0017]; Greg, Fig. 2-3; [0006]; [0018]-[0028]; [0033]; [0057]; Mizrahi, Fig. 1-3; Fig. 8; col. 1, ll. 37-50; col. 2, l. 3 to col. 3, l. 34; and Qui, [0004]-[0005]; [0075]-[0079]).

As per claim 23, claim 23 is rejected in accordance with the same rationale and reasoning as the above rejection of claim 21.
III. CLOSING COMMENTS
CONCLUSION
STATUS OF CLAIMS IN THE APPLICATION
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P.  707.07(i):
CLAIMS REJECTED IN THE APPLICATION
Per the instant Office action, claims 1-2, 4-8, 10-15, 17, and 19-23 have received a first action on the merits and are the subject of a first non-final action.
    
DIRECTION OF FUTURE CORRESPONDENCES
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUN KUAN LEE whose telephone number is (571)272-0671.  The examiner can normally be reached on Monday-Friday.
IMPORTANT NOTE
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye can be reached on (571) 270-1023.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
/CHUN KUAN LEE/
Primary Examiner, Art Unit 2181
November 18, 2021