DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

I. ELECTION / RESTRICTION
Claims 12-17 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to nonelected Species II, there being no allowable generic or linking claim. Election was made without traverse in the reply filed on 1/20/2021.

II. REJECTIONS BASED ON PRIOR ART
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-11 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ronchetti et al. (US Patent 6,301,654) in view of Barsness et al. (US Pub.: 2009/0240930) and Tam et al. (US Patent 10,007,521).

As per claim 1, Ronchetti teaches/suggests an apparatus comprising: having a symbolic store address (e.g. equate to tag assigned to store instruction) for a store instruction to be executed (e.g. as assigned tag would have been buffered with the store instruction in the instruction queue or corresponding buffer/queue for comparison) (col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; and col. 5, ll. 3-31).
Ronchetti does not teach the apparatus comprising: 
a plurality of execution lanes to perform parallel execution of instructions; and 
a unified symbolic store address buffer coupled to the plurality of execution lanes, the unified symbolic store address buffer comprising a plurality of entries each for storing and to be executed by at least some of the plurality of execution lanes.
Barsness teaches/suggests an apparatus comprising: a plurality of execution lanes to perform parallel execution of instructions (e.g. associated with parallel processing mode, wherein processing of each processor corresponds to one of the execution lanes); and being coupled to the plurality of execution lanes, and to be executed by at least some of the plurality of execution lanes ([0005]; [0026]-[0031]; [0045]; [0063]-[0064]; and [0105]-[0106]). 
Tam teaches/suggests an apparatus comprising: a unified symbolic store address buffer (e.g. equate to register file 108) being coupled accordingly, the unified symbolic store address buffer comprising a plurality of entries each for storing (e.g. 
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Barsness’s parallel processing architecture and Tam’s register buffering operations into Ronchetti’s system for the benefit of obtaining the results faster (Barsness, [0005]), enhancing performance (Barsness, [0064]), and increasing issue width without significantly increasing area occupied by the processing system, without significantly increasing power dissipation of the processing system, and without significantly impacting logic complexity of the various elements of the processing system (Tam, col. 4, ll. 5-17), to obtain the invention as specified in claim 1.

As per claim 2, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 1 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus further comprising a scheduler to generate the symbolic store address based on at least some address fields of the store instruction, the symbolic store address comprising a plurality of fields including a displacement field, a base register field, and an index register field (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.

As per claim 3, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 2 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein the plurality of fields further includes a scale factor field and an operand size field (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
 
As per claim 4, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 2 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein the scheduler is, for a load instruction following the store instruction in program order, to generate a symbolic load address for the load instruction based on at least some address fields of the load instruction and access the unified symbolic store address buffer based on the symbolic load address, to determine whether the load instruction conflicts with an in-flight store instruction (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
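
For illustration only, the lookup recited in claim 4 can be sketched as follows. This is a minimal sketch and is not taken from the application or from Ronchetti, Barsness or Tam: the names SymbolicAddress and SymbolicStoreAddressBuffer, the register names, and the rule that a conflict is an exact match on every field (the fields named in claims 2 and 3) are all assumptions. The sketch also ignores the case where a base or index register is overwritten between the store and the load.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SymbolicAddress:
    """Address described by instruction fields, before any register is read."""
    displacement: int
    base_reg: Optional[str]
    index_reg: Optional[str]
    scale: int = 1
    operand_size: int = 4


class SymbolicStoreAddressBuffer:
    """Hypothetical buffer shared by all lanes: one entry per in-flight store."""

    def __init__(self) -> None:
        self._entries: Dict[int, SymbolicAddress] = {}

    def insert(self, store_id: int, addr: SymbolicAddress) -> None:
        self._entries[store_id] = addr

    def remove(self, store_id: int) -> None:
        # Called once the store is no longer in flight.
        self._entries.pop(store_id, None)

    def conflicts(self, load_addr: SymbolicAddress) -> bool:
        # Assumption: a conflict is an exact match on every field.
        return any(entry == load_addr for entry in self._entries.values())


buffer = SymbolicStoreAddressBuffer()
buffer.insert(1, SymbolicAddress(8, "rbx", "rcx", scale=4, operand_size=4))
same = SymbolicAddress(8, "rbx", "rcx", scale=4, operand_size=4)
other = SymbolicAddress(16, "rbx", "rcx", scale=4, operand_size=4)
```

Here buffer.conflicts(same) reports a conflict while buffer.conflicts(other) does not, because the two addresses differ in the displacement field. The comparison uses instruction fields only, so it needs no per-lane register values.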
 
As per claim 5, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 4 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein in response to a determination that the load Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
 
As per claim 6, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 4 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein in response to a determination that the load instruction does not conflict with the in-flight store instruction, the scheduler is to speculatively dispatch the load instruction to the plurality of execution lanes (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
 
As per claim 7, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 6 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein in response to the speculative dispatch of the load instruction, at least some of the plurality of execution lanes are to compute a lane load address for the load instruction, execute the load instruction and store the lane load address into a memory order queue of the execution lane (Ronchetti, col. 1, ll. 38-64; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
 
As per claim 8, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 7 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein at retirement of the store instruction, each of the plurality of execution lanes is to compute a lane store address for the store instruction and determine based at least in part on contents of the memory order queue whether one or more load instructions conflict with the store instruction (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
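
For illustration only, the per-lane check recited in claims 7 and 8 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or the cited references: the Lane class, its method names, the register values, and the byte-range overlap test are hypothetical. It also assumes the memory order queue holds only loads that were speculated past the retiring store.

```python
from typing import Dict, List, Optional, Tuple


class Lane:
    """Hypothetical execution lane with its own registers and memory order queue."""

    def __init__(self, regs: Dict[str, int]) -> None:
        self.regs = dict(regs)
        # Each entry: (load_id, lane load address, size in bytes).
        self.moq: List[Tuple[int, int, int]] = []

    def lane_address(self, disp: int, base: Optional[str],
                     index: Optional[str], scale: int) -> int:
        # Resolve the symbolic fields against this lane's register values.
        base_val = self.regs[base] if base is not None else 0
        index_val = self.regs[index] if index is not None else 0
        return disp + base_val + index_val * scale

    def speculative_load(self, load_id: int, disp: int, base: Optional[str],
                         index: Optional[str], scale: int, size: int) -> int:
        addr = self.lane_address(disp, base, index, scale)
        self.moq.append((load_id, addr, size))
        return addr

    def retire_store(self, disp: int, base: Optional[str],
                     index: Optional[str], scale: int, size: int) -> List[int]:
        # Compute the lane store address, then report overlapping queued loads.
        st = self.lane_address(disp, base, index, scale)
        return [load_id for load_id, ld, ld_size in self.moq
                if ld < st + size and st < ld + ld_size]


lane_a = Lane({"rbx": 100, "rcx": 2})
lane_b = Lane({"rbx": 100, "rcx": 5})
for lane in (lane_a, lane_b):
    lane.speculative_load(7, 8, "rbx", "rcx", 4, 4)
hits_a = lane_a.retire_store(16, "rbx", None, 1, 4)
hits_b = lane_b.retire_store(16, "rbx", None, 1, 4)
```

The same load and store instructions resolve to different addresses in each lane because the lanes hold different register values. In this example the load in lane_a overlaps the store and is reported, while the load in lane_b does not overlap, so the conflict is detected in one lane only.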
 
As per claim 9, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 8 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein in response to a determination of the conflict in a first execution lane, the first execution lane is to flush the one or more load instructions from the first execution lane (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 
 
As per claim 10, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 1 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein the apparatus is to dynamically disable speculative execution of load instructions based at least in part on a performance metric of an application in execution (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
 
As per claim 11, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 10 above, where Ronchetti, Barsness and Tam further teach/suggest the apparatus comprising wherein the performance metric comprises a mis-speculation rate (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious to one of ordinary skill in the art to implement the above claimed features.
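
For illustration only, the dynamic disabling recited in claims 10 and 11 can be sketched as a simple counter-based policy. This is a minimal sketch, not the applicant's or any reference's mechanism: the SpeculationGovernor class, the threshold of 0.25, and the minimum sample count are hypothetical values chosen for the example.

```python
class SpeculationGovernor:
    """Hypothetical policy: stop speculating loads when too many are flushed."""

    def __init__(self, threshold: float = 0.25, min_samples: int = 4) -> None:
        self.threshold = threshold      # assumed maximum tolerated rate
        self.min_samples = min_samples  # assumed warm-up before deciding
        self.total = 0
        self.mis_speculated = 0

    def record(self, mis_speculated: bool) -> None:
        # Count one speculative load and whether it later had to be flushed.
        self.total += 1
        if mis_speculated:
            self.mis_speculated += 1

    @property
    def rate(self) -> float:
        return self.mis_speculated / self.total if self.total else 0.0

    def speculation_enabled(self) -> bool:
        if self.total < self.min_samples:
            return True
        return self.rate <= self.threshold


governor = SpeculationGovernor(threshold=0.25, min_samples=4)
for _ in range(4):
    governor.record(False)
enabled_early = governor.speculation_enabled()
for _ in range(4):
    governor.record(True)
enabled_late = governor.speculation_enabled()
```

After four clean loads the rate is zero and speculation stays enabled. After four further mis-speculated loads the rate is 4 of 8, which exceeds the assumed threshold, so the policy reports that speculation should be disabled.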

As per claim 18, Ronchetti teaches/suggests a system comprising: a scheduler to generate (e.g. equate to dispatcher unit 271), at store address dispatch of a store instruction to be executed and prior to computation of a lane store address for the store 
Ronchetti does not teach the system comprising:
a processor comprising: 
a host processor comprising a plurality of cores, wherein a first core is to execute a first thread; and 
a data parallel cluster coupled to the host processor, the data parallel cluster comprising: 
a plurality of execution lanes to perform parallel execution of instructions of a second thread related to the first thread; 
being executed by the plurality of execution lanes and operated upon by each of the plurality of execution lanes; and 
a unified symbolic store address buffer coupled to the plurality of execution lanes to store accordingly;  and 
a system memory coupled to the processor.
Barsness teaches/suggests a system comprising: a processor comprising: a host processor comprising a plurality of cores (e.g. associated with node with plurality of processors/cores), wherein a first core is to execute a first thread (e.g. associated with master processor/core executing a thread and may spawn threads for cooperate 
Tam teaches/suggests a system comprising: a unified symbolic store address buffer (e.g. equate to register file 108) coupled to store (e.g. associated with the register file storing tags) (Fig. 1; and col. 3, l. 14 to col. 6, l. 26).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Barsness’s parallel processing architecture and Tam’s register buffering operations into Ronchetti’s system for the benefit of obtaining the results faster (Barsness, [0005]), enhancing performance (Barsness, [0064]), and increasing issue width without significantly increasing area occupied by the processing system, without significantly increasing power dissipation of the processing system, and without significantly impacting logic complexity of the various elements of the processing system (Tam, col. 4, ll. 5-17), to obtain the invention as specified in claim 18.

As per claim 19, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 18 above, where Ronchetti, Barsness and Tam further teach/suggest Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 6, l. 26), wherein it would have been obvious for the combination of the references to further teach/suggest the above claimed features as conflicts between load instructions and store instructions are detected.
 
As per claim 20, Ronchetti, Barsness and Tam teach/suggest all the claimed features of claim 19 above, where Ronchetti, Barsness and Tam further teach/suggest the system comprising wherein in response to a determination that the load instruction does not conflict with the in-flight store instruction, the plurality of execution lanes are to compute a lane load address for the load instruction, speculatively execute the load instruction and store the lane load address in a memory order queue of the execution lane, and at retirement of the store instruction compute the lane store address for the store instruction and determine, based at least in part on contents of the memory order queue, whether one or more load instructions conflict with the store instruction (Ronchetti, col. 1, ll. 38-64; col. 2, l. 16 to col. 4, l. 34; col. 5, ll. 3-31; Barsness, [0005]; [0026]-[0031]; [0045]; [0063]-[0064]; [0105]-[0106]; and Tam, Fig. 1; col. 3, l. 14 to col. 

III. PERTINENT PRIOR ART NOT RELIED UPON
Kesiraju et al. (US Pub.: 2020/0272597)

IV. CLOSING COMMENTS
CONCLUSION
STATUS OF CLAIMS IN THE APPLICATION
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. § 707.07(i):
CLAIMS REJECTED IN THE APPLICATION
Per the instant office action, claims 1-11 and 18-20 have received a first action on the merits and are the subject of a first non-final action.

DIRECTION OF FUTURE CORRESPONDENCES
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUN KUAN LEE whose telephone number is (571)272-0671. The examiner can normally be reached on Monday-Friday.
IMPORTANT NOTE
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye, can be reached at (571) 270-1023. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
/CHUN KUAN LEE/
Primary Examiner, Art Unit 2181
February 12, 2021