Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/28/2020 has been entered.

Response to Amendment
The amendment filed on 12/28/2020 has been entered and made of record. Claims 21-22, 24-25, 29-30, 32-33, 35-37, 39-40 and 42 are amended. Claims 23, 31 and 38 are cancelled. Claims 21-22, 24-30, 32-37 and 39-42 are pending.

Response to Arguments
Applicant’s arguments with respect to claims 21, 29 and 36 have been fully considered but they are moot because the arguments do not apply to the references being used in the current rejection.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 8/19/2020 is being considered by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 21-22, 24-25, 27-30, 32-33, 35-37, 39-40 and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Jensen et al. (US 7,657,891) in view of Joy et al. (US 2003/0191927) and Mutlu et al. (US 2009/0138670).
As to Claim 21, Jensen teaches an apparatus comprising: 
one or more processors including a graphics processor, the one or more processors including a plurality of processing units, a plurality of thread schedulers to schedule threads for the plurality of processing units, and a processing pipeline (Jensen discloses a multithreading processor includes an execution pipeline and a thread scheduler that dispatches instructions of the threads to ; and 
a plurality of shared function units associated with graphics processing in the processing pipeline, each shared function unit of the plurality of shared function units being shared by two or more of the plurality of processing units of the one or more processors (Jensen discloses “A multithreaded microprocessor typically allows the multiple threads to share the functional units of the microprocessor (e.g., instruction fetch and decode units, caches, branch prediction units, and load/store, integer, floating-point, SIMD, etc. execution units) in a concurrent fashion” in C2L15-19; see also shared functional units in C6L9-16; execution pipeline in Abstract. Joy further teaches a first/second graphic unit in [0161]. Here, Jensen’s processing pipeline can be modified by the teaching of Joy to implement a graphics processing pipeline);
wherein the one or more processors are to: detect and observe stall signals communicated from one or more shared function units of the plurality of shared function units to the processing units of the one or more processors (Jensen discloses “The execution pipeline detects a stalling event in response to an instruction issued to the execution pipeline. The instruction is included in one of the plurality of threads defined as the stalling thread” in C3L13-16; “The method includes detecting a stalling event, in response to an instruction dispatched to the execution 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Jensen with the teaching of Joy so as to share the same processor pipeline in multithreading for graphical pipeline application (Joy, Abstract).
Jensen and Joy do not directly use the claim language “estimate stalls”. The combination with Mutlu further teaches the following limitations:
estimate stalls that may occur further down the processing pipeline for one or more threads of a plurality of threads from operation of the one or more shared function units, wherein estimation of stalls that may occur is based on the detected stall signals from the one or more shared function units and on which shared function units of the plurality of function units are to receive the one or more threads in processing (Jensen discloses “The stall likelihood priority generator 1104 generates a stall likelihood priority 1102 for the instruction 1114 based on the register usage information and based on processor state information 1112 received from the microprocessor 100 pipeline” in C25L9-13; “The stall likelihood priority 1102 indicates the likelihood that the instruction will be executed without stalling based on its register usage” in C25L38-40; see also Fig 11. Mutlu further discloses a thread analysis component 412 associated with the scheduling component 410 to maintain two values for each thread: Tshared and Talone and to express the estimated memory-related stall-time experienced by the thread in the shared DRAM system (or an approximation thereof) when running alongside other threads… “Based on these two estimates, the can compute a memory-slowdown S for each thread, where S=Tshare/Talone. In one example, a thread has high memory-slowdown S if memory-related stall-time actually experienced by the thread is high and the stall time would have been low without interference caused by other threads” in [0042]; “to estimate slowdowns of respective threads… and/or prioritize commands based on the slowdowns of the threads they belong to” in [0067]; “determining the threads having the highest slowdown (Smax) and the lowest slowdown (Smin) from among all threads” in [0045]. Here, Mutlu explains how to estimate stalls in the processing pipeline for one or more threads), and 
schedule threads of the plurality of threads based at least in part on the estimated stalls that may occur for the one or more threads of the plurality of threads, wherein scheduling threads of the plurality of threads includes the one or more processors to assign priority levels to threads of the plurality of threads based at least in part on the estimated stalls that may occur for the one or more threads of the plurality of threads (Jensen further discloses “The execution pipeline detects a stalling event in response to an instruction issued to the execution pipeline… The processor also includes a thread scheduler, coupled to the execution pipeline, that issues to the execution pipeline instructions of the plurality of threads other than the stalling thread, in response to the execution pipeline indicating the stalling thread. An advantage of the present invention is that by detecting a stalling event in the execution pipeline and flushing the instruction from the execution pipeline to enable instructions of other threads to be dispatched to and executed in the execution pipeline.. the execution pipeline distinguishes between instructions of the stalling thread and the other threads stall likelihood priority generator 1104 generates a the pre-decoded register usage information 1106 is provided stall likelihood priority 1102 for the instruction 1114 based on the register usage information and based on processor state information 1112 received from the microprocessor 100 pipeline” in C25L9-13; “the stall likelihood priority 1102 comprises two bits, creating four priority levels, and is generated by the stall likelihood priority generator 1104 as follows. An instruction is assigned the highest stall likelihood priority 1102 if it is guaranteed not to stall… An instruction is assigned the lowest stall likelihood priority 1102 if it is guaranteed to stall (C25L40-53). 
Joy also discloses “A second thread-switching operation is "semi-oblivious" thread-switching for use with an existing "pipeline stall" signal (if any). The pipeline stall signal operates in two capacities, first as a notification of a pipeline stall, and second as a thread select signal between threads… A third thread-switching operation is an "intelligent global scheduler" thread-switching in which a thread switch decision is based on a plurality of signals including: (1)… cache miss stall signal…( 4) a thread priority signal” in [0021]; “A thread control logic may select a particular thread that is to execute with priority in comparison to other threads” in [0024, 0106]; “The thread switch logic 610 permits control of thread selection based on priority of a particular thread via signals to the thread priority terminal” in [0105]. Mutlu further explains how to estimate stalls for one or more threads by a thread analysis component 412 in [0042, 0045, 0067]; prioritize the threads based on the slowdowns of the threads in [0067, 0072, 0076].)
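For illustration only, the slowdown estimate quoted from Mutlu in [0042] and [0045] can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any cited reference: the class and function names are hypothetical, the sample stall times are invented for the example, and the "most slowed-down thread first" ordering is one assumed way of prioritizing "based on the slowdowns of the threads" ([0067]).

```python
from dataclasses import dataclass


@dataclass
class ThreadStallEstimate:
    """Per-thread stall-time estimates (all field names are hypothetical)."""
    name: str
    t_shared: float  # estimated stall time when running alongside other threads
    t_alone: float   # estimated stall time had the thread run alone

    @property
    def slowdown(self) -> float:
        # S = Tshared / Talone, following the formula quoted from Mutlu [0042].
        return self.t_shared / self.t_alone


def slowdown_extremes(threads):
    """Return (Smax, Smin) over all threads, as described in the quoted [0045]."""
    slowdowns = [t.slowdown for t in threads]
    return max(slowdowns), min(slowdowns)


def order_by_slowdown(threads):
    """Assumed policy: the thread with the highest slowdown is served first."""
    return sorted(threads, key=lambda t: t.slowdown, reverse=True)


# Invented sample values, used only to exercise the sketch.
threads = [
    ThreadStallEstimate("A", t_shared=300.0, t_alone=100.0),
    ThreadStallEstimate("B", t_shared=120.0, t_alone=100.0),
    ThreadStallEstimate("C", t_shared=400.0, t_alone=200.0),
]
s_max, s_min = slowdown_extremes(threads)
order = [t.name for t in order_by_slowdown(threads)]
```

With these sample values, thread A has the highest slowdown and thread B the lowest, so the assumed policy orders the threads A, C, B.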


As to Claim 22, Jensen in view of Joy and Mutlu teaches the apparatus of claim 21, wherein scheduling threads of the plurality of threads includes the one or more processors to select a first thread of the plurality of threads to be scheduled and a second thread of the plurality of threads to be ignored based at least in part on the priority levels assigned to the threads of the plurality of threads (Jensen discloses “the stall likelihood priority 1102 comprises two bits, creating four priority levels, and is generated by the stall likelihood priority generator 1104 as follows. An instruction is assigned the highest stall likelihood priority 1102 if it is guaranteed not to stall… An instruction is assigned the lowest stall likelihood priority 1102 if it is guaranteed to stall” (C25L40-53). Joy also discloses thread switching with a pipeline stall signal in [0021]. Mutlu, [0067, 0072, 0076]).
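For illustration only, the two-bit, four-level stall likelihood priority quoted from Jensen (C25L40-53), and the selection of one thread to schedule while others are ignored, can be sketched as follows. This is a sketch under stated assumptions rather than Jensen's implementation: only the highest and lowest levels are described in the quoted passage, so the two intermediate level names, the numeric encoding, the thread identifiers and the tie-breaking rule are all hypothetical.

```python
from enum import IntEnum


class StallLikelihoodPriority(IntEnum):
    """Two-bit priority with four levels (assumed encoding: higher = better)."""
    GUARANTEED_TO_STALL = 0      # lowest level, per the quoted passage
    LIKELY_TO_STALL = 1          # intermediate level: assumption
    UNLIKELY_TO_STALL = 2        # intermediate level: assumption
    GUARANTEED_NOT_TO_STALL = 3  # highest level, per the quoted passage


def select_thread(priorities):
    """Pick the highest-priority thread; all other threads are ignored.

    `priorities` maps a hypothetical thread id to its priority level.
    Ties break toward the lower thread id (an arbitrary choice).
    """
    thread_ids = sorted(priorities)
    scheduled = max(thread_ids, key=lambda tid: priorities[tid])
    ignored = [tid for tid in thread_ids if tid != scheduled]
    return scheduled, ignored


# Invented sample assignment, used only to exercise the sketch.
priorities = {
    0: StallLikelihoodPriority.GUARANTEED_TO_STALL,
    1: StallLikelihoodPriority.GUARANTEED_NOT_TO_STALL,
    2: StallLikelihoodPriority.UNLIKELY_TO_STALL,
}
scheduled, ignored = select_thread(priorities)
```

Here thread 1 is selected for scheduling and threads 0 and 2 are ignored for this selection.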

As to Claim 24, Jensen in view of Joy and Mutlu teaches the apparatus of claim 21, wherein the assignment of priority levels includes determining whether a thread is deserving of a high priority assignment or a low priority assignment based at least in part on the estimated stalls that may occur for the one or more threads of the plurality of threads (Jensen, col 25. Joy, [0021, 0024, 0105-0106]. Mutlu, [0067, 0072, 0076]).

As to Claim 25, Jensen in view of Joy and Mutlu teaches the apparatus of claim 24, wherein the high priority and low priority assignments are further determined based on one or more predetermined thresholds indicating a level of stalls for threads that is acceptable or unacceptable for scheduling of the threads (Jensen discloses a stall likelihood priority with four priority levels in col 25. Joy discloses a thread selection based on thread priority in [0105-0106]. Mutlu, [0067, 0072, 0076]).

As to Claim 27, Jensen in view of Joy and Mutlu teaches the apparatus of claim 21, wherein the plurality of shared function units include one or more of a sampler, a data port, a shared local memory, and a pixel/color pipe (Joy, Fig 3 & 9. Mutlu, Fig 8.)

As to Claim 28, Jensen in view of Joy and Mutlu teaches the apparatus of claim 21, wherein the graphics processor is co-located with an application processor on a common semiconductor package (Joy discloses on-chip multiprocessors in Abstract, [0016, 0040]. Mutlu discloses Chip Multiprocessor (CMP) in [0002].)

Claim 29 recites similar limitations as claim 21 but in a method form. Therefore, the same rationale used for claim 21 is applied.
Claim 30 is rejected based upon similar rationale as Claim 22.
Claim 32 is rejected based upon similar rationale as Claim 24.
Claim 33 is rejected based upon similar rationale as Claim 25.

Claim 35 is rejected based upon similar rationale as Claim 27.
Claim 36 recites similar limitations as claim 21 but in a machine-readable medium form. Therefore, the same rationale used for claim 21 is applied.
Claim 37 is rejected based upon similar rationale as Claim 22.

Claim 39 is rejected based upon similar rationale as Claim 24.
Claim 40 is rejected based upon similar rationale as Claim 25.

Claim 42 is rejected based upon similar rationale as Claim 27.

Claims 26, 34 and 41 are rejected under 35 U.S.C. 103 as being unpatentable over Jensen in view of Joy, Mutlu and Savransky et al. (US 2014/0192066).
As to Claim 26, Jensen in view of Joy and Mutlu teaches the apparatus of claim 21, wherein the plurality of processing units includes a plurality of streaming multiprocessors (SMs) (Savransky teaches a streaming multiprocessor (SM) and a scheduler in Fig 2-4, see also [0048].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Jensen, Joy and Mutlu 

Claim 34 is rejected based upon similar rationale as Claim 26.
Claim 41 is rejected based upon similar rationale as Claim 26.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEIMING HE whose telephone number is (571)270-1221.  The examiner can normally be reached on Monday-Friday, 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-




/WEIMING HE/
Primary Examiner, Art Unit 2612