DETAILED ACTION
This action is responsive to the Application filed 03/03/2022.
According to a preliminary amendment, claims 1-20 have been cancelled, and newly added claims 21-40 are pending herein for prosecution on the merits.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees.  See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application.  See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer.  A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).

Claims 21, 29, 37 are provisionally rejected under the judicially created doctrine of obviousness-type double patenting (ODP) as being (respectively) unpatentable over claims 1, 9, 17 of U.S. Patent No. 11,281,496 (hereinafter ‘496) in view of Li et al., US Pub. No. 2019/0087985 (hereinafter Li).
Although the conflicting claims are not identical, they are not patentably distinct from each other, based on the following observations. The following are examples of how certain claims of the instant application and of ‘496 conflict with each other.
Instant claim 37	‘496 claim 17
A method comprising:	A method comprising:
receiving threads for a plurality of tensor operations (see Obviousness Analysis A below) for scheduling;	receiving a plurality of thread groups for scheduling, each thread group including multiple threads;
scheduling the threads for the tensor operations (see Obviousness Analysis A), including assigning the threads among a set of tensor cores (see Obviousness Analysis A) of a plurality of graphics processors;	scheduling the plurality of thread groups including assigning the thread groups among a plurality of graphics processors;
storing data in one or more caches for the tensor operations (see Analysis A) that are scheduled for processing;	storing, in one or more caches, data for the thread groups that are scheduled for processing;
wherein the scheduling of the threads for the plurality of tensor operations (see Analysis A) includes applying a bias for assigning threads to tensor cores of the plurality of tensor cores (see Analysis A) according to a cache locality that is utilized in the one or more caches.	wherein the scheduling of the plurality of thread groups includes applying a bias for assigning thread groups of the plurality of thread groups to processors of the plurality of graphics processors according to a cache locality that is utilized in the one or more caches.


	Obviousness Analysis A:
	‘496 does not recite receiving threads for a plurality of tensor operations and scheduling the threads for the tensor operations; nor does ‘496 recite scheduling the threads for the tensor operations by assigning the threads among a set of tensor cores (of a plurality of graphics processors); nor does ‘496 recite scheduling in terms of applying a bias for assigning threads to tensor cores of the plurality thereof.
	Li shows graphics applications that use multi-threaded execution of tasks scheduled on graphics processors, including cores particularly dedicated to performing tensor operations. That is, Li's pipeline for (inverse) graphics rendering, which iteratively refines scenes per a TensorFlow optimization framework (para 0037), includes floating-point capable cores and tensor cores; the tensor cores (para 0106) operate with thread warps (para 0101) to efficiently perform convolution operations, deep inferencing (para 0107) and/or 32-bit precision (arithmetic) operations or a dimensional matrix load spanning a multi-threaded warp (para 0108). Hence, Li recognizes tensor operations scheduled via a TensorFlow framework using thread groups executing on one or more of a plurality of tensor cores.
	Thus, as graphics rendering entails precision operations, convolution inferencing, and complex matrix-type loads that can be more efficiently handled by appropriate scheduling of thread groups executing on special tensor cores as set forth above, it would have been obvious at the time the invention was made for one of ordinary skill in the art to implement the thread scheduling in ‘496 so that, for pipelining of threads to carry out complex graphics task loads, deep inferencing and floating-point arithmetic operations on host graphics cores, threads destined therefor would be selected at configuration time (e.g. via a TensorFlow framework) for a plurality of tensor operations, as in Li's framework. The scheduling of the threads, as in ‘496, would be geared for tensor operations in terms of assigning the threads among one or more cores of a plurality of tensor cores, as in Li, according to an arrangement, as per ‘496, that applies a bias associated with assigning threads to the one or more tensor cores, as shown in Li, in view of the load complexity requirements and cooperative context of selected thread groups, as in Li's use of warps.
	The reason is as follows. Organizing graphics tasks in line with application contexts burdened with floating-point complexity, concurrent load distribution, and iterative or computation-intensive convolution associated with (graphics) rendering tasks, by using a selective multi-thread arrangement in the form of cooperative groups that carry out a particular type of graphics operation of a given complexity, with pipeline scheduling of thread groups as set forth above for tensor operations to be realized on tensor cores particularly selected for hosting respective thread group or warp execution (as per Li), would not only make proficient use of the processing capability of the tensor cores to handle iterations of large-width arithmetic instruction tasks (e.g. iterated convolution of matrix-type computations), but would also permit intelligent allocation of threads according to properly sized cooperative groups (in line with the requirement of the graphics rendering type). The effect of ordering thread group execution would substantially achieve time-efficient completion of a particular aspect (e.g. a scenic rendering) of the graphics operation whose requirement type defines a particular level of parallel processing and computation-intensive load envisioned for assignment to the designated thread group.
	Therefore, instant (method) claim 37 would be deemed obvious over the subject matter recited per ‘496 (method) claim 17 for the reasons set forth above.
Instant claim 21	‘496 claim 1
An apparatus comprising:	An apparatus comprising:
a plurality of processors including a plurality of graphics processors to process data, the graphics processors including a set of tensor cores (see Obviousness Analysis A), wherein the plurality of processors are to schedule threads for a plurality of tensor operations (see Analysis A) for processing by the plurality of tensor cores (refer to Analysis A from above);	a plurality of processors including a plurality of graphics processors to process data, wherein the plurality of processors are to schedule a plurality of thread groups for processing by the plurality of graphics processors, each thread group including multiple threads;
a memory; and one or more caches for storage of data for the plurality of graphics processors, the one or more caches including storage of data for the tensor operations (see Analysis A from above) scheduled for processing by the plurality of processors;	a memory; and one or more caches for storage of data for the plurality of graphics processors, the one or more caches including storage of data for the thread groups scheduled for processing by the plurality of processors;
wherein the one or more caches utilize a cache locality in which spatial locality of caching of data within the one or more caches is based at least in part on relationships in thread assignment;	wherein the one or more caches utilize a cache locality in which spatial locality of caching of data for thread groups within the one or more caches is based at least in part on relationships in thread group assignment,
wherein the scheduling of the threads by the plurality of processors includes applying a bias for assigning threads to tensor cores (refer to Analysis A) according to the cache locality utilized for the one or more caches.	wherein the scheduling of the plurality of thread groups includes the plurality of processors to apply a bias for assigning thread groups to processors of the plurality of graphics processors according to the cache locality utilized for the one or more caches.


	Hence, on the basis of Obviousness Analysis A above, instant claim 21 is deemed an obvious variant of the subject matter of ‘496 claim 1.
	Instant (medium) claim 29 recites the same step actions as instant claim 21, whereas ‘496 (medium) claim 9 recites the same step actions as ‘496 claim 1; therefore, instant claim 29 is deemed obvious over the subject matter of ‘496 claim 9, for the same reasons set forth above.
	Dependent (instant) claims 22-28, 30-36, 38-40 are therefore unpatentable for being dependent upon a rejected base claim, as set forth in the ODP rejection above.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tuan A Vu whose telephone number is (571) 272-3735.  The examiner can normally be reached on 8AM-4:30PM/Mon-Fri.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chat Do, can be reached on (571) 272-3721.
The fax phone number for the organization where this application or proceeding is assigned is (571) 273-3735 (for non-official correspondence; please consult the Examiner before using) or 571-273-8300 (for official correspondence); inquiries may also be redirected to customer service at 571-272-3609.
Any inquiry of a general nature or relating to the status of this application should be directed to the TC 2100 Group receptionist: 571-272-2100.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Tuan A Vu/
Primary Examiner, Art Unit 2193
December 17, 2022