Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08 May 2021 has been entered.

Response to Arguments
Applicant's arguments filed 08 May 2021 have been fully considered but they are not persuasive.

In response to applicant’s argument on numbered Pages 10-11, “Applicant has reviewed the portions of Cui cited by the Examiner in the Advisory Action and asserts that the Examiner is disregarding Applicant's claim language. As indicated in Applicant's independent claims, the claimed invention considers the locality of processor cores to a "data storage resource" when determining whether to redispatch a task from one processor core to another. Cui, by contrast, simply considers the latency when migrating a task from one processor core to another and not the locality between a processor core and a "data storage resource." Thus, Applicant's claimed invention and Cui determine the locality of two entirely different things”, examiner respectfully disagrees and notes the following:
The locality of the processor is one of the factors considered by Cui. As noted on Pg. 11 in Section 4.1. Multicore Platform and Operating System, "The effectiveness of our lock-contention-aware scheduler is verified on Linux 2.6.29.4-based AMD 32-core and Intel 32-core NUMA systems. Each will be introduced in turn. For the AMD 32-core system, there are eight Opteron 8347HE chips and each chip has four cores. Each core owns a private L1 data cache, L1 instruction cache, and L2 cache. The size of each L1 cache is 64K bytes, while the size of each L2 cache is 512K bytes. Four cores on the same chip share the same L3 cache. The size of the L3 cache is 2M bytes. Intra-chip cores and separate chips are connected by the internal crossbar switch and HyperTransport, respectively. The 32G memory is partitioned into 8 banks, where each bank connects to one of the 8 chips. The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away". Cui operates a NUMA-type system in which the shared 32G memory is partitioned into 8 banks, with each bank associated with a particular processor core. The overhead of passing a task from one core to another has been measured by Cui and is incorporated into the algorithm (9.7 microseconds to get from core A to core B, 10.2 microseconds to get from core A to core C, or 11.6 microseconds to get
	
	As the arguments for all other claims are substantially similar to the argument for claim 1 above, Examiner respectfully disagrees with them for at least the same reasons as above.
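
For illustration only, the task migration latencies quoted from Cui Section 4.1 above can be arranged as a lookup keyed by the number of hops between chips, which shows how the measured overhead grows with distance. This is a minimal sketch: the names MIGRATION_LATENCY_US, migration_latency_us, and extra_cost_over_same_chip_us are hypothetical and do not appear in Cui; only the four latency figures come from the quoted passage.

```python
# Task migration latencies in microseconds, as quoted from Cui Section 4.1
# for the AMD 32-core system. Key = number of hops between chips;
# 0 hops means both cores sit on the same chip.
MIGRATION_LATENCY_US = {0: 9.3, 1: 9.7, 2: 10.2, 3: 11.6}


def migration_latency_us(hops: int) -> float:
    """Return the quoted migration latency for the given hop count.

    Hypothetical helper; raises ValueError when the quoted passage
    gives no measurement for that distance.
    """
    if hops not in MIGRATION_LATENCY_US:
        raise ValueError(f"no measurement quoted for {hops} hops")
    return MIGRATION_LATENCY_US[hops]


def extra_cost_over_same_chip_us(hops: int) -> float:
    """Additional latency relative to a same-chip migration."""
    return migration_latency_us(hops) - migration_latency_us(0)
```

Under this sketch, a migration three hops away costs roughly 2.3 microseconds more than a migration between two cores on the same chip.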


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.




Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cui et al. in the work entitled "Lock-contention-aware scheduler: A scalable and energy-efficient method for addressing scalability collapse on multicore systems", hereinafter referred to as Cui.

	Regarding claim 15, Cui teaches A system for reducing lock contention in a data storage system, the system comprising:
a data storage system comprising a first processor core, a second processor core, and a data storage resource (Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The effectiveness of our lock-contention-aware scheduler is verified on Linux 2.6.29.4- based AMD 32-core and Intel 32-core NUMA systems. Each will be introduced in turn. For the AMD 32-core system, there are eight Opteron 8347HE chips and each chip has four cores. Each core owns a private L1 data cache, L1 instruction cache, and L2 cache. The size of each L1 cache is 64K bytes, while the size of each L2 cache is 512K bytes. Four cores on the same chip share the same L3 cache. The size of the L3 cache is 2M bytes. Intra-chip cores and separate chips are connected by the internal crossbar switch and HyperTransport, respectively. The 32G memory is partitioned into 8 banks, where each bank connects to one of the 8 chips. The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away"), the data storage system configured to:
dispatch, on the first processor core, a task configured to acquire a lock on the data storage resource (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks"; Cui is teaching that a task is scheduled to run on a core, and that the task requires a lock in protected storage); 
determine whether the first processor core is local with respect to the data storage resource, wherein locality of a processor core to the data storage resource is based on at least one of a distance between a processor core and the data storage resource, and an efficiency with which a processor core can access the data storage resource (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks"; Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away"; Cui teaches a scheduler that determines how much time would be spent waiting to acquire a lock on memory in order to perform a task. A key component of that wait time is determining whether it would be faster to transfer the thread to a different core. As each core has its own local memory, factoring in the time it takes to transfer the thread to a different core accounts for the distance between any processor core and the target memory resource. In addition, determining which option allows for faster processing of the thread is a measure of efficiency, so both of the claimed alternatives for determining locality are taught); 
if the first processor core is not local with respect to the data storage resource, re-dispatch the task on the second processor core that is local with respect to the data storage resource (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks. To qualify whether a task is lock intensive, we calculated its percentage of lock-waiting time during each time slice"; Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away"; Cui is teaching that a task is scheduled to run on a core, and that the task requires a lock in protected storage. If the first core is not able to obtain the lock on the protected storage within the time allotted, then the task is sent to a core dedicated to lock contention); and
if the first processor core is local with respect to the data storage resource, execute the task on the first processor core (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks"; Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away"; Cui is teaching that a task is scheduled to run on a core, and that the task requires a lock in protected storage. If the time spent acquiring the lock does not breach a threshold, then the task is performed on the original core).
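
For illustration only, the decision flow recited in the limitations of claim 15 above can be sketched as follows. This is a simplified model under a stated assumption: a core is treated as "local" when it sits on the chip to which the memory bank is attached, which is only one possible reading of the distance-based locality recited in the claim. The names Core, MemoryBank, is_local, and select_core are hypothetical and are taken from neither the claims nor Cui.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    chip_id: int  # chip on which this core resides


@dataclass(frozen=True)
class MemoryBank:
    bank_id: int
    chip_id: int  # chip to which this bank is attached (NUMA node)


def is_local(core: Core, bank: MemoryBank) -> bool:
    """Assumed locality test: the core shares a chip with the bank."""
    return core.chip_id == bank.chip_id


def select_core(first: Core, second: Core, bank: MemoryBank) -> Core:
    """Pick the core on which the lock-acquiring task should run.

    Mirrors the claimed flow: keep the task on the first core when it is
    local to the resource; otherwise re-dispatch to the second core when
    that core is local. Staying on the first core when neither is local
    is an assumption, since the claim does not address that case.
    """
    if is_local(first, bank):
        return first
    if is_local(second, bank):
        return second
    return first
```

For example, with a bank attached to chip 1, a task first dispatched on a core of chip 0 would be re-dispatched to a core of chip 1 under this sketch.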

Independent claims 1 and 8 have substantially the same scope and limitations as claim 15 as they are respectively the corresponding Method and Computer Program Product claims. Therefore, claims 1 and 8 are rejected under 35 U.S.C. 102(a)(1) for at least the same reasons as above.

Regarding claim 16, Cui teaches The system of claim 15, wherein the first processor core is located on a first processor chip of the data storage system and the second processor core is located on a second processor chip of the data storage system (Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The effectiveness of our lock-contention-aware scheduler is verified on Linux 2.6.29.4- based AMD 32-core and Intel 32-core NUMA systems. Each will be introduced in turn. For the AMD 32-core system, there are eight Opteron 8347HE chips and each chip has four cores"; Pg. 11, Section 4.1. Multicore Platform and Operating System, "The task migration overhead on this system is also measured. The latency is 9.3 microseconds between two cores on the same chip, 9.7 microseconds between two cores one hop away, 10.2 microseconds two hops away, and 11.6 microseconds three hops away"; Cui notes that tasks may be migrated as many as three physical chips away from the originating chip in this evaluation).

Dependent claims 2, 3, 9 and 10 have substantially the same scope and limitations as claim 16 as they are respectively the corresponding Method and Computer Program Product claims. Therefore, claims 2, 3, 9 and 10 are rejected under 35 U.S.C. 102(a)(1) for at least the same reasons as above.

Regarding claim 17, Cui teaches The system of claim 15, wherein the storage resource is a memory (Cui Pg. 11, Section 4.1. Multicore Platform and Operating System, "The effectiveness of our lock-contention-aware scheduler is verified on Linux 2.6.29.4- based AMD 32-core and Intel 32-core NUMA systems. Each will be introduced in turn. For the AMD 32-core system, there are eight Opteron 8347HE chips and each chip has four cores. Each core owns a private L1 data cache, L1 instruction cache, and L2 cache. The size of each L1 cache is 64K bytes, while the size of each L2 cache is 512K bytes. Four cores on the same chip share the same L3 cache. The size of the L3 cache is 2M bytes. Intra-chip cores and separate chips are connected by the internal crossbar switch and HyperTransport, respectively. The 32G memory is partitioned into 8 banks, where each bank connects to one of the 8 chips. The task migration overhead on this system is also measured").

Dependent claims 4 and 11 have substantially the same scope and limitations as claim 17 as they are respectively the corresponding Method and Computer Program Product claims. Therefore, claims 4 and 11 are rejected under 35 U.S.C. 102(a)(1) for at least the same reasons as above.

Regarding claim 18, Cui teaches The system of claim 15, wherein re-dispatching the task on the second processor core comprises re-dispatching the task on the second processor core only if effort required to acquire the lock is above a selected threshold (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks. To qualify whether a task is lock intensive, we calculated its percentage of lock-waiting time during each time slice"; Cui is teaching that a task is running on a core, and that the task requires a lock in protected storage. If the first core is not able to obtain the lock on the protected storage, then the task is sent to a different core that owns the lock).

Dependent claims 5 and 12 have substantially the same scope and limitations as claim 18 as they are respectively the corresponding Method and Computer Program Product claims. Therefore, claims 5 and 12 are rejected under 35 U.S.C. 102(a)(1) for at least the same reasons as above.

Regarding claim 19, Cui teaches The system of claim 18, wherein the effort required is measured by at least one of a number of clock cycles needed to acquire the lock, a number of acquisition attempts needed to acquire the lock, and an amount of time needed to acquire the lock (Cui Pg. 5, Section 3.2. Scalability Collapse Detection, "The lock-contention-aware scheduler identifies a task as lock intensive and migrates it to an SSC when the task spends a considerable amount of time on waiting for locks. To qualify whether a task is lock intensive, we calculated its percentage of lock-waiting time during each time slice"; Cui is teaching that a task is running on a core, and that the task requires a lock in protected storage. If the time spent acquiring the lock does not breach a threshold, then the task is performed on the original core; if the time threshold is breached, the task is sent to a new core).
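
For illustration only, the lock-intensity test described in the quoted passage of Cui Section 3.2 (the percentage of lock-waiting time during each time slice, compared against a threshold) can be sketched as below. The function names and the 50 percent default threshold are assumptions made for this sketch; the quoted passage does not state a specific threshold value.

```python
def lock_wait_percentage(wait_us: float, slice_us: float) -> float:
    """Percentage of a time slice that a task spent waiting for locks."""
    if slice_us <= 0:
        raise ValueError("time slice must be positive")
    if wait_us < 0 or wait_us > slice_us:
        raise ValueError("wait time must lie within the time slice")
    return 100.0 * wait_us / slice_us


def is_lock_intensive(wait_us: float, slice_us: float,
                      threshold_pct: float = 50.0) -> bool:
    """Hypothetical test: a task is treated as lock intensive when its
    lock-waiting percentage for the slice exceeds threshold_pct.

    The 50.0 default is an illustrative assumption, not a value from Cui.
    """
    return lock_wait_percentage(wait_us, slice_us) > threshold_pct
```

Under this sketch, a task that waits 600 microseconds of a 1000 microsecond slice would be treated as lock intensive and become a candidate for migration, while a task that waits 100 microseconds would remain on its original core.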



Regarding claim 20, Cui teaches The system of claim 15, wherein acquiring the lock on the data storage resource comprises acquiring the lock to access data on the data storage resource (Cui Pg. 12, Section , "All microbenchmarks are implemented as multiprocess programs and synchronized using the same framework. The single counter benchmark has each process increase the same counter protected by a spin lock in the kernel space; mmapbench has each process map the same continuous 500MBytes with the MAP_SHARED flag from a file, touch each page by reading the first byte, and destroy the mapping; sockbench has each process create a socket and then close the result. Two parameters can be tuned in all these microbenchmarks. One is the number of processes that is currently running and the other is the number of operations each process performs in a test. Parallel postmark is a multithreaded benchmark, which has the capability of simulating file servers providing email and netnews services. Each thread in parallel postmark executes transactions repeatedly on an independent set of files (between 0.5 and 10K bytes in size) and each transaction is made up of two steps: (1) creating or deleting a file (2) reading or appending a file. Files, file I/O operations (e.g., create and read), and file sizes are chosen from a uniform distribution"; Cui is benchmarking the time required ).

Dependent claims 7 and 14 have substantially the same scope and limitations as claim 20 as they are respectively the corresponding Method and Computer Program Product claims. Therefore, claims 7 and 14 are rejected under 35 U.S.C. 102(a)(1) for at least the same reasons as above.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUSTIN B FULFORD whose telephone number is (571)272-7229.  The examiner can normally be reached on M-Th 9am-3pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached on (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 






/D.B.F./Examiner, Art Unit 2132
/MASUD K KHAN/Primary Examiner, Art Unit 2132