DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending in this application.

Information Disclosure Statement
The IDS filed on 10/28/2020 has been considered. 

Specification
The disclosure is objected to because of the following informalities:
in paragraph [0038] line 13 “Job 5” should be “Job 6”
the description of Fig. 5 in paragraph [0057] matches neither Fig. 5 itself nor the description of Fig. 5 in paragraphs [0051-0052]
paragraphs [0051-0052] describe each compute node 110 as having 32 processing cores, with Job A having 128 ranks, Job B having 32 ranks, Job C having 20 ranks, Job D having 128 ranks, Job E having 32 ranks, and Job F having 24 ranks
paragraph [0057], however, describes the same Jobs A, B, C, D, E, and F but states that Job A has only 32 ranks, Job B has 8 ranks, Job C has 3 ranks, and Job D has 32 ranks
paragraph [0057] discusses node packing-based scheduling, yet the left-hand side of Fig. 5 does not match the description in paragraph [0057]
for example, if Job A has 32 ranks and each compute node has 32 processing cores, Job A would fit on a single compute node, but Fig. 5 shows Job A occupying 4 compute nodes under node packing-based scheduling
Therefore, lines 11-18 of paragraph [0057] should be replaced with “In other words, for this example, Job A has 128 ranks, and the 128 ranks are scheduled to be processed by the minimal set of nodes, i.e., four nodes at thirty-two ranks per node. As also illustrated by the schedule 400, the thirty-two ranks for Job B are scheduled for processing by a single compute node N05 (i.e., the minimal set of nodes to process Job B); the twenty ranks for Job C are scheduled for processing by a single compute node N06 (i.e., the minimal set of nodes to process Job C); the 128 ranks for Job D are scheduled for processing by compute nodes N07, N08, N09, and N10”. 
Appropriate correction is required. However, applicant is reminded not to introduce new matter into the specification.
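The arithmetic behind the objection above can be checked with a short sketch. It assumes one rank per processing core and uses the rank counts stated in paragraphs [0051-0052]; the function name `minimal_nodes` is hypothetical and is not taken from the application.

```python
import math

CORES_PER_NODE = 32  # per paragraphs [0051-0052] as summarized above

# Rank counts as summarized from paragraphs [0051-0052]
ranks = {"A": 128, "B": 32, "C": 20, "D": 128, "E": 32, "F": 24}


def minimal_nodes(rank_count, cores_per_node=CORES_PER_NODE):
    """Fewest nodes that hold rank_count ranks, assuming one rank per core."""
    return math.ceil(rank_count / cores_per_node)


nodes_needed = {job: minimal_nodes(count) for job, count in ranks.items()}
print(nodes_needed)

# With the 32 ranks stated for Job A in paragraph [0057], a single node
# would suffice, which is inconsistent with the 4 nodes shown in Fig. 5.
print(minimal_nodes(32))
```

Under these assumptions, Jobs A and D each need four nodes and the remaining jobs need one node each, which agrees with the replacement text proposed for paragraph [0057].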

Claim Objections
Claim 3 is objected to because of the following informalities: “ranks of the first job relatively to the plurality of ranks of the second job” should be “ranks of the first job relative to the plurality of ranks of the second job”.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
As per claim 1:
	Line 2 recites “receiving a request to process a first job on a cluster” and line 7 recites “scheduling processing of the first job” but it is unclear what entity performs these steps. 

As per claims 1 and 17 (line numbers refer to claim 1):
	Line 4 recites “plurality of ranks can be equally divided among a minimal subset of nodes” but it is unclear what is meant by “equally divided” (i.e., if there are 9 ranks and two nodes, each with a capacity of 8 ranks, what would it mean to equally divide the 9 ranks?). 
	Lines 4 and 10 recite “minimal subset of nodes”, but it is unclear what is meant by a minimal subset of nodes (i.e., does each node have a certain capacity for ranks, and do the plurality of ranks occupy a minimum number of nodes such that the capacity of each node of the minimum number of nodes is filled?). 
	Lines 5-6 recite “all processing cores of the minimal set of nodes correspond to the plurality of ranks” but it is unclear what this means (i.e., is there a one-to-one mapping between all processing cores of the minimal set of nodes and the plurality of ranks? Does each node of the minimal set of nodes have processing cores?).
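The ambiguity noted for “equally divided” can be illustrated with a small sketch. It shows one possible reading only, namely that every node receives the same number of ranks and every core receives exactly one rank; the function names are hypothetical and the reading itself is an assumption, not an interpretation adopted by the examiner.

```python
def equal_split(rank_count, node_count):
    """Ranks per node if rank_count divides evenly across node_count, else None."""
    if rank_count % node_count != 0:
        return None
    return rank_count // node_count


def fills_all_cores(rank_count, node_count, cores_per_node):
    """True if the ranks map one-to-one onto every core of the nodes."""
    return rank_count == node_count * cores_per_node


# The examiner's example: 9 ranks, two nodes with a capacity of 8 ranks each.
print(equal_split(9, 2))          # no equal division exists
print(fills_all_cores(9, 2, 8))   # the cores are not exactly filled

# An example from the specification: 128 ranks, four nodes of 32 cores.
print(equal_split(128, 4))
print(fills_all_cores(128, 4, 32))
```

Under this reading the 9-rank example has no equal division at all, which is why the claim language leaves open what happens when the rank count is not a multiple of the node capacity.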

As per claim 2:
	Line 4 recites “on the same nodes” which lacks antecedent basis. 
	Lines 4-5 recite “the processing of the plurality of ranks of the first job overlaps, on the same nodes, the processing of the plurality of ranks of the second job in time” but it is unclear what this means (i.e., does the set of nodes that processes the plurality of ranks of the first job also process the plurality of ranks of the second job, and do the time periods in which all of the ranks of the first job and all of the ranks of the second job are being processed overlap?).

As per claim 4:
	Line 5 recites “the multiple ranks” which lacks antecedent basis.

As per claim 9:
	Lines 1-2 recite “the number of the set of nodes” which lacks antecedent basis.

As per claim 12:
	Line 6 recites “the plurality of ranks is divisible into equal segments” but it is unclear what this means (i.e., are the plurality of ranks divided into groups with each group having an equal number of ranks?).
	Line 8 recites “the number” which lacks antecedent basis.
	Line 9 recites “the segment” and it is unclear which segment this refers to because there are multiple segments. 
	Lines 12-14 recite “assigning a number of ranks of the plurality of ranks to the given node less than the total number of processing cores of the given node” but it is unclear what is less than the total number of processing cores of the given node (i.e., does this mean that each of a number of ranks is assigned to a processing core of the given node and not all of the processing cores of the given node are used?).
	

As per claims 18-20 (line numbers refer to claim 17):
	Line 1 recites “the storage medium of claim 17” and this lacks antecedent basis since claim 17 recites “a non-transitory storage medium” which is different.

Dependent claims 3, 5-8, 10, 11, and 13-16 fail to resolve the deficiencies of claims 1 and 12, so they are rejected for the same reasons as claims 1 and 12 above. 

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (abstract idea) without significantly more. 

As per claim 1, in step 1 of the 101 analysis, the examiner has determined that the claim is directed to a method. Therefore, the claim is directed to one of the four statutory categories of invention.
In step 2A prong 1 of the 101 analysis, the examiner has determined that the claim recites a judicial exception. Specifically, the limitation “in response to the request, scheduling processing of the first job, wherein the scheduling of processing of the first job comprises distributing processing of the plurality of ranks across a set of nodes of the plurality of nodes greater in number than the minimal subset of nodes” recites a mental process. Humans use their minds to think of how jobs should be scheduled by merely associating jobs with nodes. The limitation “the plurality of ranks can be equally divided among a minimal subset of nodes of the plurality of nodes such that all processing cores of the minimal set of nodes correspond to the plurality of ranks” recites an intended use and does not have patentable weight. Even if it were positively recited, it could be considered a mental process because one can mentally evaluate how ranks can be divided among a minimal subset of nodes.

In step 2A prong 2 of the 101 analysis, the examiner has determined that the additional elements, alone or in combination, do not integrate the judicial exception into a practical application for the following rationale:
The limitation "receiving a request to process a first job on a cluster" represents an insignificant, extra-solution activity. The term "extra-solution activity" can be understood as "activities incidental to the primary process or product that are merely a nominal or tangential addition to the claim" (MPEP 2106.05(g)). The examiner has determined that the limitation "receiving a request to process a first job on a cluster" is directed to mere data gathering, which is a category of insignificant extra-solution activity (MPEP 2106.05(g)). 
The limitations “wherein the first job comprises a plurality of ranks” and “the cluster comprises a plurality of nodes” merely describe attributes of the technological environment in which the abstract idea is operating. The courts have identified that generally linking the use of a judicial exception to a technological environment does not integrate a judicial exception into a practical application (MPEP 2106.04(d)(I)).
In step 2B of the 101 analysis, the examiner has determined that the additional elements, alone or in combination, do not recite significantly more than the abstract idea identified above for the following rationale:
The limitation "receiving a request to process a first job on a cluster" represents an insignificant, extra-solution activity. The limitation "receiving a request to process a first job on a cluster" is well-understood, routine, or conventional because it is directed to "receiving or transmitting data" (MPEP 2106.05(d)). This is an additional element that the courts have recognized as well-understood, routine, or conventional (MPEP 2106.05(d)). The citation of court cases in the MPEP meets the Berkheimer evidentiary burden since citation of a court case in the MPEP is one of the four types of evidentiary support that can be used to prove that the additional elements are well-understood, routine, or conventional (see Berkheimer v. HP, Inc., 125 USPQ2d 1649). Thus, the limitation does not amount to significantly more than the abstract idea. 
The limitations “wherein the first job comprises a plurality of ranks” and “the cluster comprises a plurality of nodes” merely describe attributes of the technological environment and therefore do not amount to significantly more than the exception itself (MPEP 2106.05(h)).

As per claims 2 and 3, they further describe the abstract idea of scheduling. 
As per claim 4, it further describes the abstract idea of scheduling and further describes attributes of the technological environment.
As per claim 5, it further describes attributes of the technological environment.
As per claims 6-9, they further describe the abstract idea of scheduling. 
As per claim 10, it further describes attributes of the technological environment.
As per claim 11, it further describes the abstract idea of scheduling and further describes attributes of the technological environment.

As per claim 12, in step 1 of the 101 analysis, the examiner has determined that the claim is directed to a system. Therefore, the claim is directed to one of the four statutory categories of invention.
In step 2A prong 1 of the 101 analysis, the examiner has determined that the claim recites a judicial exception. 
Specifically, the limitation “in response to the request, scheduling processing of the first job, wherein the scheduling comprises distributing processing of the plurality of ranks across the plurality of nodes including assigning a number of ranks of the plurality of ranks to the given node less than the total number of processing cores of the given node” is a mental process. Humans use their minds to think of how jobs should be scheduled by merely associating jobs with nodes.
In step 2A prong 2 of the 101 analysis, the examiner has determined that the additional elements, alone or in combination, do not integrate the judicial exception into a practical application for the following rationale:
The limitations “a processor; and a memory to store instructions that, when executed by the processor, cause the processor to” apply the judicial exception on generic computing components. "Alappat's rationale that an otherwise ineligible algorithm or software could be made patent-eligible by merely adding a generic computer to the claim was superseded by the Supreme Court's Bilski and Alice Corp. decisions"; therefore, applying the judicial exception on a processor and memory, which are generic computing components, does not integrate the judicial exception into a practical application (MPEP 2106.05(b)).
The limitation "receiving a request to process a first job on a cluster" represents an insignificant, extra-solution activity. The term "extra-solution activity" can be understood as "activities incidental to the primary process or product that are merely a nominal or tangential addition to the claim" (MPEP 2106.05(g)). The examiner has determined that the limitation "receiving a request to process a first job on a cluster" is directed to mere data gathering, which is a category of insignificant extra-solution activity (MPEP 2106.05(g)). 
The limitations “wherein the first job comprises a plurality of ranks, the plurality of ranks is divisible into equal segments, the cluster comprises a plurality of nodes, and a given node of the plurality of nodes has a total number of processing cores that corresponds with the number of ranks of the segment” merely describe attributes of the technological environment in which the abstract idea is operating. The courts have identified that generally linking the use of a judicial exception to a technological environment does not integrate a judicial exception into a practical application (MPEP 2106.04(d)(I)).
In step 2B of the 101 analysis, the examiner has determined that the additional elements, alone or in combination, do not recite significantly more than the abstract idea identified above for the following rationale:
The limitations “a processor; and a memory to store instructions that, when executed by the processor, cause the processor to” apply the judicial exception on generic computing components and therefore do not provide significantly more.
The limitation "receiving a request to process a first job on a cluster" represents an insignificant, extra-solution activity. The limitation "receiving a request to process a first job on a cluster" is well-understood, routine, or conventional because it is directed to "receiving or transmitting data" (MPEP 2106.05(d)). This is an additional element that the courts have recognized as well-understood, routine, or conventional (MPEP 2106.05(d)). The citation of court cases in the MPEP meets the Berkheimer evidentiary burden since citation of a court case in the MPEP is one of the four types of evidentiary support that can be used to prove that the additional elements are well-understood, routine, or conventional (see Berkheimer v. HP, Inc., 125 USPQ2d 1649). Thus, the limitation does not amount to significantly more than the abstract idea. 
The limitations “wherein the first job comprises a plurality of ranks, the plurality of ranks is divisible into equal segments, the cluster comprises a plurality of nodes, and a given node of the plurality of nodes has a total number of processing cores that corresponds with the number of ranks of the segment” merely describe attributes of the technological environment and therefore do not amount to significantly more than the exception itself (MPEP 2106.05(h)).

As per claim 13, it further describes the abstract idea of scheduling.
As per claim 14, it further describes attributes of the technological environment.
As per claim 15, it further describes the abstract idea of scheduling.
As per claim 16, it further describes attributes of the technological environment.

As per claim 17, in step 1 of the 101 analysis, the examiner has determined that the claim is directed to a non-transitory storage medium. Therefore, the claim is directed to one of the four statutory categories of invention.
	Claim 17 is rejected in a similar manner to claim 1. Additionally, claim 17 recites “in response to the second request, schedule processing of the second job to coincide with the processing of the first job, wherein the scheduling of processing of the second job comprises distributing processing of the plurality of ranks of the second job across the set of nodes” which is a mental process. The limitation “a non-transitory storage medium storing machine-readable instructions that, when executed by a machine, cause the machine to” recites a generic computing component, so it does not integrate the judicial exception into a practical application and does not amount to significantly more than the abstract idea. The limitation “receive a second request to process a second job on the cluster” is an insignificant extra-solution activity, so it does not integrate the judicial exception into a practical application and does not amount to significantly more than the abstract idea.

As per claim 18, it further describes the abstract idea of scheduling.
As per claim 19, it further describes the abstract idea of scheduling.
As per claim 20, it further describes the abstract idea of scheduling.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5, 6, 8-14, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Slrum (Consumable Resources in Slurm) in view of Marchand (WO 2020251850 A1).
Slrum was cited in the IDS filed on 10/28/2020.
As per claim 1, Slrum teaches the invention substantially as claimed including a method comprising: receiving a request to process a first job on a cluster, the cluster comprises a plurality of nodes; and in response to the request, scheduling processing of the first job, wherein the scheduling of processing of the first job comprises distributing processing across a set of nodes of the plurality of nodes greater in number than the minimal subset of nodes (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4-5 Job 2 is running on nodes linux01 to linux04. Job 2's allocation is the same as for Slurm's default allocation which is that it uses one CPU on each of the 4 nodes; page 7 lines 14-15 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes); page 6 line 3 The four jobs have been launched and 3 of the jobs are now pending; Job 2 is spread across nodes linux01 to linux04 which means that the processing of Job 2 takes up more than the minimal subset of nodes since Job 2 is utilizing 4 CPUs and node linux04 has 4 CPUs.).
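The mapping above, in which Job 2 occupies more nodes than the minimal subset, can be sketched as follows. The node sizes and the Job 2 allocation are taken from the cited passages; the helper `minimal_node_count` and its largest-node-first selection are assumptions made for illustration and are not described in the reference.

```python
# Example cluster from the cited page 5 lines 15-19: CPUs per node.
cluster_cpus = {"linux01": 2, "linux02": 2, "linux03": 2, "linux04": 4}

# Job 2 allocation from the cited page 7 lines 4-5: one CPU on each node.
job2_alloc = {"linux01": 1, "linux02": 1, "linux03": 1, "linux04": 1}


def minimal_node_count(cpus_needed, node_cpus):
    """Fewest nodes whose combined CPUs cover cpus_needed (largest nodes first)."""
    total = 0
    for count, cpus in enumerate(sorted(node_cpus.values(), reverse=True), start=1):
        total += cpus
        if total >= cpus_needed:
            return count
    raise ValueError("cluster has too few CPUs for the job")


job2_cpus = sum(job2_alloc.values())
minimal = minimal_node_count(job2_cpus, cluster_cpus)
print(job2_cpus, minimal, len(job2_alloc))
```

Job 2 needs 4 CPUs, which linux04 alone could supply, yet the allocation spans all 4 nodes; that is the sense in which the set of nodes is greater in number than the minimal subset.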

Slrum fails to teach wherein the first job comprises a plurality of ranks, the plurality of ranks can be equally divided among a minimal subset of nodes of the plurality of nodes such that all processing cores of the minimal set of nodes correspond to the plurality of ranks, distributing processing of the plurality of ranks across a set of nodes.

However, Marchand teaches wherein the first job comprises a plurality of ranks, the plurality of ranks can be equally divided among a minimal subset of nodes of the plurality of nodes such that all processing cores of the minimal set of nodes correspond to the plurality of ranks, distributing processing of the plurality of ranks across a set of nodes (Fig. 10; [0044] The terms MPI rank, MPI thread, MPI process, tasks, and process can be used interchangeably to identify any executable running on a computing capable device. The executable can itself be a piece of software; [0053] include numerical expressions (e.g. a number) that can specify the MPI rank position difference between communicating MPI processes; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each; [0056] #Ranks=1024 #Nodes=16 #cores/node=64; abstract lines 6-7 a number of dimensions of the tasks. The tasks can be assigned based on a minimization of a number of communications between the nodes; [0058] If Sub-gridding Optimization module 110 and Process Placement Optimization module 120 are integrated as a pre-execution plug-in module, then the number of computer nodes utilized can be set by the user at job submission time, and can be adjusted lower by the workload manager utility; In one example, there are 1024 ranks, 16 nodes, and 64 cores/node and the 1024 ranks are equally divided among a minimal subset of nodes since 1024 divided by 64 is 16.).
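The division relied upon above (1024 ranks, 16 nodes, 64 cores per node) can be checked with a brief sketch. The contiguous block placement used here (rank r goes to node r // 64) is an assumption chosen for illustration and is not asserted to be the placement method of Marchand.

```python
from collections import Counter

RANKS = 1024          # Marchand [0056]: #Ranks=1024
NODES = 16            # Marchand [0056]: #Nodes=16
CORES_PER_NODE = 64   # Marchand [0056]: #cores/node=64

# Assumed block placement: consecutive ranks fill one node before the next.
ranks_per_node = Counter(rank // CORES_PER_NODE for rank in range(RANKS))

print(len(ranks_per_node))           # number of nodes used
print(set(ranks_per_node.values()))  # distinct per-node rank counts
```

Every one of the 16 nodes receives exactly 64 ranks, so each processing core corresponds to one rank and no smaller set of nodes could hold the job.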

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Slrum with the teachings of Marchand to improve MPI process placement (see Marchand [0067] improved MPI process placement). 

	
As per claim 2, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches further comprising scheduling processing of a second job, wherein the scheduling of processing of the second job comprises distributing processing of the second job across the set of nodes, wherein the processing of the first job overlaps, on the same nodes, the processing of the second job in time (page 8 line 3 Job 2, Job 3, and Job 4 are now running concurrently on the cluster; page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04).
Additionally, Marchand teaches distributing processing of a plurality of ranks, processing of the plurality of ranks, processing of the plurality of ranks (Fig. 10; [0056] #Ranks=1024 #Nodes=16 #cores/node=64; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each).
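The overlap on the same nodes described in the cited passages can be sketched as a per-node CPU tally. The allocations follow the quoted text (Job 2 on linux01 to linux04, Job 3 on linux01 to linux03, Job 4 on one idle CPU of linux04); the variable names are hypothetical.

```python
from collections import Counter

cluster_cpus = {"linux01": 2, "linux02": 2, "linux03": 2, "linux04": 4}

# CPUs used per node by each concurrently running job, per the cited passages.
allocations = {
    "Job 2": {"linux01": 1, "linux02": 1, "linux03": 1, "linux04": 1},
    "Job 3": {"linux01": 1, "linux02": 1, "linux03": 1},
    "Job 4": {"linux04": 1},
}

# Sum the CPUs in use on each node across all three jobs.
cpus_in_use = Counter()
for alloc in allocations.values():
    cpus_in_use.update(alloc)

# Nodes on which Job 2 and Job 3 run at the same time.
shared_nodes = sorted(set(allocations["Job 2"]) & set(allocations["Job 3"]))

# Nodes where the tally would exceed the available CPUs (expected: none).
overcommitted = [n for n, used in cpus_in_use.items() if used > cluster_cpus[n]]

print(dict(cpus_in_use), shared_nodes, overcommitted)
```

Job 2 and Job 3 share linux01 through linux03 while running concurrently, and no node is asked for more CPUs than it has.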

As per claim 3, Slrum and Marchand teach the method of claim 2. Slrum specifically teaches wherein the scheduling further comprises staggering start times of the first job relatively to the second job (page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04).
Additionally, Marchand teaches the plurality of ranks (Fig. 10; [0056] #Ranks=1024 #Nodes=16 #cores/node=64; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each).

As per claim 5, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein each node of the set of nodes corresponds to a different operating system instance of a plurality of operating system instances (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors)).

As per claim 6, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein the scheduling further comprises selecting the set of nodes for the scheduling based on each node of the set of nodes being idle before processing of the first job begins (page 7 lines 5-9 Once Job 2 is scheduled and running, nodes linux01, linux02 and linux03 still have one idle CPU each and node linux04 has 3 idle CPUs. The main difference between this approach and the exclusive mode approach described above is that idle CPUs within a node are now allowed to be assigned to other jobs.).

As per claim 8, Slrum and Marchand teach the method of claim 1. Slrum teaches further comprising: determining a first scheduling policy (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04.).
Additionally, Marchand teaches modifying the first scheduling policy to provide a second scheduling policy based on an observed performance of the cluster (Figs 8 and 10; [0052] Real-Time Acquired information 220, which can be obtained by polling compute nodes once MPI processes have been dispatched, or any other suitable mechanism which can provide information regarding the compute node configurations, interconnect configuration, etc. This information can be passed to Sub-gridding Optimization module 110, which can then proceed to Process Placement Optimization module 120 to determine an exemplary sub-gridding solution that can minimize inter-node communications based the operational environment; [0067] Figure 10 shows an exemplary diagram illustrating the data grid 700 from Figure 7 showing improved MPI process placement according to an exemplary embodiment of the present disclosure. As shown in Figure 10, the grouping of MPI processes is different from what is shown in Figure 8; [0086] Thus, it can be possible to optimize communication exchanges in more than one level of communications at a time within a single application. The exemplary system, method and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can also utilize knowledge of the application and system characteristics in order to provide statistics gathering, performance monitoring and analysis).

As per claim 9, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein the scheduling further includes selecting the number of the set of nodes to coincide with a number of nodes per job stripe (page 3 lines 10-11 using the default node selection scheme; page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 5 lines 6-7 Job 2 and Job 3 call for the node count to equal the processor count).
Additionally, Marchand teaches wherein the scheduling further includes selecting the number of the set of nodes to coincide with a user-specified preference of a number of nodes ([0006] As for MPI, the placement of processes can be left to the workload manager, although OpenMPI can provide the ability to manually override that placement through user- supplied rank files; [0058] the number of computer nodes utilized can be set by the user at job submission time, and can be adjusted lower by the workload manager utility.).

As per claim 10, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein each node of the plurality of nodes comprises a plurality of central processing unit (CPU) packages (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors)).
Additionally, Marchand teaches a plurality of graphics processing unit (GPU) packages, field programable gate arrays (FPGAs), or other node accelerators ([0036] The exemplary definition of computing device can also include but is not limited to accelerators, such as, but not limited to, floating point unit, graphics processing unit, field programmable gate array).

As per claim 11, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein the first job is part of a plurality of jobs to be scheduled, the method further comprising: determining a scheduling policy based on at least one characteristic of the cluster and at least one characteristic of the plurality of jobs; and performing the scheduling in response to the scheduling policy (page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04; page 8 lines 1-2 Once Job 2 finishes, Job 5, which was pending, is allocated available resources and is then running).

As per claim 12, Slrum teaches the invention substantially as claimed including a system comprising: a processor; and a memory to store instructions that, when executed by the processor, cause the processor to (page 1 lines 12-14 Consumable resources has been enhanced with several new resources --namely CPU (same as in previous version), Socket, Core, Memory as well as any combination of the logical processors with Memory; page 8 line 13 users have mpi/threaded/openMP programs): receive a request to process a first job on a cluster, the cluster comprises a plurality of nodes, and a given node of the plurality of nodes has a total number of processing cores; in response to the request, scheduling processing of the first job, wherein the scheduling comprises distributing processing across the plurality of nodes including assigning processing to the given node using less than the total number of processing cores of the given node (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4-5 Job 2 is running on nodes linux01 to linux04. Job 2's allocation is the same as for Slurm's default allocation which is that it uses one CPU on each of the 4 nodes; page 7 lines 14-15 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes); page 6 line 3 The four jobs have been launched and 3 of the jobs are now pending).

Slrum fails to teach wherein the first job comprises a plurality of ranks, the plurality of ranks is divisible into equal segments, and a given node of the plurality of nodes has a total number of processing cores that corresponds with the number of ranks of the segment; distributing processing of the plurality of ranks across the plurality of nodes including assigning a number of ranks of the plurality of ranks to the given node.

However, Marchand teaches wherein the first job comprises a plurality of ranks, the plurality of ranks is divisible into equal segments, and a given node of the plurality of nodes has a total number of processing cores that corresponds with the number of ranks of the segment; distributing processing of the plurality of ranks across the plurality of nodes including assigning a number of ranks of the plurality of ranks to the given node (Fig. 10; [0044] The terms MPI rank, MPI thread, MPI process, tasks, and process can be used interchangeably to identify any executable running on a computing capable device. The executable can itself be a piece of software; [0053] include numerical expressions (e.g. a number) that can specify the MPI rank position difference between communicating MPI processes; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each; [0065] Figure 8 illustrates a common default MPI process mapping for a decomposed data grid. As shown in Figure 8, each compute node can 805 have six (6) sub-domains; abstract lines 6-7 a number of dimensions of the tasks).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Slrum with the teachings of Marchand to improve MPI process placement (see Marchand [0067] improved MPI process placement).
	
As per claim 13, Slrum and Marchand teach the system of claim 12. Slrum teaches wherein the instructions, when executed by the processor, further cause the processor to: schedule processing of a second job, comprising distributing processing of the second job across the plurality of nodes, wherein the processing of the first job overlaps in time with the processing of the second job (page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04; page 8 line 3 Job 2, Job 3, and Job 4 are now running concurrently on the cluster).
Additionally, Marchand teaches distributing processing of a plurality of ranks and processing of the plurality of ranks (Fig. 10; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each).

As per claim 14, it is a system claim of claim 5, so it is rejected for the same reasons as claim 5 above. 

As per claim 15, Slrum and Marchand teach the system of claim 12. Slrum specifically teaches wherein the instructions, when executed by the processor, further cause the processor to schedule processing of the first job based on a determined scheduling policy, wherein the determined scheduling policy specifies a number of nodes per job stripe (page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 5 lines 6-7 Job 2 and Job 3 call for the node count to equal the processor count.).

As per claim 16, Slrum and Marchand teach the system of claim 12. Marchand teaches further comprising: a plurality of servers comprising the plurality of nodes, wherein a given server of the plurality of servers comprises the given node and another node of the plurality of nodes ([0036] computers connected by an interconnect fabric can be “compute nodes”).

As per claim 17, it is a non-transitory storage medium claim of claim 1 so it is rejected for the same reasons. Additionally, Slrum teaches a non-transitory storage medium storing machine-readable instructions that, when executed by a machine, cause the machine to: receive a second request to process a second job on the cluster; and in response to the second request, schedule processing of the second job to coincide with the processing of the first job, wherein the scheduling of processing of the second job comprises distributing processing of the second job across the set of nodes (page 1 lines 12-14 Consumable resources has been enhanced with several new resources --namely CPU (same as in previous version), Socket, Core, Memory as well as any combination of the logical processors with Memory; page 8 line 3 Job 2, Job 3, and Job 4 are now running concurrently on the cluster; page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-15 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes); page 6 line 3 The four jobs have been launched and 3 of the jobs are now pending).
Additionally, Marchand teaches distributing processing of the plurality of ranks (Fig. 10; [0056] #Ranks=1024 #Nodes=16 #cores/node=64; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each).

As per claim 18, Slrum and Marchand teach the storage medium of claim 17. Slrum teaches wherein the instructions, when executed by the machine, further cause the machine to determine a first scheduling policy based on characteristics of the plurality of nodes and characteristics of the first job and the second job, and schedule processing of the first job and the second job in response to the first scheduling policy (page 6 line 3 The four jobs have been launched and 3 of the jobs are now pending; page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04; page 8 lines 1-2 Once Job 2 finishes, Job 5, which was pending, is allocated available resources and is then running).

As per claim 19, Slrum and Marchand teach the storage medium of claim 18. Marchand teaches wherein the instructions, when executed by the machine, further cause the machine to: observe a performance of the cluster; modify the first scheduling policy based on the performance to provide a second scheduling policy; and schedule another job based on the second scheduling policy (Figs 8 and 10; [0052] Real-Time Acquired information 220, which can be obtained by polling compute nodes once MPI processes have been dispatched, or any other suitable mechanism which can provide information regarding the compute node configurations, interconnect configuration, etc. This information can be passed to Sub-gridding Optimization module 110, which can then proceed to Process Placement Optimization module 120 to determine an exemplary sub-gridding solution that can minimize inter-node communications based the operational environment; [0067] Figure 10 shows an exemplary diagram illustrating the data grid 700 from Figure 7 showing improved MPI process placement according to an exemplary embodiment of the present disclosure. As shown in Figure 10, the grouping of MPI processes is different from what is shown in Figure 8; [0086] Thus, it can be possible to optimize communication exchanges in more than one level of communications at a time within a single application. The exemplary system, method and computer-accessible medium, according to an exemplary embodiment of the present disclosure, can also utilize knowledge of the application and system characteristics in order to provide statistics gathering, performance monitoring and analysis).

As per claim 20, Slrum and Marchand teach the storage medium of claim 17. Slrum teaches wherein the instructions, when executed by the machine, further cause the machine to determine a scheduling policy (page 3 lines 10-11 using the default node selection scheme; page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04; page 8 lines 1-2 Once Job 2 finishes, Job 5, which was pending, is allocated available resources and is then running.).
Additionally, Marchand teaches determine a scheduling policy based on at least one user-specified preference ([0006] As for MPI, the placement of processes can be left to the workload manager, although OpenMPI can provide the ability to manually override that placement through user-supplied rank files; [0058] the number of computer nodes utilized can be set by the user at job submission time, and can be adjusted lower by the workload manager utility).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Slrum and Marchand, as applied to claim 1 above, in view of Norton et al. (US 20180253315 A1 herein Norton).

As per claim 4, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches the plurality of nodes; and scheduling the processing of the first job further comprises distributing processing (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04.).
Additionally, Marchand teaches distributing processing of the multiple ranks of the plurality of ranks (Fig. 10; [0056] #Ranks=1024 #Nodes=16 #cores/node=64; [0056] Here we see that 1024 MPI processes were spread on 16 compute nodes with 64 cores each).

Slrum and Marchand fail to teach wherein: the plurality of nodes further comprises a plurality of non-uniform memory access (NUMA) domains; and scheduling the processing of the first job further comprises distributing processing with at least some of the NUMA domains.

	However, Norton teaches wherein: the plurality of nodes further comprises a plurality of non-uniform memory access (NUMA) domains; and scheduling the processing of the first job further comprises distributing processing with at least some of the NUMA domains (claim 12 wherein the computing system comprises a plurality of non-uniform memory architecture (NUMA) nodes, the system further to: launch the processes on the nodes based on the parameter specified to the launcher, wherein the parameter indicates a launch policy.).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Slrum and Marchand with the teachings of Norton because Norton’s teaching of NUMA nodes allows for processes to access local memory for faster memory access.
	

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Slrum and Marchand, as applied to claim 1 above, in view of Zhao (CN107291546A).
The claim mappings of Zhao are made with a translation of CN107291546A.
As per claim 7, Slrum and Marchand teach the method of claim 1. Slrum specifically teaches wherein the first job is one of a plurality of jobs to be scheduled, and the scheduling further comprises: determining a first scheduling policy for the plurality of jobs; attempting to schedule the first job based on the first scheduling policy (page 5 lines 15-19 The example cluster is composed of 4 nodes (10 CPUs in total): linux01 (with 2 processors), linux02 (with 2 processors), linux03 (with 2 processors), and linux04 (with 4 processors); page 7 line 4 Job 2 is running on nodes linux01 to linux04; page 7 lines 14-16 Once Job 2 is running, Job 3 is scheduled onto node linux01, linux02, and Linux03 (using one CPU on each of the nodes) and Job 4 is scheduled onto one of the remaining idle CPUs on Linux04; page 3 lines 10-11 using the default node selection scheme).
Additionally, Marchand teaches determining a second scheduling policy based on characteristics of the first job (Figs. 8, 10; [0067] Figure 10 shows an exemplary diagram illustrating the data grid 700 from Figure 7 showing improved MPI process placement according to an exemplary embodiment of the present disclosure. As shown in Figure 10, the grouping of MPI processes is different from what is shown in Figure 8. Sub-gridding Optimization module 110 was used to determine that each node can hold a 2x3 sub-grid rather than a 1x6 sub-grid as shown in Figure 8; [0052] Real-Time Acquired information 220, which can be obtained by polling compute nodes once MPI processes have been dispatched, or any other suitable mechanism which can provide information regarding the compute node configurations, interconnect configuration, etc. This information can be passed to Sub-gridding Optimization module 110, which can then proceed to Process Placement Optimization module 120 to determine an exemplary sub-gridding solution that can minimize inter-node communications based the operational environment.).

Slrum and Marchand fail to teach determining that the first job cannot be scheduled pursuant to the first scheduling policy; determining a second scheduling policy based on the first scheduling policy; and scheduling the first job based on the second scheduling policy.

However, Zhao teaches determining that the first job cannot be scheduled pursuant to the first scheduling policy; determining a second scheduling policy based on the first scheduling policy; and scheduling the first job based on the second scheduling policy ([0137] 505、The resource manager schedules resources for the application according to the first threshold, the second threshold and the first scheduling policy; [0141-0142] 507、The resource manager uses a preset scheduling policy to schedule resources for the application. It should be noted that, if the comprehensive historical resource utilization, comprehensive historical resource usage, and comprehensive historical resource application volume of the application cannot be successfully determined, the first scheduling strategy cannot be used to schedule resources for the application. The scheduling policy is for the application to schedule resources, and the preset scheduling policy may be any scheduling policy among the fair scheduling policy, the capacity scheduling policy, and the first-in-first-out scheduling policy; [0080] Specifically, the first historical feature information needs to be used in the process of using the first scheduling strategy to schedule resources for N applications. Since the first historical feature information includes the historical features of the N applications, according to the first historical feature The information and the first scheduling policy can only schedule resources for N applications in the above R application programs, and the resources used by applications other than the N application programs in the above R application programs can be performed by using a preset scheduling policy. Scheduling, specifically, see below for the relevant description of the preset scheduling policy; [0085] when the resource manager fails to acquire the historical feature information successfully, the resource manager schedules resources for the R applications according to a preset scheduling policy.).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Slrum and Marchand with the teachings of Zhao because Zhao’s teaching of a preset scheduling policy allows for fair scheduling to be done when there is no historical resource utilization information, which is required for the first scheduling policy (see Zhao [0142] if the comprehensive historical resource utilization, comprehensive historical resource usage, and comprehensive historical resource application volume of the application cannot be successfully determined, the first scheduling strategy cannot be used to schedule resources for the application. The scheduling policy is for the application to schedule resources, and the preset scheduling policy may be any scheduling policy among the fair scheduling policy).
	

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HSING CHUN LIN whose telephone number is (571)272-8522.  The examiner can normally be reached on Mon - Fri 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached at (571)272-3756.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MENG AI T AN/Supervisory Patent Examiner, Art Unit 2195



/H.L./Examiner, Art Unit 2195