DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed on May 17, 2022 has been entered.

Status
The instant application, No. 16/177680, has claims 1-5, 7-11, 13-18, and 20 pending.
Claims 6, 12, and 19 are cancelled.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3-4, 7, 9-10, 14, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Correll et al. (Pub. No. US2014/0101636; hereinafter Correll) in view of Doerr et al. (Pub. No. US2014/0351551; hereinafter Doerr) in view of Boger et al. (Pub. No. US2006/0218543; hereinafter Boger) in view of Grover et al. (Pub. No. US2013/0117548; hereinafter Grover).
Regarding claim 1, Correll discloses the following: 
A method, comprising: 
performing a front-end compilation using application source code to generate a plurality of intermediate representations and connectivity information, wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, wherein the connectivity information includes a plurality of connections, and wherein a particular connection specifies a communication between a first task of the plurality of tasks and a second task of the plurality of tasks; 
(Correll teaches performing a front-end compilation using application source code [Abstract; 0039, 0062] to generate a plurality of intermediate representations [0062] and connectivity information, e.g. resource model which has “actor interconnect information” [0062], wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, e.g. “resource selection (806) … may be applicable to the process and the target (or end) application” [0168], wherein the connectivity information includes a plurality of connections – see interconnect [0044], and wherein a particular connection specifies a communication between a first task within a first node of the plurality of tasks and a second task within a second node of the plurality of tasks, e.g. “A Graphical Data Flow Program comprises a plurality of interconnected nodes (blocks), wherein at least a subset of the connections among the nodes visually indicate that data produced by one node is used by another node” [0044])
mapping logical objects included in the application source code to physical resources included in a multi-processor array using the plurality of intermediate representations and the connectivity information to generate a resource map; 
(Correll teaches mapping logical objects included in the application source code to physical resources [0155; TABLE 1; FIG. 7] included in a multi-processor array, e.g. “multi-processor systems” [0144], using the plurality of intermediate representations and the connectivity information to generate a resource map [0154-0155])
selecting a respective implementation for each connection in the plurality of connections; 
(Correll teaches selecting a respective implementation, e.g. “the user operates to select a target device from a plurality of possible target devices for programming or configuration using a graphical program or a program generated based on a graphical program” [0106], for each connection in the plurality of connections, e.g. “Blocks may be designed with dataflow semantics, and may communicate with each other via terminals (on the blocks) through wires connecting the blocks” [0125])
performing a first optimization operation using the plurality intermediate representations to generate a plurality of optimized intermediate representations; 
(Correll teaches performing a first optimization operation, e.g. “optimize the design at various stages of the compilation process according to analysis performed at those stages, to enhance future compilations of the desktop algorithm” [0005], using the plurality intermediate representations to generate a plurality of optimized intermediate representations, as evidenced by “automatic feedback to change compilation/optimization decisions is important” [0156])
re-mapping the logical objects based on results of the first optimization operation; 
(Correll teaches re-mapping the logical objects, e.g. “In one set of embodiments, real information from the compilation may be provided back to earlier stages in the compilation (compile) process, so different decisions may be made to obtain a more optimal result” [0156], “placements may be modified slightly to improve timing paths” [0159], and resource adjustments [0160-0161], based on results of the first optimization operation [0156])
generating executable code
(Correll teaches generating executable code [0008, 0103], e.g. “generated the end "application" (by compiling/synthesizing the target specific code to a corresponding hardware or FPGA implementation, for example)” [0008])
simulating the executable code; and 
(Correll teaches simulating the executable code, e.g. “performing Hardware in the Loop (HIL) simulation. Hardware in the Loop (HIL) refers to the execution of the plant model 94 in real time to test operation of a real controller 92. For example, once the controller 92 has been designed, it may be expensive and complicated to actually test the controller 92 thoroughly in a real plant, e.g., a real car. Thus, the plant model (implemented by a graphical program) is executed in real time to make the real controller 92 "believe" or operate as if it is connected to a real plant, e.g., a real engine” [0105])
modifying at least the application source code based on the simulation result.  
(Correll teaches modifying at least the application source code [0142-0143, 0146-0150] based on the simulation result [0141], “e.g., annotated or modified imperative code, such as C/CUDA (Compute Unified Device Architecture), or OpenCL” [0146])

However, Correll does not disclose the following:
(1)	wherein the multi-processor array includes a plurality of processors and a plurality of data memory routers arranged in an interspersed fashion, wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router;
(2)	generating executable code; and 
(3)	loading the executable code onto the multi-processor array.
Nonetheless, these features would have been obvious, as evidenced by Doerr.
(1) (Doerr teaches that the multi-processor array, e.g. a multi-processor system or MPS, includes “a plurality of processing elements (PEs)” [0096] and “a plurality of data memory routers (DMRs)” [0103] arranged in an interspersed fashion, e.g. “The PEs and DMRs in one MPS embodiment may be interspersed in a substantially homogeneous fashion” [0116], wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router, e.g. “communicate data/instructions with neighboring DMRs, and optionally on through those DMRs to other DMRs” [0131])
(2) (Doerr teaches generating executable code, e.g. “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0037], using the plurality of optimized intermediate representations – the optimized intermediate representations are cited as a Software Programming Model, e.g. “(dynamic) system definitions, i.e., that the tool flow can interpret the system definition” [0189] and “The example code of Code Portion A, filter_kernel.c, approaches the control model, the communication model, and supporting memory structures and processing explicitly, and may be required to be interpreted as such. This does not allow dynamic interactivity to be defined between control, communications, and memory structure in such a way as to intuitively represent a system, to define the system, or to interpret the system in an efficient way” [0190])
(3) (Doerr teaches loading the executable code [0047] onto the multi-processor array or multiprocessor system [0037, 0047], e.g. “operation of the application on the multiprocessor system, where these specified requirements or constraints may be used by a compiler (or other tool) to generate executable code that may be executed efficiently on the system” [0047])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll with the teachings of Doerr. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: applying these teachings of Doerr to the multi-processor array of Correll and to the optimized intermediate representations of Correll.
The motivations for this modification are as follows:
(1) “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0148 – Doerr]
(2) “to effectively represent the system properties in the software programming model, and there are any number of ways to do this, including, but not limited to, creating or expanding an API such as MPI to support ANSI C, creating specific class structures in C++, etc. However, as noted above, the specific lexical representation is not important. Rather, what is important is that the programming model recognizes these (dynamic) system definitions, i.e., that the tool flow can interpret the system definition and then effectively map the system to the target execution model and underlying hardware architecture” [0217 – Doerr].

However, Correll in view of Doerr does not disclose the following:
(1)	simulating the executable code to generate test results 
(2)	performing a second back-end compilation in response to determining further optimization is possible based on the test results; 
Nonetheless, these features would have been obvious, as evidenced by Boger.
(1) (Boger teaches simulating the executable code to generate test results [0053], e.g. “running a series of test cases and verifying the results observed” [0053])
(2) (Boger teaches performing a second back-end compilation [0052-0053; FIG. 5, sequence of Elements 508 → 501 → 502 → 503] in response to determining further optimization is possible based on the test results [0053], e.g. “For example, step 508 might represent execution in a debug mode wherein program execution can be halted in the middle of the program by occurrence of various events, or might represent execution normally by running a series of test cases and verifying the results observed, or some other mode. Any of these processes may be repeated indefinitely (as indicated), or may cause the programmer to return to step 501 to again edit the source, or, in the case of certain compilers which use profiling data collected during execution as part of the optimization process, to return to step 503 to re-compile the code.” [0053].
Please see the annotated figure below, following the sequence from 1 to 4:
[Figure: media_image1.png, annotated with ovals and markers 1 through 4]

[see FIG. 5, Elements 508 → 501 → 502 → 503])
The teachings of Boger would be applied to perform further steps with respect to the optimized intermediate code of Correll in view of Doerr, as well as the logical objects and resource architecture of Correll in view of Doerr.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr with the teachings of Boger. 
One of ordinary skill in the art would recognize the desirability of this modification under Rationale G: Teaching, Suggestion, or Motivation.
The motivation would have been as follows: 
“Collection and analysis of sampled data is typically performed iteratively on different version of program 215, which is repeatedly modified and re-analyzed (using sampled data or otherwise) until a desired result is achieved” [0050 – Boger].

However, Correll in view of Doerr in view of Boger does not disclose the following:
(1)	performing a first back-end compilation using the plurality of optimized intermediate representations to generate assembler code; 
(2)	performing a second optimization operation using the assembler code to generate optimized assembler code; 
(3)	generating executable code using the optimized assembler code;
Nonetheless, these features would have been obvious, as evidenced by Grover.
(1) (Grover teaches performing a back-end compilation using the plurality of optimized intermediate representations to generate assembler code, e.g. “the middle-end generates a second IR for the back-end to process. In particular, the back-end receives the second IR and translates the second IR into assembly-level code” [0006])
(2) (Grover teaches performing a second optimization operation using the assembler code [0007-0008] to generate optimized assembler code, e.g. “One advantage of the techniques disclosed herein is that compiled assembly instructions are automatically checked by the compiler 150 for additional optimization opportunities. Specifically, the compiler 150 is able to detect vectorizable assembly instructions that can be replaced with fewer or simpler vectorized assembly instructions.” [0039])
(3) (Grover teaches generating executable code using the optimized assembler code, e.g. “Moreover, fewer or more efficient assembly instructions results in fewer cycles of any processor tasked to execute the computer program, which directly correlates to energy savings” [0039])
It would be beneficial to apply the teaching of Grover to the executable code of Correll in view of Doerr in view of Boger.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger with the teachings of Grover.
One of ordinary skill in the art would recognize the desirability of this modification under Rationale G: Teaching, Suggestion, or Motivation.
The motivation would have been as follows: “In this manner, the total number of instructions of which the computer program is comprised may potentially be reduced, which increases overall execution efficiency of the computer program” [Grover – 0039]. 
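For orientation only, the sequence of limitations addressed above for claim 1 can be sketched as follows. This is a minimal illustration and not a characterization of any cited reference's implementation; the stage names, the `compile_flow` function, the `further_optimization_possible` callback, and the two-pass cap are all hypothetical, and the ordering is an assumption drawn from the limitations as quoted in this action.

```python
from typing import Callable, List

MAX_BACK_END_PASSES = 2  # assumption: the sketch stops after a second back-end pass


def compile_flow(further_optimization_possible: Callable[[int], bool]) -> List[str]:
    """Return the ordered list of stages executed (hypothetical stage names)."""
    stages = [
        "front_end",    # source -> intermediate representations + connectivity
        "map",          # logical objects -> physical resources (resource map)
        "select_impl",  # one implementation per connection
        "optimize_ir",  # first optimization operation
        "remap",        # re-map based on the optimization results
    ]
    for back_end_pass in range(1, MAX_BACK_END_PASSES + 1):
        stages += [
            f"back_end_{back_end_pass}",  # optimized IR -> assembler code
            "optimize_asm",               # second optimization operation
            "generate_executable",        # from the optimized assembler code
            "simulate",                   # produces test results
        ]
        # Re-enter the back end only if the test results suggest headroom.
        if not further_optimization_possible(back_end_pass):
            break
    stages.append("load")  # load the executable onto the multi-processor array
    return stages
```

Calling `compile_flow(lambda p: False)` yields a single back-end pass followed by loading, while a callback that returns `True` triggers the second back-end compilation before loading.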
Regarding claims 3, 9, and 16, Correll in view of Doerr in view of Boger in view of Grover discloses the following:
wherein mapping the logical objects includes: 
assigning the particular task of the plurality of tasks to a particular processor of the plurality of processors; and 
(Correll teaches assigning the particular task of the plurality of tasks [0155] to a particular processor of the plurality of processors [0146, 0157], e.g. “a particular function or combination of functions or application(s) intended to be physically implemented” [0155] on a particular processor [0157])

Next, Doerr discloses the following:
assigning a variable associated with the particular task to a particular data memory router of the plurality of data memory routers.
***EXAMINER INTERPRETS THIS CLAIM AS HAVING THE SAME SCOPE AS THAT OF CLAIMS 9 and 16.
(Doerr teaches assigning a variable associated with the particular task to a particular data memory router of the plurality of data memory routers or DMRs, e.g. “The variables u, v, w are declared communication variables in the program source code, and assigned to specific memory addresses in the adjacent DMRs” [0137])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll with the teachings of Doerr. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Apply the assigning step of Doerr in accordance with the particular task of Correll. 
The motivation would have been to benefit from multiple communication aspects of the variable – for example: “The variable x is a declared communication variable and assigned to the DMR shown. A communication pathway associated with variable x runs from its assigned DMR via other DMR to an I/O port at the top row. As shown, the two example programs do not communicate with each other, but they can easily be made to communicate by addition of another communication variable to the task 71, and a pathway between its DMR and variable w in the DMR adjacent to task 62” [0137 – Doerr].
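As an illustration of the two assigning steps discussed for claims 3, 9, and 16, the sketch below places each task on a processing element and each communication variable on a data memory router near that task. The grid coordinates, the `assign` function, and the nearest-DMR rule are assumptions made for this example; only the task identifiers (62, 71) and variable names (u, v, w, x) echo the passage of Doerr quoted above.

```python
def assign(tasks, variables, pes, dmrs):
    """Assign tasks to PEs and variables to DMRs (hypothetical policy).

    tasks:     list of task identifiers
    variables: dict mapping variable name -> owning task identifier
    pes, dmrs: lists of (row, col) coordinates on an interspersed grid
    """
    if len(tasks) > len(pes):
        raise ValueError("not enough processing elements for the tasks")
    # One PE per task, in list order (assumed placement policy).
    task_to_pe = dict(zip(tasks, pes))
    var_to_dmr = {}
    for var, task in variables.items():
        row, col = task_to_pe[task]
        # Choose the DMR with the smallest Manhattan distance to the task's PE;
        # ties resolve to the earliest DMR in the list.
        var_to_dmr[var] = min(dmrs, key=lambda d: abs(d[0] - row) + abs(d[1] - col))
    return task_to_pe, var_to_dmr


task_to_pe, var_to_dmr = assign(
    tasks=[62, 71],
    variables={"u": 62, "v": 62, "w": 62, "x": 71},
    pes=[(1, 1), (3, 3)],
    dmrs=[(0, 0), (2, 2), (4, 4)],
)
```

Under these assumed coordinates, the variables of task 62 land on the DMR at (0, 0) and variable x of task 71 lands on the DMR at (2, 2).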
Regarding claims 4, 10, and 17, Correll in view of Doerr in view of Boger in view of Grover discloses the following:
wherein selecting the respective implementation for each connection in the plurality of connections includes selecting a direct memory access for transferring data from a sender to a receiver included in a particular connection of the plurality of connections.  
(Doerr teaches selecting the respective implementation for each connection, e.g. “number of ports ("processor ports"), some of which may be configured for connection to DMRs and others that may be configured for connection to other PEs” [0105], in the plurality of connections [0129, 0134], e.g. “neighboring connections with balanced production and consumption of data-words” [0134] includes selecting a direct memory access for transferring data from a sender to a receiver included in a particular connection of the plurality of connections [0111, 0131-0132], e.g. “A DMA mechanism may allow a given DMR to copy data efficiently to or from other DMRs, or to or from locations external to MPS 10, while PEs are computing results” [0111])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll with the teachings of Doerr. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Apply the selecting step of Doerr in accordance with the implementation of Correll.
The motivation would have been as follows: “A PE may also save a block of data to be transferred in an SM buffer in a neighbor DMR and then direct the neighbor DMR to begin a DMA operation through special SM addresses associated with such operations. This may permit the PE to proceed with other tasks while the neighbor DMR coordinates the DMA transfer of the data” [0132 – Doerr].
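The selecting step discussed for claims 4, 10, and 17 can be pictured with the sketch below, which picks an implementation for each connection and chooses a direct memory access transfer for some of them. The `Impl` enumeration, the `select_implementations` function, and the size threshold are hypothetical; the selection rule is an assumption for illustration and is not taken from Doerr or Correll.

```python
from enum import Enum


class Impl(Enum):
    SHARED_MEMORY = "shared_memory"  # sender and receiver share one DMR
    DMA = "dma"                      # DMR-driven block transfer


def select_implementations(connections, placement, dma_threshold_words=64):
    """Pick one implementation per connection (assumed policy).

    connections: list of (sender, receiver, words_per_transfer)
    placement:   dict mapping task -> DMR coordinates holding its buffers
    """
    chosen = {}
    for sender, receiver, words in connections:
        same_dmr = placement[sender] == placement[receiver]
        if same_dmr and words < dma_threshold_words:
            chosen[(sender, receiver)] = Impl.SHARED_MEMORY
        else:
            # Remote or large transfers use DMA so the PE can keep computing.
            chosen[(sender, receiver)] = Impl.DMA
    return chosen


choices = select_implementations(
    connections=[("a", "b", 8), ("b", "c", 256)],
    placement={"a": (0, 0), "b": (0, 0), "c": (2, 2)},
)
```

Here the small transfer between co-located tasks stays in shared memory, while the larger transfer to a different DMR is assigned DMA.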
Regarding claim 7, Correll discloses the following: 
A computer system, comprising: 
one or more memories configured to store instructions; and 
one or more processors configured to receive instructions from the one or more memories and execute the instructions to cause the computer system to perform operations including: 
performing a front-end compilation using application source code to generate a plurality of intermediate representations and connectivity information, wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, wherein the connectivity information includes a plurality of connections, and wherein a particular connection specifies a communication between a first task of the plurality of tasks and a second task of the plurality of tasks; 
(Correll teaches performing a front-end compilation using application source code [Abstract; 0039, 0062] to generate a plurality of intermediate representations [0062] and connectivity information, e.g. resource model which has “actor interconnect information” [0062], wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, e.g. “resource selection (806) … may be applicable to the process and the target (or end) application” [0168], wherein the connectivity information includes a plurality of connections – see interconnect [0044], and wherein a particular connection specifies a communication between a first task within a first node of the plurality of tasks and a second task within a second node of the plurality of tasks, e.g. “A Graphical Data Flow Program comprises a plurality of interconnected nodes (blocks), wherein at least a subset of the connections among the nodes visually indicate that data produced by one node is used by another node” [0044])
mapping logical objects included in the application source code to physical resources included in a multi-processor array using the plurality of intermediate representations and the connectivity information to generate a resource map; 
(Correll teaches mapping logical objects included in the application source code to physical resources [0155; TABLE 1; FIG. 7] included in a multi-processor array, e.g. “multi-processor systems” [0144], using the plurality of intermediate representations and the connectivity information to generate a resource map [0154-0155])
selecting a respective implementation for each connection in the plurality of connections; 
(Correll teaches selecting a respective implementation, e.g. “the user operates to select a target device from a plurality of possible target devices for programming or configuration using a graphical program or a program generated based on a graphical program” [0106], for each connection in the plurality of connections, e.g. “Blocks may be designed with dataflow semantics, and may communicate with each other via terminals (on the blocks) through wires connecting the blocks” [0125])
performing a first optimization operation using the plurality intermediate representations to generate a plurality of optimized intermediate representations; 
(Correll teaches performing a first optimization operation, e.g. “optimize the design at various stages of the compilation process according to analysis performed at those stages, to enhance future compilations of the desktop algorithm” [0005], using the plurality intermediate representations to generate a plurality of optimized intermediate representations, as evidenced by “automatic feedback to change compilation/optimization decisions is important” [0156])
re-mapping the logical objects; 
(Correll teaches re-mapping the logical objects, e.g. “In one set of embodiments, real information from the compilation may be provided back to earlier stages in the compilation (compile) process, so different decisions may be made to obtain a more optimal result” [0156], “placements may be modified slightly to improve timing paths” [0159], and resource adjustments [0160-0161])
generating executable code
(Correll teaches generating executable code [0008, 0103], e.g. “generated the end "application" (by compiling/synthesizing the target specific code to a corresponding hardware or FPGA implementation, for example)” [0008])
simulating the executable code; and 
(Correll teaches simulating the executable code, e.g. “performing Hardware in the Loop (HIL) simulation. Hardware in the Loop (HIL) refers to the execution of the plant model 94 in real time to test operation of a real controller 92. For example, once the controller 92 has been designed, it may be expensive and complicated to actually test the controller 92 thoroughly in a real plant, e.g., a real car. Thus, the plant model (implemented by a graphical program) is executed in real time to make the real controller 92 "believe" or operate as if it is connected to a real plant, e.g., a real engine” [0105])
modifying at least the application source code based on the simulation result.  
(Correll teaches modifying at least the application source code [0142-0143, 0146-0150] based on the simulation result [0141], “e.g., annotated or modified imperative code, such as C/CUDA (Compute Unified Device Architecture), or OpenCL” [0146])

However, Correll does not disclose the following:
wherein mapping the logical objects includes: 
(1)	mapping the first task and the second task to a common physical resource specified in a plurality of constraints; and 
(2)	mapping a communication task of the plurality of tasks to a particular physical resource based on a bandwidth constraint included in the plurality of constraints;
(3)	wherein the multi-processor array includes a plurality of processors and a plurality of data memory routers arranged in an interspersed fashion, wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router;
(4)	generating executable code using the optimized assembler code; and 
(5)	otherwise, loading the executable code onto the multi-processor array.
Nonetheless, these features would have been obvious, as evidenced by Doerr.
wherein mapping the logical objects includes: 
(1) (Doerr teaches mapping the first task, see taskID=62 “assigned to a specific PE in the upper left corner of the array” [0137], and the second task, see taskID=71 “assigned to a specific PE in the interior of the array” [0137], to a common physical resource, namely “array of PE (squares) uniformly interspersed with a 9.times.9 array of DMR (circles)” [0137], specified in a plurality of constraints [0089, 0158], e.g. “allowing users to specify various requirements or constraints regarding operation of the system, e.g., operation of the application on the multiprocessor system, where these specified requirements or constraints may be used by a compiler (or other tool) to generate executable code that may be executed efficiently on the system” [0158])
(2) (Doerr teaches mapping a communication task of the plurality of tasks to a particular physical resource [0131, 0136-0137, 0188], e.g. “communication between nodes may be under programmer control” [0131] with consideration that “Each DMR may communicate with neighboring DMR or chip I/O ports to setup communication pathways and send/receive messages on said pathways” [0136] and “A communication pathway associated with variable x runs from its assigned DMR via other DMR to an I/O port at the top row. As shown, the two example programs do not communicate with each other, but they can easily be made to communicate by addition of another communication variable to the task 71, and a pathway between its DMR and variable w in the DMR adjacent to task 62” [0137], based on a bandwidth constraint included in the plurality of constraints [0123, 0190, 0197, 0208], e.g. “maximize the bandwidth of data movement between them as well as data movement on and off the chip” and “a DMR, embedded in an SRF that may provide significantly higher total bandwidth than a bus-oriented architecture” [0123])
(3) (Doerr teaches that the multi-processor array, e.g. a multi-processor system or MPS, includes “a plurality of processing elements (PEs)” [0096] and “a plurality of data memory routers (DMRs)” [0103] arranged in an interspersed fashion, e.g. “The PEs and DMRs in one MPS embodiment may be interspersed in a substantially homogeneous fashion” [0116], wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router, e.g. “communicate data/instructions with neighboring DMRs, and optionally on through those DMRs to other DMRs” [0131])
(4) (Doerr teaches generating executable code using the optimized assembler code – see citations below: 
-  “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0037] 
- “(dynamic) system definitions, i.e., that the tool flow can interpret the system definition” [0189] 
- “The example code of Code Portion A, filter_kernel.c, approaches the control model, the communication model, and supporting memory structures and processing explicitly, and may be required to be interpreted as such. This does not allow dynamic interactivity to be defined between control, communications, and memory structure in such a way as to intuitively represent a system, to define the system, or to interpret the system in an efficient way” [0190])
(5) (Doerr teaches otherwise loading the executable code [0047] onto the multi-processor array or multiprocessor system [0037, 0047], e.g. “operation of the application on the multiprocessor system, where these specified requirements or constraints may be used by a compiler (or other tool) to generate executable code that may be executed efficiently on the system” [0047])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll with the teachings of Doerr. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: applying these teachings of Doerr to the multi-processor array of Correll and to the optimized intermediate representations of Correll.
The motivations for this modification are as follows:
(1) “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0148 – Doerr]
(2) “to effectively represent the system properties in the software programming model, and there are any number of ways to do this, including, but not limited to, creating or expanding an API such as MPI to support ANSI C, creating specific class structures in C++, etc. However, as noted above, the specific lexical representation is not important. Rather, what is important is that the programming model recognizes these (dynamic) system definitions, i.e., that the tool flow can interpret the system definition and then effectively map the system to the target execution model and underlying hardware architecture” [0217 – Doerr].

However, Correll in view of Doerr does not disclose the following:
(1)	 simulating the executable code to generate a simulation result that includes performance information 
(2)	in response to determining performance optimizations are possible based on the performance information included in the simulation result, re-mapping the logical objects based on the performance information 
(3)	in response to determining that there is a behavioral issue with executable code 
Nonetheless, these features would have been obvious, as evidenced by Boger.
(1) (Boger teaches simulating the executable code to generate a simulation result [0053] that includes performance information, e.g. “certain performance characteristics of a computer program (the "monitored program") by executing the monitored program under simulated or actual conditions, while running the sample collector 221” [0046] and “running a series of test cases and verifying the results observed” [0053])
(2) (Boger teaches in response to determining performance optimizations are possible based on the performance information included in the simulation result [0029, 0053], e.g. “that the programmer might correct defects or write code in a more efficient manner. It also might be used, in some circumstances, as input to an optimizing compiler, which would have the capability to employ alternative coding techniques (e.g. inlining) to avoid performance bottlenecks arising from certain frequently executed procedures” [0029], re-mapping the logical objects via code optimizations [0006, 0039, 0052-0053] based on the performance information [0046, 0055])
(3) (Boger teaches in response to determining that there is a behavioral issue with executable code, e.g. “logical errors, inefficiencies, or other problems with the code” [0007], based on the simulation result, modifying/editing at least the application source code based on the simulation result [0052, 0053]

Please see evidence on figure below – following sequence from 1 to 4: 
[Figure: media_image1.png, greyscale image annotated with markers 1 through 4]

[see FIG. 5, Elements 508 → 501 → 502 → 503])
Apply the teachings of Boger to perform further steps with respect to the optimized intermediate code of Correll in view of Doerr, as well as the logical objects and resource architecture of Correll in view of Doerr.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr with the teachings of Boger. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Rationale G: Teaching, Suggestion, and Motivation.
The motivation would have been as follows: 
“Collection and analysis of sampled data is typically performed iteratively on different version of program 215, which is repeatedly modified and re-analyzed (using sampled data or otherwise) until a desired result is achieved” [0050 – Boger].

However, Correll in view of Doerr in view of Boger does not disclose the following:
(1)	performing a first back-end compilation using the plurality of optimized intermediate representations to generate assembler code; 
(2)	performing a second optimization operation using the assembler code to generate optimized assembler code; 
(3)	generating executable code using the optimized assembler code;
Nonetheless, these features would have been made obvious, as evidenced by Grover.
(1) (Grover teaches performing a back-end compilation using the plurality of optimized intermediate representations to generate assembler code, e.g. “the middle-end generates a second IR for the back-end to process. In particular, the back-end receives the second IR and translates the second IR into assembly-level code” [0006])
(2) (Grover teaches performing a second optimization operation using the assembler code [0007-0008] to generate optimized assembler code, e.g. “One advantage of the techniques disclosed herein is that compiled assembly instructions are automatically checked by the compiler 150 for additional optimization opportunities. Specifically, the compiler 150 is able to detect vectorizable assembly instructions that can be replaced with fewer or simpler vectorized assembly instructions.” [0039])
(3) (Grover teaches generating executable code using the optimized assembler code, e.g. “Moreover, fewer or more efficient assembly instructions results in fewer cycles of any processor tasked to execute the computer program, which directly correlates to energy savings” [0039])
It would be beneficial to apply the teaching of Grover to the executable code of Correll in view of Doerr in view of Boger.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger with the teachings of Grover. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Rationale G. Teaching, Suggestion, and Motivation.
The motivation would have been as follows: “In this manner, the total number of instructions of which the computer program is comprised may potentially be reduced, which increases overall execution efficiency of the computer program” [Grover – 0039]. 
Regarding claim 14, Correll discloses the following: 
(Currently Amended) A non-transitory computer-accessible storage medium having programming instructions stored therein that, in response to execution by a computer system, causes the computer system to perform operations comprising: 
performing a front-end compilation using application source code to generate a plurality of intermediate representations and connectivity information, wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, wherein the connectivity information includes a plurality of connections, and wherein a particular connection specifies a communication between a first task of the plurality of tasks and a second task of the plurality of tasks; 
(Correll teaches performing a front-end compilation using application source code [Abstract; 0039, 0062] to generate a plurality of intermediate representations [0062] and connectivity information, e.g. resource model which has “actor interconnect information” [0062], wherein a particular intermediate representation of the plurality of intermediate representations corresponds to a particular task of a plurality of tasks, e.g. “resource selection (806) … may be applicable to the process and the target (or end) application” [0168], wherein the connectivity information includes a plurality of connections – see interconnect [0044], and wherein a particular connection specifies a communication between a first task within a first node of the plurality of tasks and a second task within a second node of the plurality of tasks, e.g. “A Graphical Data Flow Program comprises a plurality of interconnected nodes (blocks), wherein at least a subset of the connections among the nodes visually indicate that data produced by one node is used by another node” [0044])
mapping logical objects included in the application source code to physical resources included in a multi-processor array using the plurality of intermediate representations and the connectivity information to generate a resource map,
(Correll teaches mapping logical objects included in the application source code to physical resources [0155; TABLE 1; FIG. 7] included in a multi-processor array, e.g. “multi-processor systems” [0144], using the plurality of intermediate representations and the connectivity information to generate a resource map [0154-0155])
selecting a respective implementation for each connection in the plurality of connections; 
(Correll teaches selecting a respective implementation, e.g. “the user operates to select a target device from a plurality of possible target devices for programming or configuration using a graphical program or a program generated based on a graphical program” [0106], for each connection in the plurality of connections, e.g. “Blocks may be designed with dataflow semantics, and may communicate with each other via terminals (on the blocks) through wires connecting the blocks” [0125])
performing a first optimization operation using the plurality intermediate representations to generate a plurality of optimized intermediate representations; 
(Correll teaches performing a first optimization operation, e.g. “optimize the design at various stages of the compilation process according to analysis performed at those stages, to enhance future compilations of the desktop algorithm” [0005], using the plurality intermediate representations to generate a plurality of optimized intermediate representations, as evidenced by “automatic feedback to change compilation/optimization decisions is important” [0156])
re-mapping the logical objects based on results of the first optimization operation; 
(Correll teaches re-mapping the logical objects, e.g. “In one set of embodiments, real information from the compilation may be provided back to earlier stages in the compilation (compile) process, so different decisions may be made to obtain a more optimal result” [0156], “placements may be modified slightly to improve timing paths” [0159], and resource adjustments [0160-0161], based on results of the first optimization operation [0156])
generating executable code
(Correll teaches generating executable code [0008, 0103], e.g. “generated the end "application" (by compiling/synthesizing the target specific code to a corresponding hardware or FPGA implementation, for example)” [0008])
simulating the executable code; and 
(Correll teaches simulating the executable code, e.g. “performing Hardware in the Loop (HIL) simulation. Hardware in the Loop (HIL) refers to the execution of the plant model 94 in real time to test operation of a real controller 92. For example, once the controller 92 has been designed, it may be expensive and complicated to actually test the controller 92 thoroughly in a real plant, e.g., a real car. Thus, the plant model (implemented by a graphical program) is executed in real time to make the real controller 92 "believe" or operate as if it is connected to a real plant, e.g., a real engine” [0105])

However, Correll does not disclose the following:
(1)	wherein the multi-processor array includes a plurality of processors and a plurality of data memory routers arranged in an interspersed fashion, wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router;
(2)	generating executable code; and 
(3)	loading the executable code onto the multi-processor array.
Nonetheless, these features would have been made obvious, as evidenced by Doerr.
(1) (Doerr teaches that the multi-processor array, e.g. a multi-processor system or MPS, includes “a plurality of processing elements (PEs)” [0096] and “a plurality of data memory routers (DMRs)” [0103] arranged in an interspersed fashion, e.g. “The PEs and DMRs in one MPS embodiment may be interspersed in a substantially homogeneous fashion” [0116], wherein the plurality of data memory routers includes a given data memory router configured to transfer instructions and data to a different data memory router, e.g. “communicate data/instructions with neighboring DMRs, and optionally on through those DMRs to other DMRs” [0131])
(2) (Doerr teaches generating executable code, e.g. “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0037], using the plurality of optimized intermediate representations – the optimized intermediate representations are cited as a Software Programming Model, e.g. “(dynamic) system definitions, i.e., that the tool flow can interpret the system definition” [0189] and “The example code of Code Portion A, filter_kernel.c, approaches the control model, the communication model, and supporting memory structures and processing explicitly, and may be required to be interpreted as such. This does not allow dynamic interactivity to be defined between control, communications, and memory structure in such a way as to intuitively represent a system, to define the system, or to interpret the system in an efficient way” [0190])
(3) (Doerr teaches loading the executable code [0047] onto the multi-processor array or multiprocessor system [0037, 0047], e.g. “operation of the application on the multiprocessor system, where these specified requirements or constraints may be used by a compiler (or other tool) to generate executable code that may be executed efficiently on the system” [0047])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll with the teachings of Doerr. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Apply these teachings of Doerr to the multi-processor array of Correll and the optimized intermediate representations of Correll. 
There are motivations to perform these steps as follows: 
(1) “to generate an executable program that is deployable to the multiprocessor system for efficient parallel execution” [0148]
(2) “to effectively represent the system properties in the software programming model, and there are any number of ways to do this, including, but not limited to, creating or expanding an API such as MPI to support ANSI C, creating specific class structures in C++, etc. However, as noted above, the specific lexical representation is not important. Rather, what is important is that the programming model recognizes these (dynamic) system definitions, i.e., that the tool flow can interpret the system definition and then effectively map the system to the target execution model and underlying hardware architecture” [0217 – Doerr].

However, Correll in view of Doerr does not disclose the following:
(1)	simulating the executable code to generate test results that include performance information 
(2)	performing a second back-end compilation in response to determining further optimization is possible based on the test results; 
Nonetheless, these features would have been made obvious, as evidenced by Boger.
(1) (Boger teaches simulating the executable code to generate test results [0053] that include performance information [0046], e.g. “certain performance characteristics of a computer program (the "monitored program") by executing the monitored program under simulated or actual conditions, while running the sample collector 221” [0046] and “running a series of test cases and verifying the results observed” [0053])
(2) (Boger teaches performing a second back-end compilation [0052-0053; FIG. 5, Sequence of Elements 508 → 501 → 502 → 503] in response to determining further optimization is possible based on the test results [0053], e.g. “For example, step 508 might represent execution in a debug mode wherein program execution can be halted in the middle of the program by occurrence of various events, or might represent execution normally by running a series of test cases and verifying the results observed, or some other mode. Any of these processes may be repeated indefinitely (as indicated), or may cause the programmer to return to step 501 to again edit the source, or, in the case of certain compilers which use profiling data collected during execution as part of the optimization process, to return to step 503 to re-compile the code.” [0053]. 

Please see evidence on figure below – following sequence from 1 to 4: 
[Figure: media_image1.png, greyscale image annotated with markers 1 through 4]

[see FIG. 5, Elements 508 → 501 → 502 → 503])
Apply the teachings of Boger to perform further steps with respect to the optimized intermediate code of Correll in view of Doerr, as well as the logical objects and resource architecture of Correll in view of Doerr.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr with the teachings of Boger. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Rationale G: Teaching, Suggestion, and Motivation.
The motivation would have been as follows: 
“Collection and analysis of sampled data is typically performed iteratively on different version of program 215, which is repeatedly modified and re-analyzed (using sampled data or otherwise) until a desired result is achieved” [0050 – Boger].

However, Correll in view of Doerr in view of Boger does not disclose the following:
(1)	performing a first back-end compilation using the plurality of optimized intermediate representations to generate assembler code; 
(2)	performing a second optimization operation using the assembler code to generate optimized assembler code; 
(3)	generating executable code using the optimized assembler code;
Nonetheless, these features would have been made obvious, as evidenced by Grover.
(1) (Grover teaches performing a back-end compilation using the plurality of optimized intermediate representations to generate assembler code, e.g. “the middle-end generates a second IR for the back-end to process. In particular, the back-end receives the second IR and translates the second IR into assembly-level code” [0006])
(2) (Grover teaches performing a second optimization operation using the assembler code [0007-0008] to generate optimized assembler code, e.g. “One advantage of the techniques disclosed herein is that compiled assembly instructions are automatically checked by the compiler 150 for additional optimization opportunities. Specifically, the compiler 150 is able to detect vectorizable assembly instructions that can be replaced with fewer or simpler vectorized assembly instructions.” [0039])
(3) (Grover teaches generating executable code using the optimized assembler code, e.g. “Moreover, fewer or more efficient assembly instructions results in fewer cycles of any processor tasked to execute the computer program, which directly correlates to energy savings” [0039])
It would be beneficial to apply the teaching of Grover to the executable code of Correll in view of Doerr in view of Boger.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger with the teachings of Grover. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Rationale G. Teaching, Suggestion, and Motivation.
The motivation would have been as follows: “In this manner, the total number of instructions of which the computer program is comprised may potentially be reduced, which increases overall execution efficiency of the computer program” [Grover – 0039]. 
Claim(s) 2, 8, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Correll in view of Doerr in view of Boger in view of Grover in view of Vasilliev et al. (Pub. No. US2017/0262567; hereinafter Vasilliev).
Regarding claims 2, 8, and 15, Correll in view of Doerr in view of Boger in view of Grover does not disclose the following: 
wherein performing the front-end compilation includes: 
(1)	parsing the application source code to generate an initial intermediate representation; 
(2)	performing at least one second optimization operation using the initial intermediate representation to generate the plurality of intermediate representations; 
(3)	identifying, using the plurality of intermediate representations, connectivity between the plurality of tasks to generate the plurality of connections; and 
(4)	storing the plurality of intermediate representations and connectivity information in a project database.  
Nonetheless, these features would have been made obvious, as evidenced by Vasilliev.
(1) (Vasilliev teaches parsing the application source code [0077] to generate an initial intermediate representation, e.g. “The IR code may be produced by a front-end compiler, followed by optimization by a llvm-opt or an external optimizer” [0247])
(2) (Vasilliev teaches performing at least one second optimization operation using the initial intermediate representation, e.g. “The LLVM compiler infrastructure may be implemented for indirect compilation and optimization, some details of which are described elsewhere herein. LLVM is a modular chain of software compilation, optimization and linking which utilizes target-independent codes called Intermediate Representation (IR) during the steps of software compilation” [0247], to generate the plurality of intermediate representations, e.g. “IR code may be produced by a front-end compiler” [0247])
(3) (Vasilliev teaches identifying, using the plurality of intermediate representations, connectivity between the plurality of tasks to generate the plurality of connections [0272-0273], e.g. “techniques may be applied to significantly reduce the time of backend compilation. Additionally, choosing the same type of FPGA for every member or nearly every member of the array and applying logical to physical mapping of the segmented kernels may further alleviate complexity of software code partitioning by mapping fixed physical connections of the array of FPGA devices into multiple different virtual topologies” [0272])
(4) (Vasilliev teaches storing the plurality of intermediate representations [0038] and connectivity information, e.g. a graph or control flow information [0306], in a project database [0265-0266, 0308-0309])
Add the teachings of Vasilliev to the teachings of Correll in view of Doerr in view of Boger in view of Grover.
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger in view of Grover with the teachings of Vasilliev. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Rationale A. Combining prior art elements according to known methods to yield predictable results.
The predictable result would be to create “a database of the parameters of the processing specification and specific limitations” and further derive “the optimum partitioning solution” [0309 – Vasilliev].
Claim(s) 5, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Correll in view of Doerr in view of Boger in view of Grover in view of Grover et al. (Pub. No. US2013/0198494 published on August 1, 2013; hereinafter Grover II).
Regarding claims 5, 11, and 18, Correll in view of Doerr in view of Boger in view of Grover discloses the following: 
wherein the multi-processor array includes a plurality of processors, 
(Correll teaches that the multi-processor array includes a plurality of processors [0144])

However, Correll in view of Doerr in view of Boger in view of Grover does not disclose the following:
and wherein performing the first optimization operation using the plurality intermediate representations to generate the plurality of optimized intermediate representations includes vectoring a loop of multiple instructions to utilize a subset of the plurality of processors.  
Nonetheless, this feature would have been made obvious, as evidenced by Grover II.
(Grover II teaches that performing the first optimization operation using the plurality intermediate representations to generate the plurality of optimized intermediate representations [0032] includes vectoring a loop of multiple instructions to utilize a subset of the plurality of processors, e.g. “mapping vectorizable instructions into SSE instructions within a loop construct” [0032])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger in view of Grover with the teachings of Grover II. 
The modification would have been to modify the optimization operation of Correll in view of Doerr in view of Boger in view of Grover with the vectoring step of Grover II. 
The motivation would have been as follows: “Any technically feasible technique may be implemented to optimize CPU instructions generated from the intermediate representation of the PTX instructions. Such optimization may include mapping vectorizable instructions” [0032 – Grover II].
Claim(s) 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Correll in view of Doerr in view of Boger in view of Grover in view of Tuck et al. (Pub. No. US2007/0294681 published on December 20, 2007; hereinafter Tuck).
Regarding claims 13 and 20, Correll in view of Doerr in view of Boger in view of Grover does not disclose the following: 
wherein generating the executable code using the plurality of optimized intermediate representations includes generating a respective object code for each task of the plurality of tasks using a corresponding optimized intermediate representation of the plurality of optimized intermediate representations.  
Nonetheless, this feature would have been made obvious, as evidenced by Tuck.
(Tuck teaches generating the executable code using the plurality of optimized intermediate representations, e.g. “the IR nodes include information used for generating optimized compute kernels for the different types of processing elements of the parallel-processing computer system” [0046], includes generating a respective object code, e.g. one of the objects [0384], for each task of the plurality of tasks/operations [0469] using a corresponding optimized intermediate representation of the plurality of optimized intermediate representations [0465])
At a time prior to the effective filing date of Applicant’s claimed invention, it would have been obvious to modify Correll in view of Doerr in view of Boger in view of Grover with the teachings of Tuck. 
One of ordinary skill in the art would recognize the desirability of performing the following modification: Apply this teaching of Tuck to generate object code for each task of the plurality of tasks of Correll in view of Doerr in view of Boger in view of Grover.
The motivation would have been as follows: “the program execution scheduler 1186 and the program executors 1187 work in concert to execute the compute kernels on selected types of processing elements of a parallel-processing computer system” [0465 – Tuck].

Response to Amendment
Applicant’s arguments, see “REMARKS”, filed May 17, 2022, with respect to claims 1-5, 7-11, 13-18, and 20 have been respectfully considered. However, these arguments are moot in view of the new grounds of rejection.
Examiner performed a further round of search and discovered the following prior art listed below: 
Boger et al. (Pub. No. US2006/0218543; hereinafter Boger) 
Therefore, the independent claims and the dependent claims remain unpatentable under 35 U.S.C. 103. 
Examiner, as a result, maintains all claim rejections under 35 U.S.C. 103.
Examiner recommends that Applicant further amend the claims to overcome the rejection set forth, along with the prior art of record.

Conclusion
The prior art references relied upon in this Office action are those most pertinent to the rejection.

Contact Information
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to Gilles Kepnang whose telephone number is (571) 270-7417. Business hours for Examiner are Monday – Friday (8:00 AM – 5:00 PM).
If attempts to reach the Examiner by telephone are unsuccessful, please contact Lewis Bullock (571) 272-3759. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GILLES R KEPNANG/Examiner, Art Unit 2199
February 8, 2022

/LEWIS A BULLOCK JR/Supervisory Patent Examiner, Art Unit 2199