DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 1, 8, 10, 20 are objected to because of the following informalities:  
A) “pro-cessing” (Claim 1, lines 3, 6);
B) “por-tion” (Claim 8, line 3);
C) “com-prises” (Claim 10, line 6);
D) “mo-dule” (Claim 10, line 9);
E) “exe-cuted” (Claim 10, line 11; see also claim 20).
Suggestion for correction: Delete the hyphen. Appropriate correction is required.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-6, 8, 10, 11, 14, and 17-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Li et al. 20080163183.
As to claim 1, Li teaches a method comprising (see [0020][0021]):
receiving (e.g. by the task assignment decision), by a processing device [system 100: task practitioner 200] (see task practitioner 200 is a component of compiler 104 of a system 100), a first request [M(v)=1] (see dependent on the solution to the task assignment decision, a subsequent instruction may be executed to offload task execution to the helper core(s) in [0019]; see also the control transfer message in [0020]); 
identifying (e.g. the instruction address for the task) one or more instructions of the program [task] to be offloaded (e.g. by the offloading) to a second processing device [helping core(s)] (see the content transfer message may be, for example, one or more of get, store, push, and/or pull messages to transfer instruction(s) and/or data from the main core local memory to the helper core(s) local memory, which may be in the same or different address space(s) in [0020][0021]), 
wherein the processing device (fig.1 for the system that includes the compiler; see also the machine executable code of the compiler with parameterized offloading which may be executed by the CMP system 500 in fig.5, [0045]) comprises a same instruction set architecture [CMP] as the second processing device [helper core(s)] (see CMP is referring to the single chip multiprocessor system in [0004] for the background teaching); 
providing the one or more instructions [task] to a memory module [cores 502a-n] comprising the second processing device [processing core 502n] (see fig.5, [0046],  for the structural components of the illustrated example of the main core [502a] and helping core 
responsive to detecting an indication [control message] to execute the one or more instructions [task], causing, by the processing device [task practitioner 200] (see task practitioner 200 is a component of compiler 104 of a system 100), the second processing device [helper cores 502b-n] to execute the one or more instructions [task] (see [0021], in addition to the content transfer message(s), the task practitioner 200 of the illustrated example also inserts a control transfer message(s) to signal a control transfer of one or more tasks to the helper core(s) after the conditional statement evaluates the task assignment decision and determines to offload the task execution (e.g., M(v)=1 to offload a task). The control message(s) may include, for example, an identification of the set or subset of the helper cores to execute the task(s)), the one or more instructions [task] to update (e.g. by the lock) a portion (e.g. by the memory address or pointer) of a memory space associated with the memory module (see [0021], the instruction address(es), the pointer to the memory address, which is unknown until run time for the task(s), for execution context (e.g. stack frame)).
As to claim 2, Li teaches wherein identifying the one or more instructions of the program to be offloaded (see [0017] showing the metadata information; see [0020][0021] for more details of the metadata) further comprises: 
identifying metadata information [M(v)] embedded in the program associated with the one or more instructions [task]; and 

As to claim 3, Li teaches wherein identifying the one or more instructions of the program [task] to be offloaded further comprises: 
determining (e.g. by the cost estimation) that a number of operations [execution count] performed by the one or more instructions [task] satisfies a threshold number [count] (see the computation cost C.sub.h(v) may be, for example, the sum of the products of the average time to execute an instruction i on the helper core(s) and the execution count of the instruction i in task v in [0033]).
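For illustration only, the computation cost quoted above from Li [0033] can be sketched as a sum of products. The function names, the dictionary layout, and the threshold comparison on the total execution count are assumptions made for this sketch; they are not taken from Li or from the claims.

```python
def helper_compute_cost(avg_time, exec_count):
    """Sum, over each instruction i in the task, of
    (average time to execute i on the helper core) * (execution count of i)."""
    return sum(avg_time[i] * exec_count[i] for i in exec_count)


def satisfies_threshold(exec_count, threshold):
    """Hypothetical check: total execution count meets a threshold number."""
    return sum(exec_count.values()) >= threshold


# Hypothetical per-instruction figures for one task v.
avg_time = {"add": 2, "mul": 3}
exec_count = {"add": 10, "mul": 5}
cost = helper_compute_cost(avg_time, exec_count)
```

Under these assumed figures the cost is 2*10 + 3*5 = 35, and the total execution count is 15.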
As to claim 4, Li teaches wherein providing the one or more instructions to the memory module [502] further comprises: 
determining (e.g. defined by the arguments or transfer/control messages) a memory space [main/local memory address] updated (e.g. being used, allocated, assigned) by the one or more instructions [task] (see [0020]; see also [0021], the pointer to the memory address); and 
determining (e.g. by identifying) a memory module [helper core identifier(s)] associated with at least a portion of the memory space [size of the block to push/store, the main core memory address of the block to push/store, and/or the local address of the block], 
wherein the memory module [cores 502a-n] comprises the second processing device [helper core 502b] ([0020]; see also [0045] that teaches for example core 502a implemented as a main core, and core 502b as a helper core; see [0045][0046], each core 502 may also include private memories [506][508][510]).

As to claim 6, Li teaches wherein causing the second processing device [helper core] to execute the one or more instructions [task] comprises: 
at least one of sending a request [control message(s)] to the second processing device [helper core 502] to execute the one or more instructions [task], or modifying (e.g. address/pointer by the control message(s)) an area of memory [instruction address in the address space or pointer to the memory address known at run time] of the memory module [502] to cause the second processing device [helper core] to execute the one or more instructions [task] (see [0021]).
As to claim 8, Li teaches further comprising: receiving a second notification [control message: complete] from the second processing device [helper core] that indicates that the one or more instructions [task] have been executed [task execution has been completed] by the second processing device [helper core] to update (e.g. by the address pointer at run time) the portion of the memory space. (See [0022]; see also [0021], a pointer to the memory address, which is unknown until run time for the task(s)).
As to claim 10, Li teaches a system comprising (see fig.5 [500] for the structural components; see also fig.1 for the compiler and the object code that can be executed by chip multiprocessor 500 in fig.5): 

a processing device [system 100: task practitioner 200] (see task practitioner 200 is a component of compiler 104 of a system 100 which includes at least the compiler 104 and object code 106 for offloading; FIG. 5, [0045], illustrates an example chip multiprocessor ("CMP") system 500 that may execute the object code 106 of FIG. 1 that includes parameterized offloading), operatively coupled to the memory [506a-n, 508a-n, 510a-n], to: 
receive (e.g. by the task assignment decision) a first request [M(v)=1] to execute one or more instructions of a program (see dependent on the solution to the task assignment decision, a subsequent instruction may be executed to offload task execution to the helper core(s) in [0019]; see also the control transfer message in [0020]); 
responsive to determining [M(v)=1] that the one or more instructions of the program [task] are to be offloaded to a second processing device [helping core(s)] (see the content transfer message may be, for example, one or more of get, store, push, and/or pull messages to transfer instruction(s) and/or data from the main core local memory to the helper core(s) local memory, which may be in the same or different address space(s) in [0020][0021]), 
wherein the second processing device [helper core(s)] comprises a same instruction set architecture [CMP] as the processing device (See fig.1 for the system that includes the compiler; see also the machine executable code of the compiler with parameterized offloading which may be executed by the CMP system 500 in fig.5; see also CMP is referring to the single chip multiprocessor system in [0004] for the background teaching);
determine (e.g. by the control message) a memory module [cores 502n] associated with at least a portion of a memory space [instruction address/address pointer] updated (e.g. by the 
wherein the memory module [cores 502n] comprises the second processing device 502n (see fig.5, [0046], for the structural components of the main core [502a] and helper core [502b/n], each core [502a-n] includes a processor core and private caches [I 506][D 508][L2 unified cache 510]); 
provide, to the memory module [502n], the one or more instructions to be executed by the second processing device [helper cores 502] (see [0020], the content transfer message to offload or transfer the task/instruction(s) to the helper core(s)); and 
provide, to the second processing device [helper core], an indication [control message] to cause the second processing device [helper core] to execute the one or more instructions [task] to update (e.g. by the pointer to the memory address at run time) the portion of the memory space of the memory module [helper core 502]. (see [0021], in addition to the content transfer message(s), the task practitioner 200 of the illustrated example also inserts a control transfer message(s) to signal a control transfer of one or more tasks to the helper core(s) after the conditional statement evaluates the task assignment decision and determines to offload the task execution (e.g., M(v)=1 to offload a task). The control message(s) may include, for example, an identification of the set or subset of the helper cores to execute the task(s); see in 
As to claim 11, Li teaches wherein the processing device is further to: responsive to determining (e.g. by indicating M(v)=0) that the one or more instructions are not to be offloaded to the second processing device [helper core] (see M(v)=0), execute the one or more instructions [task] to update (e.g. being assigned) the portion of the memory space [address of the task] (see each task is assigned to execute on a main core or helper core based on set of parameters in [0019]; see [0016], during execution, a task may be fused, aligned, and/or split for optimal use of local memory. That is, tasks need not be consecutive addresses of machine readable instructions in local memory).
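For illustration only, the task assignment decision as characterized above (M(v)=1 to offload, M(v)=0 to execute on the main core) can be sketched as a simple dispatch. The function name, the list-based message logs, and the message tuples are assumptions made for this sketch and do not reproduce Li's compiler-inserted code.

```python
def dispatch(task, m_v, main_core_log, helper_core_log):
    """Route one task according to the assignment decision M(v).

    m_v == 1: record a content transfer message followed by a control
              transfer message for the helper core (offload).
    m_v == 0: record execution of the task on the main core.
    """
    if m_v == 1:
        helper_core_log.append(("content_transfer", task))
        helper_core_log.append(("control_transfer", task))
        return "helper"
    main_core_log.append(("execute", task))
    return "main"


main_log, helper_log = [], []
first = dispatch("task_a", 1, main_log, helper_log)   # offloaded
second = dispatch("task_b", 0, main_log, helper_log)  # kept on main core
```

In this sketch the content transfer always precedes the control transfer, mirroring the order in which the two message types are discussed in the mapping above.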
As to claim 14, Li teaches wherein to determine that the one or more instructions of the program [task] are to be offloaded [to offload] to a second processing device [helper core], the processing device is to: 
determine that the portion of the memory space [consecutive segment/sequential loop] updated by the one or more instructions [task] comprises a linear range [consecutive/sequential] of memory addresses for the memory module [consecutive segment/sequential loop in local memory]. (See a task may be a consecutive segment of the source code, and the tasks may not be (or may be) in consecutive addresses of the local memory in [0016]. Note: examiner holds that both consecutive addresses and non-consecutive addresses are applicable in Li).
As to claim 17, claim 17 includes similar limitations of claim 1 except it is directed to a non-transitory computer readable medium comprising instructions (see claim 17, preamble). Li 
Similarly, dependent claims 18 and 19 correspond to dependent claims 4 and 8, respectively, and are rejected for the same reasons as claims 4 and 8 above.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Li et al. 20080163183 in view of Koike et al. 20050151996.
As to claim 7, Li does not but Koike teaches:
receiving a second notification [transmit link rejected from T303] from the second processing device [slave machine] that indicates that the one or more instructions [instruction to execute link copying] are not to be executed by the second processing device [slave machine] to update the portion of the memory space [the accumulated image from the memory being transferred to the slave machine] (see when the response from the slave machine is Link Rejected, the master machine singly executes the copy operation for the link copy job in [0194]; for the update portion of memory space, see [0147], the memory unit 404 accumulates the image data and also transfers the image to the slave machine. See also the introductory teaching in [0046], the slave machine has a link copy function that the slave machine receives an image of a document scanned and transferred by the master machine and shares the job of printing the scanned document image with the master machine).
executing the one or more instructions [copy operation] by the processing device [master machine] to update the portion of the memory space [the accumulated image data stored in the memory for copy operation]. (See when the response from the slave machine is Link Rejected, the master machine singly executes the copy operation for the link copy job in [0194]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to receive a second notification from the second processing device that indicates that the one or more instructions are not to be executed by the second .
Claims 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. 20080163183 in view of Codrescu et al. 20160092238.
As to claim 12, Li does not but Codrescu teaches wherein to determine that the one or more instructions of the program [load instruction(s)] are to be offloaded to a second processing device [coprocessor], the processing device is to: 
determine an instruction type [load] for the one or more instructions [load instruction] (see identifying a first load instruction in [0049]); and 
determine that the one or more instructions [load instruction] satisfy an offloading eligibility threshold [without waiting for the first load data] in view of the instruction type [load] (see the determination that the first load instruction may be offloaded to the coprocessor without waiting for the corresponding first load data in [0049]).

As to claim 13, Li does not but Codrescu teaches wherein the instruction type comprises at least one of a vector instruction type [vector load] or a looping instruction type. (see [0005], certain load instructions, such as vector load instructions are offloaded from the main processor to the coprocessor, after the load instructions are committed in the main processor without receiving corresponding load data).
The reasoning for obviousness given for claim 12 also applies to claim 13 and is not repeated here.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Li et al. 20080163183 in view of Horvath et al. 20070250681.

detect an update (e.g. via the defined starting and ending addresses/locations)  to at least one memory address of the linear range [start-end: serial instructions] of memory addresses [starting to ending address(es)] for the memory module [task queue 210][instruction store 207] (see the definition of a task is simply a starting and ending address that the vector processor needs to execute from and to in its instruction store in [0069]; see also in [0069], the vector processor executes serial instruction sequences starting and ending at locations determined by the scalar processor 206; see also scalar/sequence processor 206 is to interpret control block from the host processor based on the desired action  to load and initiate various tasks that the vector processor needs to perform in [0068]); and 
send a second notification [to load and initiate] to the second processing device [vector processor 209] to cause the second processing device [vector processor 209] to execute the one or more instructions (See scalar/sequence processor 206 is to interpret control block from the host processor based on the desired action to load and initiate various tasks that the vector processor needs to perform in [0068]), 
the one or more instructions [task] to update the remaining (e.g. how many tasks are left) memory addresses [start-end] of the linear range [serial sequence] of memory addresses for the memory module. (See [0069], the Sequence Processor 206 can monitor the progress of the Vector Processor based on how many tasks are left in the Task Queue.  When the Task Queue is empty the Vector Processor remains idle.  When there is at least one task in the Task Queue the Vector Processor begins processing at the starting address in its Instruction Store; serial instruction sequences starting and ending at locations determined by the scalar processor 206).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the detection of an update to at least one memory address of the linear range of memory addresses for the memory module; and send a second notification to the second processing device to cause the second processing device to execute the one or more instructions, the one or more instructions to update the remaining memory addresses of the linear range of memory addresses for the memory module, as claimed (see the details of the claim mapping above), because one of ordinary skill in the art would have recognized the application of a known technique, such as Horvath’s loading and initiation of tasks on the vector coprocessor at the starting to ending locations, to a known device/method, such as the offloading of the task from the main core to the helper core by Li as cited above, in order to allow for full overlapping of the control and data processing within the Vector Coprocessor complex. It also allows the Vector Coprocessor to operate independently from the Host Processor, which provides a greater potential for high system level utilization than in other coprocessor environments. (See Horvath [0070]; MPEP 2143, KSR Example D).
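For illustration only, the task queue behavior cited from Horvath [0069] (a task defined by a starting and ending address in the instruction store, with the vector processor idle when the queue is empty) can be sketched as follows. The function name, the use of a Python list as the instruction store, and the treatment of the ending address as inclusive are assumptions made for this sketch.

```python
from collections import deque


def run_vector_processor(task_queue, instruction_store):
    """Drain the task queue; each task is a (start, end) address pair.

    The ending address is treated as inclusive (an assumption).
    Returns the instructions executed, in order. An empty queue means
    the processor stays idle and nothing is executed.
    """
    executed = []
    while task_queue:
        start, end = task_queue.popleft()
        executed.extend(instruction_store[start:end + 1])
    return executed


store = ["i0", "i1", "i2", "i3"]
queue = deque([(0, 1), (3, 3)])  # two tasks queued by the sequence processor
trace = run_vector_processor(queue, store)
```

The number of entries remaining in the queue is what a monitoring component could inspect to track progress, consistent with the passage quoted above.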

Allowable Subject Matter
Claims 9, 16, 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. None of the prior art of record teaches:

receiving a fourth notification from the third processing device that indicates that the one or more instructions have been executed by the third processing device to update the second portion of the memory space. (Claim 9, see also similarly recited claim 20).
b) The determination of the second memory module associated with a second portion of the memory space updated by the one or more instructions, wherein the second memory module comprises a third processing device; provide the one or more instructions to the second memory module to be executed by the third processing device; send a second notification to the third processing device to cause the third processing device to execute the one or more instructions, the one or more instructions to update the second portion of the memory space of the second memory module; receive a second response from the third processing device that indicates that the one or more instructions have been executed by the third processing device to update the second portion of the memory space. (Claim 16).
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  

b) Thaler et al. 20090300642 is cited for teaching that the main processor 105A may instruct the auxiliary processor 105B to load coded instructions 701 for implanting the data transform into the auxiliary processor's local memory 106B (see [0112]).
c) Amin Ghasemazar et al., “Embedded Complex Floating Point Hardware Accelerator,” IEEE, 2014, is cited for teaching the offloading of a task to a coprocessor (see Abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL H PAN whose telephone number is (571) 272-4172. The examiner can normally be reached M-F 8:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at 571-272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


DANIEL H. PAN
Examiner
Art Unit 2182



/DANIEL H PAN/Primary Examiner, Art Unit 2182