1.	The present application is being examined under the pre-AIA  first to invent provisions. 
DETAILED ACTION
2. 	This Office Action is taken in response to Applicants’ application 17/396,716 filed on 8/8/2021.  
	Claims 1-20 are pending for consideration.

3.					Examiner’s Note
(1) In the case of amending the claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution. MPEP 714.02 recites: “Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.” Amendments not pointing to specific support in the disclosure may be deemed as not complying with the provisions of 37 CFR 1.121(b), (c), (d), and (h) and therefore held not fully responsive. Generic statements such as “Applicants believe no new matter has been introduced” may be deemed insufficient.
(2) Examiner has cited particular columns/paragraphs and line numbers in the references applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or cited by the Examiner.

Double Patenting
4.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees.  See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and, In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application.  See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer.  A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
5.	Claims 1, 8, and 18 are provisionally rejected under the judicially created doctrine of obvious-type double patenting as being unpatentable over claims 1-20 of US Patent number 11,106,592. 
Although not all of the conflicting claims are exactly identical, they are extremely similar and are not patentably distinct from each other as explained below (the limitations presented in bold font are those taught by the second reference):
Application 17/396,716, claim 1:
1. A method comprising:

fetching, by at least one processor, at least one instruction of a first instruction set from a cache memory for execution by the at least one processor, wherein the at least one instruction of the first instruction set is loaded to the cache memory in a fixed-size data block fetched from main memory via a block oriented cache-access path that provides fixed-size data block access to the main memory; and

offloading, by the at least one processor, at least one instruction of a second instruction set for execution by at least one heterogeneous functional unit, wherein the at least one instruction of the second instruction set is fetched directly from the main memory to the at least one heterogeneous functional unit via an address oriented cache-bypass path that provides individually-addressed data access to the main memory.

US Patent 11,106,592, claim 1:
1. A memory system providing memory access for a multiple processor system including at least one processor and at least one heterogeneous functional unit, the at least one processor configured to execute at least one instruction of a first instruction set and the at least one heterogeneous functional unit configured to execute at least one instruction of a second instruction set that is different than the first instruction set, the memory system comprising:

a main memory configured for individually-addressed data access;

a block oriented cache-access path coupling the main memory and a cache memory, wherein the block oriented cache-access path is configured to provide fixed-size data block access to the main memory, and wherein the cache memory provides storage of fixed-size data blocks of main memory data for data access by the at least one processor; and

an address oriented cache-bypass path coupling the main memory and the at least one heterogeneous functional unit, wherein the address oriented cache-bypass path provides individually-addressed data access to the main memory.


6.	Provisional Rejection, Obviousness Type Double Patenting – With Secondary Reference(s)
7.	Claims 1, 8, and 18 are provisionally rejected under the judicially created doctrine of obvious-type double patenting as being unpatentable over claims 1-55 of US Patent number 9,710,384, in view of Castelli et al. (US Patent Application Publication 2002/0099907, hereinafter Castelli).
Although not all of the conflicting claims are exactly identical, they are extremely similar and are not patentably distinct from each other as explained below (the limitations presented in bold font are those taught by the second reference):
Application 17/396,716, claim 1:
1. A method comprising:

fetching, by at least one processor, at least one instruction of a first instruction set from a cache memory for execution by the at least one processor, wherein the at least one instruction of the first instruction set is loaded to the cache memory in a fixed-size data block fetched from main memory via a block oriented cache-access path that provides fixed-size data block access to the main memory; and

offloading, by the at least one processor, at least one instruction of a second instruction set for execution by at least one heterogeneous functional unit, wherein the at least one instruction of the second instruction set is fetched directly from the main memory to the at least one heterogeneous functional unit via an address oriented cache-bypass path that provides individually-addressed data access to the main memory.

US Patent 9,710,384, claim 1:
1. A system comprising:

non-sequential access memory;

a processor that is operable to process a first portion of instructions included in an executable file;

a communication bus via which the processor sends a second portion of instructions with specific syntax as appearing in the executable file to a heterogeneous functional unit, wherein the first portion of instructions includes first instructions that are recognized by a first instruction set of the processor and the second portion of instructions includes second instructions that are not recognized by the first instruction set of the processor;

the heterogeneous functional unit that is operable to execute the second portion of instructions according to the specific syntax;

cache memory;

a cache-access path in which block data is communicated between said non-sequential access memory and said cache memory for accesses of said block data by said processor for processing said first portion of instructions; and

a direct-access path in which individually-addressed data is communicated to/from said non-sequential access memory for accesses of said individually-addressed data by said heterogeneous functional unit for processing said second portion of instructions.


Regarding claim 1 of application 17/396,716, claim 1 of US patent 9,710,384 teaches all the limitations recited in claim 1 (see the comparison above), except the limitation “fixed-size data blocks.”
However, Castelli teaches fixed-size data blocks [FIG. 5 illustrates an example processor cache and main memory organization for the computer architecture of FIG. 4, where the compressed main memory contains a compressed memory directory and a plurality of fixed-size blocks (¶ 0037); The computer system as claimed in claim 23, wherein said array structure is allocated as a chain of fixed-size data blocks each for storing header and trailer data, said second field entry comprising a pointer for pointing directly to a data block for accessing said header and trailer data (claim 26)].
Therefore, it would have been obvious for one of ordinary skill in the art at the time of Applicant’s invention to use fixed-size data blocks, as demonstrated by Castelli, and to incorporate them into the existing apparatus and scheme disclosed by claim 1 of US patent 9,710,384, in order to simplify the data structure and data accessing operations.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

8.	Claims 1-17 are rejected under 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
	Claim 1 recites “A method comprising: fetching, by at least one processor, at least one instruction of a first instruction set from a cache memory for execution by the at least one processor, wherein the at least one instruction of the first instruction set is loaded to the cache memory in a fixed-size data block fetched from main memory via a block oriented cache-access path that provides fixed-size data block access to the main memory.”
	However, the Specification of the current Application only describes fetching data from the cache memory and loading a fixed-size data block from main memory via a block oriented cache-access path that provides fixed-size data block access to the main memory. The Specification never describes fetching instructions from the cache memory, or loading a fixed-size data block that includes instructions from main memory via such a block oriented cache-access path. Thus, the limitations recited in claim 1 lack support in the Specification of the current Application.
	Claims 2-7 are rejected by virtue of their dependency from claim 1. In addition, claims 4 and 6 also recite loading and/or fetching instructions with cache memory, hence suffering from the same deficiency as in claim 1.
	Claim 8 suffers from the same deficiency as claim 1, and is rejected for the same reason as claim 1. In addition, claim 13 also recites loading and/or fetching instructions with cache memory, hence suffering from the same deficiency as claim 1.
	Claims 9-17 are rejected by virtue of their dependency from claim 8.
	Clarifications/corrections are needed.

Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

9.	Claims 1-6, 8-13, and 15-20 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Fossum et al. (US Patent 4,888,679, hereinafter Fossum), and in view of Yoshizawa et al. (US Patent 5,749,089, hereinafter Yoshizawa).
	As to claim 18, Fossum teaches A method comprising: 
accessing, by at least one processor [as shown in figure 7, where the system includes an instruction processing unit (107), a scalar processing unit (108), and a vector processing unit (116)], a first portion of data from a main memory [main memory unit, figure 1, 23; figure 7, 101] via a block oriented cache-access path [as shown in figure 1, where blocks of data is transferred from the main memory (23) to the cache (24); also see figure 7; … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); … The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37)], wherein the block oriented cache-access path is configured to provide fixed-size data block access to the main memory [… The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37); … The block size, for example, is 64 bytes … (c5 L59)]; and 
accessing, by at least one heterogeneous functional unit [either the scalar processor (figure 1, 21; figure 7, 108) or the vector processor (figure 1, 22; figure 7, 116) may be considered as the at least one processor; and then the other processor is considered as the at least one heterogeneous functional unit], a second portion of data from the main memory via an address oriented cache-bypass path, wherein the address oriented cache-bypass path provides individually-addressed data access to the main memory [cache bypass mux, figure 7, 135; … In accordance with an additional feature of the invention, a cache bypass is provided to transmit data directly to the vector processor as the data from the main memory are being stored in the cache.  The bypass saves the time that would otherwise be required in reading the vector data from the cache (c3 1-6); In accordance with an additional feature of the present invention, the cache unit 106 is provided with bypasses which speed the transfer of data from the main memory unit 101 to the vector processing unit 116 when desired vector element data are not found in the data store 114, and consequently have to be fetched from the main memory unit 101 … These gates include a gate 131 for transmitting memory commands from the cache control logic 123 to the main memory unit 101, and a gate 132 for passing physical addresses from the virtual-to physical address translation buffer 112 to the main memory unit 101 (c13 L8-22); … It is also advantageous to provide a cache bypass multiplexer 135 which permits data to be transferred to the scalar processing unit 108 or the vector processing unit 116 at the same time that it is being written into the data store 114.  
In other words, the cache control logic 115 activates the cache bypass multiplexer 135 to transfer data from an internal bus 136 to an output bus 137 whenever the physical address of data being written to the data store 114 matches the physical address corresponding to a virtual address desired by the scalar processing unit 108 or the vector processing unit 116.  The cache bypass multiplexer 135 therefore eliminates in this case the delay that would be required to read the desired data out of the data store 114 (c13 L28 to c14 L13);
Yoshizawa more expressly teaches this limitation – as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows a cache bypassing mode to access main memory directly (ST2, ST3, ST5, ST6, ST4, ST7, and ST8); Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58)], and wherein the second portion of data includes at least one instruction offloaded from the at least one processor to the at least one heterogeneous functional unit for execution by the at least one heterogeneous functional unit [either the scalar processor (figure 1, 21; figure 7, 108) or the vector processor (figure 1, 22; figure 7, 116) may be considered as the at least one processor; and then the other processor is considered as the at least one heterogeneous functional unit].
	Regarding claim 18, Fossum teaches a cache bypassing mode to bypass the cache memory and to access the main memory directly [cache bypass mux, figure 7, 135; … In accordance with an additional feature of the invention, a cache bypass is provided to transmit data directly to the vector processor as the data from the main memory are being stored in the cache.  The bypass saves the time that would otherwise be required in reading the vector data from the cache (c3 L1-6)], but does not expressly teach that the cache-bypass path provides individually-addressed data access to the main memory.
	However, Yoshizawa specifically teaches a cache bypassing path to access the main memory directly using individually-addressed data access [as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows cache bypassing mode to access main memory directly ST2, ST3, ST5, ST6, ST4, ST7, and ST8; Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58)].
	Therefore, it would have been obvious for one of ordinary skill in the art prior to Applicants’ invention to have a cache bypassing path to access the main memory directly using individually-addressed data access, as demonstrated by Yoshizawa, and to incorporate it into the existing apparatus and scheme disclosed by Fossum, in order to support both block-oriented data access and individually-addressed data access.
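For illustration only, the two access paths discussed in the analysis of claim 18 can be modeled with the following minimal sketch. It is not taken from Fossum, Yoshizawa, or the application; the class and method names (MainMemory, Cache, read_block, read_byte) are hypothetical, and the cache omits eviction for brevity. The 64-byte block size follows the example value Fossum gives (c5 L59).

```python
BLOCK_SIZE = 64  # bytes; example block size given in Fossum (c5 L59)


class MainMemory:
    """Byte-addressable main memory (hypothetical model)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.block_reads = 0  # accesses made via the cache-access path
        self.byte_reads = 0   # accesses made via the cache-bypass path

    def read_block(self, block_number):
        # Block oriented access: always returns one whole fixed-size block.
        self.block_reads += 1
        start = block_number * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def read_byte(self, address):
        # Address oriented access: returns only the individually addressed item.
        self.byte_reads += 1
        return self.data[address]


class Cache:
    """Cache refilled one fixed-size block at a time (no eviction, for brevity)."""

    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}

    def read(self, address):
        block_number, offset = divmod(address, BLOCK_SIZE)
        if block_number not in self.blocks:  # cache miss: refill from main memory
            self.blocks[block_number] = self.memory.read_block(block_number)
        return self.blocks[block_number][offset]


memory = MainMemory(1024)
memory.data[130] = 7
cache = Cache(memory)

via_cache = cache.read(130)         # processor side: block oriented cache-access path
via_bypass = memory.read_byte(130)  # functional-unit side: address oriented bypass path
```

In this model a read through the cache pulls the entire 64-byte block containing the requested address, whereas a read through the bypass touches only the single addressed location and leaves the cache contents unchanged.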
As to claim 19, Fossum in view of Yoshizawa teaches The method of claim 18, wherein accessing the first portion of data from the main memory via the block oriented cache-access path includes: determining whether the first portion of data is present in cache memory [Yoshizawa – figure 7, ST5, ST14, and ST10, “cache hit judgement;” … a cache comparator for judging whether a copy of the data specified by a combination of a plurality of specific data addresses is stored in the cache memory, that is, whether a cache hit or a cache miss has occurred; and a control block for controlling the main memory and the cache memory in accordance with the result of the judgement made by the cache comparator … (c2 L16-40)]; fetching a fixed-size data block including the first portion of data from the main memory to the cache memory via the block oriented cache-access path [Fossum – as shown in figure 1, where blocks of data are transferred from the main memory (23) to the cache (24); also see figure 7; … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); … The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37); … The block size, for example, is 64 bytes … (c5 L59)]; and loading the first portion of data from the cache memory to the at least one processor [Fossum – either the scalar processor (figure 1, 21; figure 7, 108) or the vector processor (figure 1, 22; figure 7, 116) may be considered as the at least one processor; and then the other processor is considered as the at least one heterogeneous functional unit].
As to claim 20, Fossum in view of Yoshizawa teaches The method of claim 18, wherein accessing the second portion of data from the main memory via the address oriented cache-bypass path includes: offloading, by the at least one processor, execution of the at least one instruction included in the second portion of data to the at least one heterogeneous functional unit [Fossum -- either the scalar processor (figure 1, 21; figure 7, 108) or the vector processor (figure 1, 22; figure 7, 116) may be considered as the at least one processor; and then the other processor is considered as the at least one heterogeneous functional unit]; and fetching the at least one instruction directly from the main memory to the at least one heterogeneous functional unit via the address oriented cache-bypass path by referencing an individual address of the at least one instruction in the main memory to individually-addressed data access the address of the at least one instruction [Yoshizawa – as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows cache bypassing mode to access main memory directly ST2, ST3, ST5, ST6, ST4, ST7, and ST8; Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. 
There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58)].
	As to claim 1, it recites substantially the same limitations as in claim 18, and is rejected for the same reasons set forth in the analysis of claim 18. Refer to “As to claim 18” presented earlier in this Office Action for details.
As to claim 2, Fossum in view of Yoshizawa teaches The method of claim 1, wherein the at least one processor is configured to execute instructions of the first instruction set, and the at least one heterogeneous functional unit is configured to execute instructions of the second instruction set that is different from the first instruction set [Fossum – as shown in figures 1 and 7, where the vector processor executes vector-oriented instructions, and the scalar processor executes scalar-oriented instructions; scalar-oriented instructions: Program instructions are fed from the cache unit 106 to an instruction processing unit 107 which parses the instructions and sends instruction data to a scalar processing unit 108.  Specifically, the scalar processing unit includes a micro-sequencer and issue logic 109 which executes prestored microcode 110 to interpret and execute the parsed instructions from the instruction processing unit 107.  These instructions include scalar instructions which the micro-sequencer and issue logic executes by operating a register file and an arithmetic logic unit 111.  These scalar instructions include, for example, an instruction to fetch scalar data from the cache unit 106 and load the data in the register file 111 (c11 L35-47); vector-oriented instructions: Upon receipt of the vector load command, the vector processor sends requests to the cache for the individual vector elements.  The requests for the individual vector elements are processed independently of the vector prefetch requests … (c3 L28-42); As shown in FIG. 1, the vector processor 22 also accesses the cache 24 to obtain vector elements.  The vector processor 22, for example, receives a vector load instruction which commands the vector processor to send vector element addresses to the cache 24 … (c4 L55-67)].
	As to claim 3, it recites substantially the same limitations as in claim 19, and is rejected for the same reasons set forth in the analysis of claim 19. Refer to “As to claim 19” presented earlier in this Office Action for details.
	As to claim 4, Fossum in view of Yoshizawa teaches The method of claim 1, wherein the fixed-size data block loaded to the cache memory from the main memory includes instructions in addition to the at least one instruction of the first instruction set, and wherein the at least one instruction of the second instruction set is fetched directly from the main memory to the at least one heterogeneous functional unit is fetched individually without other instructions [Fossum – as shown in figures 1 and 7, where the vector processor executes vector-oriented instructions, and the scalar processor executes scalar-oriented instructions; scalar-oriented instructions: Program instructions are fed from the cache unit 106 to an instruction processing unit 107 which parses the instructions and sends instruction data to a scalar processing unit 108.  Specifically, the scalar processing unit includes a micro-sequencer and issue logic 109 which executes prestored microcode 110 to interpret and execute the parsed instructions from the instruction processing unit 107.  These instructions include scalar instructions which the micro-sequencer and issue logic executes by operating a register file and an arithmetic logic unit 111.  These scalar instructions include, for example, an instruction to fetch scalar data from the cache unit 106 and load the data in the register file 111 (c11 L35-47); vector-oriented instructions: Upon receipt of the vector load command, the vector processor sends requests to the cache for the individual vector elements.  The requests for the individual vector elements are processed independently of the vector prefetch requests … (c3 L28-42); As shown in FIG. 1, the vector processor 22 also accesses the cache 24 to obtain vector elements.  
The vector processor 22, for example, receives a vector load instruction which commands the vector processor to send vector element addresses to the cache 24 … (c4 L55-67); Yoshizawa – as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows cache bypassing mode to access main memory directly ST2, ST3, ST5, ST6, ST4, ST7, and ST8; Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58)].
	As to claim 5, Fossum in view of Yoshizawa teaches The method of claim 1, further comprising determining whether an instruction to be executed is an instruction of the first instruction set or an instruction of the second instruction set [Fossum – as shown in figures 1 and 7, where the vector processor executes vector-oriented instructions, and the scalar processor executes scalar-oriented instructions; scalar-oriented instructions: Program instructions are fed from the cache unit 106 to an instruction processing unit 107 which parses the instructions and sends instruction data to a scalar processing unit 108.  Specifically, the scalar processing unit includes a micro-sequencer and issue logic 109 which executes prestored microcode 110 to interpret and execute the parsed instructions from the instruction processing unit 107.  These instructions include scalar instructions which the micro-sequencer and issue logic executes by operating a register file and an arithmetic logic unit 111.  These scalar instructions include, for example, an instruction to fetch scalar data from the cache unit 106 and load the data in the register file 111 (c11 L35-47); vector-oriented instructions: Upon receipt of the vector load command, the vector processor sends requests to the cache for the individual vector elements.  The requests for the individual vector elements are processed independently of the vector prefetch requests … (c3 L28-42); As shown in FIG. 1, the vector processor 22 also accesses the cache 24 to obtain vector elements.  The vector processor 22, for example, receives a vector load instruction which commands the vector processor to send vector element addresses to the cache 24 … (c4 L55-67)].
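For illustration only, the determination recited in claim 5 can be sketched as follows, under the mapping applied above in which scalar-oriented instructions form the first instruction set and vector-oriented instructions form the second. The opcode names and the functions classify and dispatch are hypothetical and do not appear in Fossum, Yoshizawa, or the application.

```python
# Hypothetical opcode names, chosen only for illustration.
SCALAR_OPCODES = {"ADD", "LOAD", "STORE"}     # first instruction set
VECTOR_OPCODES = {"VLOAD", "VADD", "VSTORE"}  # second instruction set


def classify(opcode):
    """Return which instruction set an opcode belongs to."""
    if opcode in SCALAR_OPCODES:
        return "first"
    if opcode in VECTOR_OPCODES:
        return "second"
    raise ValueError(f"unknown opcode: {opcode}")


def dispatch(program):
    """Split a program into instructions kept by the processor and instructions offloaded."""
    executed_by_processor, offloaded = [], []
    for opcode in program:
        if classify(opcode) == "first":
            executed_by_processor.append(opcode)
        else:
            offloaded.append(opcode)
    return executed_by_processor, offloaded


local, remote = dispatch(["LOAD", "VLOAD", "ADD", "VADD"])
```

Here local holds the instructions the processor would execute itself, and remote holds those that would be handed to the heterogeneous functional unit.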
	As to claim 6, it recites substantially the same limitations as in claim 18, and is rejected for the same reasons set forth in the analysis of claim 18. Refer to “As to claim 18” presented earlier in this Office Action for details.
	As to claim 8, it recites substantially the same limitations as in claim 18, and is rejected for the same reasons set forth in the analysis of claim 18. Refer to “As to claim 18” presented earlier in this Office Action for details.
	As to claim 9, it recites substantially the same limitations as in claim 2, and is rejected for the same reasons set forth in the analysis of claim 2. Refer to “As to claim 2” presented earlier in this Office Action for details.
	As to claim 10, Fossum in view of Yoshizawa teaches The system of claim 8, wherein the block oriented cache-access path couples the main memory to a cache memory, and wherein the configuration of the at least one processor to fetch the at least one instruction of the first instruction set via the block oriented cache-access path includes configuration of the at least one processor to: cause at least one fixed-size data block to be fetched from the main memory and to be loaded to the cache memory when the at least one instruction of the first instruction set is absent from the cache memory, the at least one fixed-size data block including at least one instruction of the first instruction set; and fetch the at least one instruction of the first instruction set from the cache memory [Yoshizawa – as shown in figure 7, where a cache hit/miss is determined, and if it’s a cache miss, data/instruction is fetched from the main memory into the cache memory; figure 7, ST5, ST14, and ST10, “cache hit judgement;” … a cache comparator for judging whether a copy of the data specified by a combination of a plurality of specific data addresses is stored in the cache memory, that is, whether a cache hit or a cache miss has occurred; and a control block for controlling the main memory and the cache memory in accordance with the result of the judgement made by the cache comparator … (c2 L16-40); Fossum -- as shown in figure 1, where blocks of data are transferred from the main memory (23) to the cache (24); also see figure 7; … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); … The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37); … The block size, for example, is 64 bytes … (c5 L59)].
	As to claim 11, Fossum in view of Yoshizawa teaches The system of claim 8, wherein the fixed-size data block access to the main memory returns data in addition to data referenced by a cache memory access by the at least one processor, and wherein the individually-addressed data access returns only data referenced by a physical address access by the at least one heterogeneous functional unit [Fossum -- as shown in figure 1, where blocks of data are transferred from the main memory (23) to the cache (24); also see figure 7; … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); … The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37); Yoshizawa – as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows cache bypassing mode to access main memory directly ST2, ST3, ST5, ST6, ST4, ST7, and ST8; Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58)].
	As to claim 12, it recites substantially the same limitations as in claim 5, and is rejected for the same reasons set forth in the analysis of claim 5. Refer to “As to claim 5” presented earlier in this Office Action for details.
	As to claim 13, it recites substantially the same limitations as in claim 20, and is rejected for the same reasons set forth in the analysis of claim 20. Refer to “As to claim 20” presented earlier in this Office Action for details.
	As to claim 15, Fossum in view of Yoshizawa teaches The system of claim 8, further comprising: a cache interrogation path coupling a cache memory to the at least one heterogeneous functional unit, wherein the cache interrogation path is configured to provide information regarding encached data to the at least one heterogeneous functional unit in response to an interrogation by the at least one heterogeneous functional unit regarding referenced data to be accessed by the at least one heterogeneous functional unit, the referenced data including the at least one instruction of the second instruction set [Yoshizawa – as shown in figure 7, where a cache hit/miss is determined, and if it’s a cache miss, data/instruction is fetched from the main memory into the cache memory; figure 7, ST5, ST14, and ST10, “cache hit judgement;” … a cache comparator for judging whether a copy of the data specified by a combination of a plurality of specific data addresses is stored in the cache memory, that is, whether a cache hit or a cache miss has occurred; and a control block for controlling the main memory and the cache memory in accordance with the result of the judgement made by the cache comparator … (c2 L16-40); Fossum -- … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); In accordance with an additional feature of the present invention, the cache unit 106 is provided with bypasses which speed the transfer of data from the main memory unit 101 to the vector processing unit 116 when desired vector element data are not found in the data store 114, and consequently have to be fetched from the main memory unit 101 … These gates include a gate 131 for transmitting memory commands from the cache control logic 123 to the main memory unit 101, and a gate 132 for passing physical addresses from the virtual-to-physical address translation buffer 112 to the main memory unit 101 (c13 L8-22)].
	As to claim 16, Fossum in view of Yoshizawa teaches The system of claim 15, wherein the cache interrogation path is configured to initiate loading a fixed-size cache block containing the referenced data to the main memory for individually-addressed data access of the referenced data from the main memory by the at least one heterogeneous functional unit using the address oriented cache-bypass path [Yoshizawa – as shown in figure 1, there is a direct access path between the processor (30) and the main memory (31) via address bus (34) and data bus (35), completely bypassing the cache memory (32); figure 4 also shows a direct path between the processor (30) and the main memory controller (33); figure 7 shows cache bypassing mode to access main memory directly ST2, ST3, ST5, ST6, ST4, ST7, and ST8; Next, as shown in step ST2 in the main process flow of FIG. 7, the access mode of the access request issued by the processor 30 is judged. There are two access modes in which the processor 30 issues an access request: a bypass mode in which the cache memory 40 is bypassed and an access is made directly to the main memory 31 … (c5 L66 to c6 L12); … 34 is a buffer for switching between addresses (one from the processor 30 and the other from the two-dimensional cache-memory system 32) for supply to the main memory 31 in response to a control signal (SW1) from the two-dimensional cache-memory system 32; and 35 is a buffer for switching data (one from the main memory 31 and the other from the two-dimensional cache-memory system 32) for supply to the processor 30 in response to a control signal (SW0) from the two-dimensional cache-memory system 32 (c3 L55-63); … a 21-bit 2-D plane address register 55 for latching an address used to access the main memory 31 … (c4 L56-58); Fossum -- as shown in figure 1, where blocks of data are transferred from the main memory (23) to the cache (24); also see figure 7; … In response to a prefetch request, the cache is checked to determine whether it includes the required block, and if the cache does not include the required block, a refill request is sent to the main memory … (c2 L47-65); … The prefetch request generator 25 issues the block prefetch requests at a high rate to ensure that the required memory blocks are transferred from the main memory 23 to the cache 24 as soon as possible … (c5 L33-37); … The block size, for example, is 64 bytes … (c5 L59)].
	As to claim 17, Fossum in view of Yoshizawa teaches The system of claim 15, wherein the cache interrogation path is configured to invalidate the referenced data in the cache memory in association with individually-addressed data access of the referenced data from the main memory by the at least one heterogeneous functional unit using the address oriented cache-bypass path [Yoshizawa -- On the other hand, if it is determined that the hit flag value is "1", then as shown in step ST8 in the main process flow of FIG. 7, a cache invalidation operation is performed to maintain data matching, and then, a memory write request is issued to the main memory 31, after which the process is terminated. In the cache invalidation process, as shown in detail in the process flow of FIG. 10 … (c6 L60 to c7 L22)].
10.	Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Fossum in view of Yoshizawa, and further in view of Sheaffer (US Patent Application Publication 2004/0088524).
	As to claim 7, Fossum in view of Yoshizawa does not teach that the main memory comprises a scatter/gather module.
	However, scatter/gather memory is well known in the art, has been widely used in commercial products, and its advantages are well documented.
	For example, Sheaffer teaches using a scatter/gather memory in a computer memory system to facilitate single-instruction-multiple-data (SIMD) operations [scatter/gather memory (figure 2, 202; ¶ 0016-0017, 0026)].
	Therefore, it would have been obvious for one of ordinary skill in the art at the time of Applicants’ invention to have used a scatter/gather memory, as demonstrated by Sheaffer, and to incorporate it into the existing apparatus and scheme disclosed by Fossum in view of Yoshizawa, in order to facilitate single-instruction-multiple-data (SIMD) operations.
As to claim 14, it recites substantially the same limitations as in claim 7, and is rejected for the same reasons set forth in the analysis of claim 7. Refer to “As to claim 7” presented earlier in this Office Action for details.

Conclusion
11.	Claims 1-20 are rejected as explained above. 
12.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHENG JEN TSAI whose telephone number is 571-272-4244.  The examiner can normally be reached on Monday-Friday, 9-6.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached on 571-272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/SHENG JEN TSAI/ Primary Examiner, Art Unit 2136
August 19, 2022