DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on June 16, 2022, has been entered.
 
Claims 1-2, 4-8, 10-14, 16-20, and 22 are pending in this Office action and are presented for examination. Claims 1, 4-7, 11-14, 16-19, and 22 are newly amended, and claims 3 and 15 are cancelled, by the response received June 16, 2022.

Drawings
The drawings are objected to because: 
In Figure 3A as amended, the top row in Memory 120 reads “0x1001 … 0x1008”. However, in the original Figure 3A, the top row in Memory 120 read “0x1000 … 0x1007”. Examiner presumes the change was made inadvertently; however, if the change was intentional, Examiner notes that this amendment would constitute new matter and would conflict with other portions of the disclosure.
Figure 5B as amended discloses an “Instruction translation lookaside buffer 940”. However, the original Figure 5B disclosed “Instruction translation lookaside buffer 936”. Examiner presumes the change was made inadvertently; however, if the change was intentional, Examiner notes that reference character 940 now refers to two different blocks, and reference character 936, which is used in the specification, no longer appears in the drawings.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claims 2, 4-8, 10-14, 16-20, and 22 are objected to because of the following informalities.  Appropriate correction is required.
In claim 2, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.

In claim 4, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.

In claim 5, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.
In claim 5, line 2, “the method” should be “the instruction executing method” for antecedent basis clarity.

In claim 6, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.
Claim 7 is objected to for failing to alleviate the objection of claim 6 above.

In claim 7, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.

In claim 8, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.

In claim 10, line 1, “The method” should be “The instruction executing method” for antecedent basis clarity.

In claim 11, lines 15-16, “each second part data” should be “each second part of data”.
Claims 12-14 and 16-20 are objected to for failing to alleviate the objection of claim 11 above.

In claim 22, lines 17-18, “each second part data” should be “each second part of data”.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2, 5, 10, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 2 recites the limitation “the buffered data” in line 4. However, there is insufficient antecedent basis for this limitation in the claims. Note that “at least one piece of buffered data” is instead recited in claim 2, lines 1-2. 

Claim 5 recites the limitation “the acquired data” in line 3. However, it is unclear whether the antecedent basis for this limitation is a) an acquired first part of data; b) an acquired plurality of second parts of data; c) both a) and b); or d) something else.

Claim 10 recites the limitation “the first part of data comprises a plurality of first parts of data” in lines 1-2. However, the metes and bounds of this limitation are indefinite. For example, it is unclear as to how an entity can comprise multiple instances of itself, as this would seem to involve some kind of indefinite recursive relationship.
Claim 10 recites the limitation “the first part of data” in line 5. However, it is indefinite as to whether the antecedent basis for this limitation is “the first part of data” as recited in claim 10, line 1, or a particular first part of data of “each first part of data” as recited in claim 10, line 4.
Claim 10 recites the limitation “the buffered data” in line 5. However, there is insufficient antecedent basis for this limitation in the claims. Note that “at least one piece of buffered data” is instead recited in claim 10, line 2. 
Claim 10 recites the limitation “the first part of data” in lines 5-6. However, it is indefinite as to whether the antecedent basis for this limitation is “the first part of data” as recited in claim 10, line 1, or a particular first part of data of “each first part of data” as recited in claim 10, line 4.

Claim 17 recites the limitation “the acquired data” in line 3. However, it is unclear whether the antecedent basis for this limitation is a) an acquired first part of data; b) an acquired plurality of second parts of data; c) both a) and b); or d) something else.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-8, 10-14, 16-20, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Sheaffer (US 20120047311 A1) in view of Batley (US 20170109165 A1).
Consider claim 1, Sheaffer discloses an instruction executing method, comprising: receiving an address-unaligned data load instruction, the address-unaligned data load instruction instructing to read target data from a memory ([0038], lines 5-8, instruction 510 that indicates a load operation to access arrays that are contiguous in the virtual address space but are misaligned with the boundary of a cache memory line and/or a page memory); acquiring a first part of data of the target data from a buffer, the first part of data being data of a first plurality of bits in the target data ([0025], lines 5-7, the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320); acquiring a second part of data of the target data from the memory ([0025], lines 3-5, the incoming data of a particular cache memory line 310 from the L1 data cache memory 250), comprising: for the second part of data of the target data, accessing the memory based on an address of the second part of data and a bit width of the memory to obtain the second part of data, wherein the second part of data is located in a bit width of the memory (FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access); and merging the first part of data and the second part of data to obtain the target data ([0025], lines 1-9, when the cache memory line split access logic 235 receives a non-aligned cache memory access request, the merge logic 330 combines or merges the incoming data of a particular cache memory line 310 from the L1 data cache memory 250 with the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320. The output 340 of the combination by the merge logic 330 fulfills the non-aligned cache memory access request).
However, Sheaffer does not disclose that the aforementioned acquiring from the memory entails a plurality of second parts, with the aforementioned accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of the memory.
On the other hand, Batley discloses that acquiring from memory entails a plurality of second parts, with accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of a memory ([0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).
Batley’s teaching of loading a plurality of second parts of data increases system performance relative to loading only a single second part.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Batley with the invention of Sheaffer in order to increase system performance. Additionally, this modification merely entails the use of a known technique (Batley’s teaching cited above) to improve similar devices (methods, or products) (the invention of Sheaffer, which is also directed to address-unaligned data load instructions) in the same way (Batley’s teaching cited above, when applied to the invention of Sheaffer, results in Sheaffer being improved in the same way by likewise supporting loading a plurality of second parts of data), which is an example of a rationale that may support a conclusion of obviousness, as per MPEP 2143. Note that Batley’s teaching as cited above, when applied to the invention of Sheaffer, results in the overall claim language of ‘acquiring a “plurality of” second “parts” of data of the target data from the memory, comprising: for “each” second part of data of the target data, accessing the memory based on an address of “each” second part of data and a bit width of the memory to obtain “each” second part of data, wherein “each” second part of data is located in a “separate” bit width of the memory; and merging the first part of data and “each” second part of data to obtain the target data.’
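By way of illustration only, the load path that results from the combination discussed above can be sketched as follows. The sketch is not taken from Sheaffer or Batley. The names `unaligned_load`, `read_aligned`, `WIDTH`, `memory`, and `buffer`, the 8-byte width, and the byte-addressed software model are all assumptions made for the example.

```python
WIDTH = 8  # assumed memory bit width, expressed in bytes (hypothetical value)


def read_aligned(memory: bytes, base: int) -> bytes:
    """Model one memory access: return the WIDTH-byte unit starting at an aligned base."""
    assert base % WIDTH == 0
    return memory[base:base + WIDTH]


def unaligned_load(memory: bytes, buffer: dict, addr: int, length: int):
    """Return (target_data, number_of_memory_accesses).

    `buffer` maps an aligned base address to a previously fetched unit.
    A part found in the buffer is taken from the buffer; every other part
    is fetched with one aligned access per WIDTH-byte unit, and all parts
    are merged in address order.
    """
    end = addr + length
    base = addr - addr % WIDTH          # aligned unit holding the first byte
    parts = []
    accesses = 0
    while base < end:
        unit = buffer.get(base)
        if unit is None:                # not buffered: access the memory
            unit = read_aligned(memory, base)
            accesses += 1
        lo = max(addr, base) - base     # slice of this unit that belongs
        hi = min(end, base + WIDTH) - base  # to the target data
        parts.append(unit[lo:hi])
        base += WIDTH
    return b"".join(parts), accesses


# Hypothetical usage: 16 bytes starting at address 5 span three 8-byte units;
# the first unit is already buffered, so two memory accesses remain.
memory = bytes(range(32))
buffer = {0: memory[0:8]}
data, accesses = unaligned_load(memory, buffer, 5, 16)
```

In this model the unit at base 0 plays the role of the first part of data, and the units at bases 8 and 16 play the role of the plurality of second parts, each located in a separate width-aligned unit of the memory.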

Consider claim 2, the overall combination entails the method of claim 1 (see above), wherein the buffer stores at least one piece of buffered data, and acquiring the first part of data of the target data from the buffer comprises: searching for, based on an address interval of the first part of data and an address interval of the buffered data, buffered data that comprises the first part of data of the target data (Sheaffer, [0024], lines 6-9, in one embodiment of the invention, the respective addresses stored in the tag array 325 are the addresses of the cache memory lines that are stored in the stored data array 320).
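By way of illustration only, a search based on address intervals can be modeled as an interval containment check. The sketch below is an assumption-laden software stand-in and is not Sheaffer's tag array 325; the name `find_buffered` and the representation of each entry as a `(start_address, data)` pair are hypothetical.

```python
def find_buffered(entries, part_start, part_end):
    """Return the first (start, data) entry whose address interval
    [start, start + len(data)) contains [part_start, part_end), else None."""
    for start, data in entries:
        end = start + len(data)
        if start <= part_start and part_end <= end:
            return start, data
    return None


# Hypothetical usage: two buffered 64-byte lines at addresses 64 and 128.
entries = [(64, bytes(64)), (128, bytes(range(64)))]
hit = find_buffered(entries, 176, 192)   # lies inside the line at 128
miss = find_buffered(entries, 190, 200)  # runs past the end of that line
```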

Consider claim 4, the overall combination entails the method of claim 1 (see above), wherein accessing the memory based on an address of each second part of data and the bit width of the memory comprises: specifying, based on the bit width of the memory, a data length of data to be acquired; and specifying, based on the address of each second part of data, an address of the data in the memory to be acquired, the address in the memory being aligned to the data length (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).

Consider claim 5, the overall combination entails the method of claim 1 (see above), wherein after accessing the memory, the method further comprises: storing at least a part of the acquired data into the buffer as buffered data (Sheaffer, [0032], lines 1-8, after the merge logic 330 receives the stored data of the cache memory line n-1 402, the cache memory line split access logic 235 replaces the stored data of the cache memory line n-1 402 in the stored data array 320 with the data of the cache memory line n 404 in one embodiment of the invention. This facilitates contiguous cache memory line split accesses to achieve full throughput operation within a single machine or clock cycle).
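By way of illustration only, the effect of storing the newly acquired line into the buffer can be sketched with a single-entry model. The class `SplitAccessBuffer`, its fields, and the use of line indices are assumptions made for the example; the 64-byte line size follows the example width cited above, and the sketch does not reproduce Sheaffer's logic 235.

```python
LINE = 64  # assumed cache memory line size in bytes (the cited example width)


class SplitAccessBuffer:
    """Single-entry stand-in for a stored data array and its tag."""

    def __init__(self):
        self.tag = None    # index of the stored line, if any
        self.line = None   # contents of the stored line
        self.reads = 0     # number of line reads issued to memory

    def _read_line(self, memory, index):
        self.reads += 1
        return memory[index * LINE:(index + 1) * LINE]

    def split_load(self, memory, index, offset):
        """Load LINE bytes starting `offset` bytes into line `index`,
        where 0 < offset < LINE, so the access spans lines index and index + 1."""
        if self.tag == index:
            prev = self.line                      # first part from the buffer
        else:
            prev = self._read_line(memory, index)  # fall back to memory
        nxt = self._read_line(memory, index + 1)
        # Store the newly acquired line so that the next contiguous
        # split access can take its first part from the buffer.
        self.tag, self.line = index + 1, nxt
        return prev[offset:] + nxt[:offset]


# Hypothetical usage: two contiguous split accesses, each shifted by 16 bytes.
memory = bytes(i % 256 for i in range(256))
buf = SplitAccessBuffer()
first = buf.split_load(memory, 0, 16)   # needs two line reads
second = buf.split_load(memory, 1, 16)  # needs only one further line read
```

In this model the first access takes 48 bytes from one line and 16 bytes from the next, matching the split in the cited example, and the second access reuses the line stored by the first.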

Consider claim 6, the overall combination entails the method of claim 1 (see above), further comprising: determining, based on an address interval of the target data and the bit width of the memory, at least one of the first part of data or the plurality of second parts of data of the target data (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).

Consider claim 7, the overall combination entails the method of claim 6 (see above), wherein determining, based on the address interval of the target data and the bit width of the memory, at least one of the first part of data or the plurality of second parts of data of the target data comprises: determining, based on the address interval of the target data and the bit width of the memory, a bit width boundary spanned by the target data; and dividing, based on the spanned bit width boundary, the target data into the first part of data and the plurality of second parts of data (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).
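By way of illustration only, determining the spanned boundaries and dividing the target data accordingly can be sketched as below. The function name `split_at_boundaries` and the byte-addressed model are assumptions, and placing cache memory line n-1 at base address 0 is likewise an assumption made so that the 48-byte and 16-byte split of the cited example can be reproduced.

```python
def split_at_boundaries(addr, length, width):
    """Divide the address interval [addr, addr + length) at every multiple
    of `width` and return the resulting parts as (start, size) pairs."""
    end = addr + length
    parts = []
    start = addr
    while start < end:
        boundary = (start // width + 1) * width  # next boundary above start
        stop = min(boundary, end)
        parts.append((start, stop - start))
        start = stop
    return parts


# A 64-byte access starting 16 bytes into a 64-byte line spans one boundary.
two_parts = split_at_boundaries(16, 64, 64)   # [(16, 48), (64, 16)]
# With a narrower width the same kind of access spans several boundaries.
three_parts = split_at_boundaries(5, 16, 8)   # [(5, 3), (8, 8), (16, 5)]
```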

Consider claim 8, the overall combination entails the method of claim 1 (see above), wherein an address of the target data in the address-unaligned data load instruction is not equal to an integer multiple of a data length of the target data (Sheaffer, [0004], lines 3-11, a cache memory line split access of 4 bytes 130 occurs when the access is shifted 4 bytes from the aligned cache memory access 120, i.e., the required data is the data A2 to A16 from the 64-byte cache memory line n 110 and the data Z1 from the 64-byte cache memory line n+1 115. The cache memory line split access of 8 bytes 140 and the cache memory line split access of 12 bytes 150 illustrate two other examples of non-aligned cache memory accesses).
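By way of illustration only, the recited condition reduces to a remainder check. The function name `is_address_unaligned` is hypothetical, and taking the base address of cache memory line n as 0 is an assumption made for the example.

```python
def is_address_unaligned(addr, length):
    """True when addr is not an integer multiple of the data length."""
    return addr % length != 0


# A 64-byte access shifted 4 bytes from an aligned access is unaligned,
# while a 64-byte access starting at a multiple of 64 is aligned.
shifted = is_address_unaligned(4, 64)
aligned = is_address_unaligned(64, 64)
```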

Consider claim 10, the overall combination entails the method of claim 1 (see above), wherein the first part of data comprises a plurality of first parts of data, the buffer stores at least one piece of buffered data, and acquiring the first part of data of the target data from the buffer comprises: for each first part of data of the target data, searching for, based on an address interval of the first part of data and an address interval of the buffered data, buffered data that comprises the first part of data (Sheaffer, [0024], lines 6-9, in one embodiment of the invention, the respective addresses stored in the tag array 325 are the addresses of the cache memory lines that are stored in the stored data array 320; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).

Consider claim 11, Sheaffer discloses a processing apparatus communicatively coupled to a memory ([0025], lines 4-5, L1 data cache memory 250), the processing apparatus comprising: a buffer configured to store buffered data ([0024], line 6, stored data array 320); an instruction executing circuit ([0021], line 1, execution unit 230) configured to execute an address-unaligned data load instruction, wherein the address-unaligned data load instruction is used to read target data from the memory ([0038], lines 5-8, instruction 510 that indicates a load operation to access arrays that are contiguous in the virtual address space but are misaligned with the boundary of a cache memory line and/or a page memory) and the instruction executing circuit is coupled to the buffer and the memory (Figure 2); a data acquisition circuit configured to: acquire a first part of data of the target data from the buffer, the first part of data being data of a first plurality of bits in the target data ([0025], lines 5-7, the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320); acquire a second part of data of the target data ([0025], lines 3-5, the incoming data of a particular cache memory line 310 from the L1 data cache memory 250); and access the memory for the second part of data of the target data based on an address of the second part of data and a bit width of the memory to acquire the second part of data, wherein the second part of data is located in a bit width of the memory (FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access); and a data processing circuit configured to merge the first part of data and the second part of data to obtain the target data ([0025], lines 1-9, when the cache memory line split access logic 235 receives a non-aligned cache memory access request, the merge logic 330 combines or merges the incoming data of a particular cache memory line 310 from the L1 data cache memory 250 with the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320. The output 340 of the combination by the merge logic 330 fulfills the non-aligned cache memory access request).
However, Sheaffer does not disclose that the aforementioned acquiring (in claim 11, line 11) entails a plurality of second parts, with the aforementioned accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of the memory.
On the other hand, Batley discloses that acquiring entails a plurality of second parts, with accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of a memory ([0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).
Batley’s teaching of loading a plurality of second parts of data increases system performance relative to loading only a single second part.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Batley with the invention of Sheaffer in order to increase system performance.
Additionally, this modification merely entails the use of a known technique (Batley’s teaching cited above) to improve similar devices (methods, or products) (the invention of Sheaffer, which is also directed to address-unaligned data load instructions) in the same way (Batley’s teaching cited above, when applied to the invention of Sheaffer, results in Sheaffer being improved in the same way by likewise supporting loading a plurality of second parts of data), which is an example of a rationale that may support a conclusion of obviousness, as per MPEP 2143. Note that Batley’s teaching as cited above, when applied to the invention of Sheaffer, results in the overall claim language of ‘acquire a “plurality of” second “parts” of data of the target data; and access the memory for “each” second part of data of the target data based on an address of “each” second part of data and a bit width of the memory to acquire “each” second part of data, wherein “each” second part of data is located in a “separate” bit width of the memory; and a data processing circuit configured to merge the first part of data and “each” second part of data to obtain the target data.’

Consider claim 12, the overall combination entails the processing apparatus of claim 11 (see above), wherein the buffer stores at least one piece of buffered data, and the data acquisition circuit is configured to: search for, based on an address interval of the first part of data and an address interval of the buffered data, buffered data that comprises the first part of data of the target data (Sheaffer, [0024], lines 6-9, in one embodiment of the invention, the respective addresses stored in the tag array 325 are the addresses of the cache memory lines that are stored in the stored data array 320).

Consider claim 13, the overall combination entails the processing apparatus of claim 11 (see above), wherein the data acquisition circuit is configured to: acquire the first part of data from the memory in response to a determination that the first part of data has not been found in the buffer (Sheaffer, [0050], lines 4-7, If no match is found in step 750, the flow ends. If a match is found in step 750, the cache memory cache memory line split access logic 235 merges the data retrieved; [0024], lines 1-4, the stored data array 320 holds or stores one or more cache memory lines of the L1 data cache memory 250 that are previously accessed through a misaligned access of the L1 data cache memory 250; Figure 1; [0027], lines 4-5; in other words, when no match is found in the buffer, for example when there were no relevant previous accesses, no merging using the stored data array 320 occurs, and the data is acquired from the cache memory instead in a manner which uses double the bandwidth of the L1 data cache memory).

Consider claim 14, the overall combination entails the processing apparatus of claim 13 (see above), wherein the data acquisition circuit is configured to: access the memory based on an address of the first part of data and the bit width of the memory to acquire the first part of data (Sheaffer, [0050], lines 4-7, If no match is found in step 750, the flow ends. If a match is found in step 750, the cache memory cache memory line split access logic 235 merges the data retrieved; [0024], lines 1-4, the stored data array 320 holds or stores one or more cache memory lines of the L1 data cache memory 250 that are previously accessed through a misaligned access of the L1 data cache memory 250; Figure 1; [0027], lines 4-5; in other words, when no match is found in the buffer, for example when there were no relevant previous accesses, no merging using the stored data array 320 occurs, and the data is acquired from the cache memory instead in a manner which uses double the bandwidth of the L1 data cache memory).

Consider claim 16, the overall combination entails the processing apparatus of claim 11 (see above), wherein the data acquisition circuit is configured to: specify, based on the bit width of the memory, a data length of data to be acquired; and specify, based on the address of each second part of data, an address of the data in the memory to be acquired, the address in the memory being aligned to the data length (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).

Consider claim 17, the overall combination entails the processing apparatus of claim 11 (see above), wherein the data acquisition circuit is configured to: after accessing the memory, store at least a part of the acquired data into the buffer as buffered data (Sheaffer, [0032], lines 1-8, after the merge logic 330 receives the stored data of the cache memory line n-1 402, the cache memory line split access logic 235 replaces the stored data of the cache memory line n-1 402 in the stored data array 320 with the data of the cache memory line n 404 in one embodiment of the invention. This facilitates contiguous cache memory line split accesses to achieve full throughput operation within a single machine or clock cycle).

Consider claim 18, the overall combination entails the processing apparatus of claim 11 (see above), wherein the data acquisition circuit is configured to: determine, based on an address interval of the target data and the bit width of the memory, at least one of the first part of data or the plurality of second parts of data of the target data (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).

Consider claim 19, the overall combination entails the processing apparatus of claim 18 (see above), wherein the data acquisition circuit is configured to: determine, based on the address interval of the target data and the bit width of the memory, a bit width boundary spanned by the target data; and divide, based on the spanned bit width boundary, the target data into the first part of data and the plurality of second parts of data (Sheaffer, FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access; Batley, [0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).
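For illustration only, the division of target data at the bit width boundaries it spans (claims 18 and 19) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of either reference; the function name split_at_boundaries is hypothetical, and addresses and widths are expressed in bytes.

```python
def split_at_boundaries(start: int, length: int, width_bytes: int) -> list[tuple[int, int]]:
    """Divide the interval [start, start + length) into (address, size) parts.

    Each part lies entirely within one memory word of width_bytes bytes.
    """
    parts = []
    addr, end = start, start + length
    while addr < end:
        # First word boundary strictly above the current address.
        next_boundary = (addr // width_bytes + 1) * width_bytes
        stop = min(next_boundary, end)
        parts.append((addr, stop - addr))
        addr = stop
    return parts
```

With a 64-byte width, a 64-byte request starting 16 bytes into a line yields a 48-byte part and a 16-byte part, which matches the split described in Sheaffer [0030]. In the terms of the claims, the first element of the result would play the role of the first part of data and the remaining elements the second parts; that mapping is an assumption of this sketch.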

Consider claim 20, the overall combination entails the processing apparatus of claim 11 (see above), wherein an address of the target data in the address-unaligned data load instruction is not equal to an integer multiple of a data length of the target data (Sheaffer, [0004], lines 3-11, a cache memory line split access of 4 bytes 130 occurs when the access is shifted 4 bytes from the aligned cache memory access 120, i.e., the required data is the data A2 to A16 from the 64-byte cache memory line n 110 and the data Z1 from the 64-byte cache memory line n+1 115. The cache memory line split access of 8 bytes 140 and the cache memory line split access of 12 bytes 150 illustrate two other examples of non-aligned cache memory accesses).
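For illustration only, the condition of claim 20 and the split accesses of Sheaffer [0004] can be sketched as follows. This is a minimal Python sketch; the function names are hypothetical, and the 64-byte line size is the example value given in Sheaffer.

```python
def is_unaligned(addr: int, data_length: int) -> bool:
    """True when addr is not an integer multiple of data_length."""
    return addr % data_length != 0


def bytes_per_line(shift: int, line_bytes: int = 64) -> tuple[int, int]:
    """Bytes taken from line n and from line n+1 for a line_bytes-long
    access that starts shift bytes into line n."""
    if not 0 <= shift < line_bytes:
        raise ValueError("shift must lie within one line")
    return line_bytes - shift, shift
```

Under these assumptions, a split access of 4 bytes takes 60 bytes from line n and 4 bytes from line n+1, which corresponds to data A2 to A16 and data Z1 if each data element is assumed to be 4 bytes.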

Consider claim 22, Sheaffer discloses a System on Chip ([0053], lines 9-10, a system on a chip (SOC) system), comprising: a memory ([0025], lines 4-5, L1 data cache memory 250); and a processing apparatus communicatively coupled to the memory, the processing apparatus comprising: a buffer configured to store buffered data ([0024], line 6, stored data array 320); an instruction executing circuit ([0021], line 1, execution unit 230) configured to execute an address-unaligned data load instruction, wherein the address-unaligned data load instruction is used to read target data from the memory ([0038], lines 5-8, instruction 510 that indicates a load operation to access arrays that are contiguous in the virtual address space but are misaligned with the boundary of a cache memory line and/or a page memory) and the instruction executing circuit is coupled to the buffer and the memory (Figure 2); a data acquisition circuit configured to: acquire a first part of data of the target data from the buffer, the first part of data being data of a first plurality of bits in the target data ([0025], lines 5-7, the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320); acquire a second part of data of the target data ([0025], lines 3-5, the incoming data of a particular cache memory line 310 from the L1 data cache memory 250); and access the memory for the second part of data of the target data based on an address of the second part of data and a bit width of the memory to acquire the second part of data, wherein the second part of data is located in a bit width of the memory (FIG. 4A, [0029], lines 6-8, each cache memory line of the L1 data cache memory 250 is assumed to have a data width of 64 bytes (as an example); [0030], lines 1-7, the cache memory line split access logic 235 is assumed to receive an instruction or request that requires 48 bytes of data from the cache memory line n-1 402 and 16 bytes of data from the cache memory line n 404. The stored data array 320 is assumed to store the data of the cache memory line n-1 402 during a prior misaligned cache memory access); and a data processing circuit configured to merge the first part of data and the second part of data to obtain the target data ([0025], lines 1-9, when the cache memory line split access logic 235 receives a non-aligned cache memory access request, the merge logic 330 combines or merges the incoming data of a particular cache memory line 310 from the L1 data cache memory 250 with the stored data of the preceding cache memory line of the particular cache memory line in the stored data array 320. The output 340 of the combination by the merge logic 330 fulfills the non-aligned cache memory access request).
However, Sheaffer does not disclose that the aforementioned acquiring (in claim 11, line 11) entails a plurality of second parts, with the aforementioned accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of the memory.
On the other hand, Batley discloses acquiring entails a plurality of second parts, with accessing and merging entailing each second part, and with each second part of data being located in a separate bit width of a memory ([0038], lines 1-10, however, sometimes the apparatus may require access to an unaligned block of data which is unaligned with respect to data word boundaries of the data store. For example the unaligned block of data may start part-way through one data word. In this case, handling the load instruction can be more complex because it may require an initial load operation to load an initial portion of the unaligned block of data from one data word, and then a number of subsequent load operations for loading subsequent portions of the unaligned block of data; [0072], lines 5-10, the first load operation would load an initial portion of the unaligned block to be loaded in response to the overall instruction and place it in the stream buffer 58. This would then be followed by one or more subsequent load operations which load subsequent portions of the unaligned block).
Batley’s teaching of loading a plurality of second parts of data increases system performance relative to loading only a single second part.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Batley with the invention of Sheaffer in order to increase system performance.
Additionally, this modification merely entails the use of a known technique (Batley’s teaching cited above) to improve similar devices (methods, or products) (the invention of Sheaffer, which is also directed to address-unaligned data load instructions) in the same way (Batley’s teaching cited above, when applied to the invention of Sheaffer, results in Sheaffer being improved in the same way by likewise supporting loading a plurality of second parts of data), which is an example of a rationale that may support a conclusion of obviousness, as per MPEP 2143. Note that Batley’s teaching as cited above, when applied to the invention of Sheaffer, results in the overall claim language of ‘acquire a “plurality of” second “parts” of data of the target data; and access the memory for “each” second part of data of the target data based on an address of “each” second part of data and a bit width of the memory to acquire “each” second part of data, wherein “each” second part of data is located in a “separate” bit width of the memory; and a data processing circuit configured to merge the first part of data and “each” second part of data to obtain the target data.’
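For illustration only, the combined claim language quoted above (a first part acquired from the buffer, each second part acquired from a separate bit width of the memory, and a merge of all parts) can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not the implementation of Sheaffer, Batley, or the application: the function name load_unaligned, the 8-byte default width, and the modeling of the buffer as a dictionary keyed by aligned address are all hypothetical. As a simplification, the buffer here retains every fetched word, whereas Sheaffer [0032] describes replacing the stored line with the most recent one.

```python
def load_unaligned(memory: bytes, buffer: dict[int, bytes],
                   start: int, length: int, width: int = 8) -> bytes:
    """Assemble memory[start:start + length] one memory word at a time.

    A word already held in the buffer is taken from the buffer; any other
    word is read with one aligned memory access and then stored into the
    buffer as buffered data.
    """
    out = bytearray()
    addr, end = start, start + length
    while addr < end:
        base = addr - addr % width          # aligned address of this word
        stop = min(base + width, end)       # end of the piece within the word
        word = buffer.get(base)
        if word is None:
            word = memory[base:base + width]  # aligned access to the memory
            buffer[base] = word               # keep for later unaligned loads
        out += word[addr - base:stop - base]  # merge this piece into the result
        addr = stop
    return bytes(out)
```

Under these assumptions, a 16-byte load starting at offset 3, with the word at address 0 already buffered, takes 5 bytes from the buffer and then 8 bytes and 3 bytes from two separate memory words.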

Response to Arguments
Applicant on page 12 argues: “Applicant has changed the title as noted above to be more clearly indicative of the subject matter of the claims. Applicant respectfully requests that the objection to the title be withdrawn.”
In view of the aforementioned amendment, the previously presented objection to the title is withdrawn. 

Applicant on page 12 argues: ‘Applicant has amended the drawings to address the informalities indicated in the Office Action, pages 2-3. In regard to Figure 3A, Applicant notes that the changes were inadvertently made and has reverted the Figure to only address the informalities previously identified by the Examiner. In regard to Figure 5B, Applicant notes that the change to the reference number associated with "Instruction translation lookaside buffer" was inadvertently made and has reverted the Figure to only address the informalities previously identified by the Examiner. Applicant respectfully requests that the objection to Figures 3A and 5B be withdrawn.’
However, the aforementioned amended drawings do not appear to be included in the RCE dated June 16, 2022; therefore, the objections remain applicable. (Examiner notes that the drawings dated May 24, 2022, if resubmitted as part of an entered response, would overcome the objections to the drawings.)

Applicant on page 12 argues: ‘Claims 12-19 have been amended to remove the term "further." Applicant respectfully requests that the objection to claims 12-19 be withdrawn.’
In view of the aforementioned amendments, the previously presented objections to the claims are withdrawn.

Applicant on page 12 argues: ‘Per the Office's recommendation (Office Action, page 38, item 86), Applicant has amended claims 11 and 22 to recite "instruction executing circuit," "data acquisition circuit," and "data processing circuit." As noted by the Office, such amendments would preclude interpretation under 35 U.S.C. § 112(f). Applicant respectfully requests that the interpretation of claims 11 and 22 under 35 U.S.C. §112(f) be withdrawn.’
In view of the aforementioned amendments, the previously presented interpretations under 112(f) of the aforementioned amended claim limitations are withdrawn. 

Applicant on page 13 argues: ‘As noted by the Office (Office Action, page 11, item 27(a)), Applicant has amended claims 11-19 and 22 to preclude interpretation under 35 U.S.C. §112(f). Regarding the rejection of claim 15, Applicant submits that the current wording of claim 15 ("data acquisition circuit is configured to") provides proper support for the elements of claim 15 in the context of claim 11. Applicant respectfully requests that the rejections under 35 U.S.C. § 112(a) be withdrawn.’
In view of the aforementioned amendments, the previously presented rejections under 35 U.S.C. § 112(a) are withdrawn.

Applicant across pages 13-14 argues: “As noted by the Office (Office Action, page 15, item 37(a)), Applicant has amended claims 11-19 and 22 to preclude interpretation under 35 U.S.C. §112(f). Applicant has amended claims 3, 4, 6, 7, 14-16, and 18 to address the rejections identified in the Office Action, pages 16-20. Applicant respectfully requests that the rejections under 35 U.S.C. § 112(b) be withdrawn.”
In view of the aforementioned amendments, the previously presented rejections under 35 U.S.C. § 112(b) are withdrawn.

Applicant on page 14 argues: "Sheaffer fails to teach all the elements of amended claim 1". Applicant across pages 14-15 provides further reasoning for this argument, and argues on page 15 that "Because Sheaffer fails to teach all the elements of amended claim 1, claim 1 is distinguishable over Sheaffer. While claims 11 and 22 are of different scope, claims 11 and 22 have been amended to recite similar elements as claim 1 and are therefore also distinguishable over Sheaffer. Applicant respectfully requests that the 35 U.S.C. §102(a)(1) rejection of the claims be withdrawn."
In view of the aforementioned amendments, Examiner is newly relying upon the Batley reference; see the Claim Rejections - 35 USC § 103 section above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEITH E VICARY whose telephone number is (571)270-1314. The examiner can normally be reached Monday to Friday, 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571)270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KEITH E VICARY/ Primary Examiner, Art Unit 2182