DETAILED ACTION

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
All information disclosure statements were submitted prior to the first action and are in compliance with the provisions of 37 C.F.R. § 1.97.  Accordingly, they have been considered.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-8 and 17-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA  the inventor(s), at the time the application was filed, had possession of the claimed invention.
All independent claims and dependent claim 4 substantially recite “decomposing” of a memory request or a “decomposition unit”.  “It is not enough that one skilled in the art could write a program to achieve the claimed function because the specification must explain how the inventor intends to achieve the claimed function to satisfy the written description requirement. See, e.g., Vasudevan Software, Inc. v. MicroStrategy, Inc., 782 F.3d 671, 681-683, 114 USPQ2d 1349, 1356, 1357 (Fed. Cir. 2015) (reversing and remanding the district court’s grant of summary judgment of invalidity for lack of adequate written description where there were genuine issues of material fact regarding "whether the specification show[ed] possession by the inventor of how accessing disparate databases is achieved"). If the specification does not provide a disclosure of the computer and algorithm in sufficient detail to demonstrate to one of ordinary skill in the art that the inventor possessed the invention a rejection under 35 U.S.C. 112(a)  or pre-AIA  35 U.S.C. 112, first paragraph, for lack of written description must be made.”  MPEP § 2161.01(I). “An original claim may lack written description support when (1) the claim defines the invention in functional language specifying a desired result but the disclosure fails to sufficiently identify how the function is performed or the result is achieved[.] See Ariad Pharms., Inc. v. Eli Lilly & Co., 598 F.3d 1336, 1349-50 (Fed. Cir. 2010) (en banc). The written description requirement is not necessarily met when the claim language appears in ipsis verbis in the specification. ‘Even if a claim is supported by the specification, the language of the specification, to the extent possible, must describe the claimed invention so that one skilled in the art can recognize what is claimed. The appearance of mere indistinct words in a specification or a claim, even an original claim, does not necessarily satisfy that requirement.’”  MPEP § 2163.03.  
The closest support in the specification is: “In some embodiments, the memory access operation is decomposed or unrolled into one or more partial access requests by one or more request processing units such as request processing units 113, 123, 133, and/or 143.  For example, a 
All dependent claims are rejected as containing the limitations of the claims from which they depend.  



The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-8 and 17-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA  the applicant regards as the invention.
All independent claims and dependent claim 4 substantially recite “decomposing” of a memory request or a “decomposition unit”.  “The meaning of every term used in a claim should be apparent from the prior art or from the specification and drawings at the time the application is filed. Claim language may not be “ambiguous, vague, incoherent, opaque, or otherwise unclear in describing and defining the claimed invention.” Packard, 751 F.3d at 1311. Applicants need not confine themselves to the terminology used in the prior art, but are required to make clear and precise the terms that are used to define the invention whereby the metes and bounds of the claimed invention can be ascertained. . . . The requirements for clarity and precision must be balanced with the limitations of the language and the science. If the claims, read in light of the specification, reasonably apprise those skilled in the art both of the utilization and scope of the invention, and if the language is as precise as the subject matter permits, the statute (35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph) demands no more. Packard, 751 F.3d at 1313 ("[H]ow much clarity is required necessarily invokes some standard of reasonable precision in the use of language in the context of the circumstances."). This does not mean that the examiner must accept the best effort of applicant. If the language is not considered as precise as the subject matter permits, the examiner should provide reasons to support the conclusion of indefiniteness and is encouraged to suggest alternatives that would not be subject to rejection.” MPEP § 2173.05(a).  The recited “decomposition” does not appear to be a known term of art and is not explained in the specification in a way that would define the invention such that the metes and bounds of the claimed invention can be ascertained.  
Note, for instance, that there is no specific structural or functional distinction between “decomposing” an address and merely using an address to access several regions of memory.  
All dependent claims are rejected as containing the limitations of the claims from which they depend.  



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3 are rejected under 35 U.S.C. 103 as being unpatentable over Roberts (US 2018/0143905), Tahhan (US 2019/0394081, filed 2018, different assignee) and Emory (The Crossbar Switch, 2018).
1. (Original) A system, comprising: 
a plurality of memory units, wherein each of the plurality of memory units includes a request processing unit and a plurality of memory banks, (See Roberts figure 1.  Roberts teaches: “Each processor core 110 is able to access a combined memory space including its own local memory (e.g., local cache 116 and main memory co-located at the same node) and remote memory formed by main memory residing at the other nodes. The memory accesses between nodes are non-uniform (that is, have a different latency) with intra-node memory accesses because accesses to remote memory take longer to complete than accesses to local memory due to the requests traveling across the interconnect 112. Thus, the hardware within each node is characterized by being able to communicate more efficiently with modules of the same node than interacting or communicating with hardware of other nodes. In some embodiments, processor core 110(1) within node 1 may have lower latency access to the memory that is local (i.e., a memory resident in the same node as the processor core 110(1)) to that node (e.g., main memory 118) as compared to access to a remote (i.e., non-local) memory.”  Roberts paragraph 0018.
Roberts does not explain that memory is made up of banks.
Tahhan teaches: “[0134] For example, NUMA node 0 includes memory banks DIMM 0 through DIMM N, as well as PCIe device x.” Tahhan paragraph 0134.  “[0135] NUMA node 1 includes CPU N+1 through CPU M, which share LLC N. NUMA node 1 also includes PCIe device y, as well as memory banks DIMM N+1 through DIMM M.”  Tahhan paragraph 0135.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Tahhan because banks are slower than processors (so providing more banks for a single processor prevents starving the processor).) and wherein each request processing unit includes a plurality of decomposition units and a crossbar switch, the crossbar switch communicatively connecting each of the plurality of decomposition units to each of the plurality of memory banks; and a processor coupled to the plurality of memory units, (“The nodes are coupled to one another over a collection of point-to-point interconnects, thereby permitting processors in one node to access data stored in another node.”   Roberts Abstract.  Note that reciting a “decomposition unit” does not require steps to be performed or limit to a particular structure.  See MPEP § 2111.04.
The previously cited art does not expressly teach a crossbar switch to connect nodes.
Emory teaches: “The first solution was the cross bar switch . . . It is fast, but very expensive to make”  Emory page 1.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Emory (to substitute a crossbar switch for point to point interconnects) because crossbar switches are fast.) wherein the processor includes a plurality of processing elements and a communication network communicatively connecting the plurality of processing elements to the plurality of memory units, (See Roberts figure 1.) and wherein at least a first processing element of the plurality of processing elements includes a control logic unit and a matrix compute engine, and the control logic unit is configured to access the plurality of memory units using a dynamically programmable distribution scheme.  (The recited “control logic unit” and “matrix compute engine” do not require steps to be performed or limit to a particular structure.  See MPEP §§ 2103 and 2144.04.  Note also that processors/cores are “dynamically programmable” and read on “control logic units” and “matrix compute engines”.  Roberts teaches: “The relative latencies of accesses to main memory and caches can be determined at each node for determining whether it would be more efficient to access data via cache lines or main memory. In some embodiments, each node 102-106 includes a directory (not shown in FIG. 1) that is used to identify which nodes have cached copies of data corresponding to a particular address in main memory. The directories maintain information regarding the current sharers of a cache line in system 100 and track latency times for memory access requests to main memory and to caches containing copies of data cached from the main memory of the processing node in which it resides.”  Roberts paragraph 0021.  
“Based on a determination that the inter-cache latency is higher than the main-memory-to-cache latency (e.g., it would be faster to access data from main memory instead of a cached copy), a copy of data associated with the memory access request can be retrieved from main memory in its home node instead of from a cached location. In some embodiments, a directory residing in node 1 102 can determine that requesting memory access to a cached copy of data in local cache 116(N) of node N would have a higher latency than requesting the data from its copy in the main memory 122 of node 2. Based on that determination, the memory access request can be fulfilled faster by retrieving the data from main memory in its home node (e.g., node 2) than from its cached copy in node N.” Roberts paragraph 0022.)
2. (Original) The system of claim 1, wherein
the request processing unit of each of the plurality of memory units is configured to receive a broadcasted memory request.  (“In one embodiment, each node in the NUMA system 200 broadcasts cache probe requests (e.g., a read and/or a write probe) to the cache memory and main memory of all other nodes.”  Roberts paragraph 0035.  Note that a “request processing unit” does not require steps to be performed or limit to a particular structure.  See MPEP §§ 2103 and 2111.04.)
3. (Original) The system of claim 2, wherein 
the broadcasted memory request references data stored in each of the plurality of memory units.  (“In one embodiment, each node in the NUMA system 200 broadcasts cache probe requests (e.g., a read and/or a write probe) to the cache memory and main memory of all other nodes.”  Roberts paragraph 0035.)
Claims 4-7 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Roberts, Tahhan, Emory, and Miller (Algorithmic Techniques For Regular Networks of Processors, 2009).
4. (Original) The system of claim 3, wherein 
the decomposition units of the request processing units are configured to decompose the broadcasted memory request into a corresponding plurality of partial requests.  (“In response to the cache probe requests, the cache memory and main memory of each node will return one or more return responses to the requesting node. For example, for a read probe, the caches can return a cache hit or a cache miss to indicate whether the requested data is found within cache memory.” Roberts paragraph 0035.
The previously cited art does not expressly teach partial requests.  
Miller (Algorithmic Techniques For Regular Networks of Processors, 2009) teaches: “Broadcast is another common global operation, used to send data from one processor to all other processors. Extensions of the broadcast operation include simultaneously performing a broadcast within every (predetermined and distinct) subset of processors. For example, suppose matrix A has been partitioned into submatrices allocated to different processors, and one needs to broadcast the first row of A so that if a processor contains any elements of column i then it obtains the value of A(1, i). In this situation, the more general form of a subset-based broadcast can be used.”  Miller page 9, last paragraph, continued onto page 10.    
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Miller because this avoids the need for the sender to determine where a subset of the data resides in order to retrieve that subset.)  
5. (Original) The system of claim 4, wherein 
the request processing unit of each of the plurality of memory units is configured to determine whether each of the corresponding plurality of partial requests corresponds to data stored in a corresponding one of the plurality of memory banks associated with the corresponding request processing unit. (With respect to “partial” requests, see Miller above.  With respect to accessing memory banks (in a given part of a NUMA system), see Tahhan cited above.   Roberts teaches: “In response to the cache probe requests, the cache memory and main memory of each node will return one or more return responses to the requesting node. For example, for a read probe, the caches can return a cache hit or a cache miss to indicate whether the requested data is found within cache memory.” Roberts paragraph 0035.  See also Roberts figure 1.)  
6. (Original) The system of claim 4, wherein 
the crossbar switch of the request processing unit of each of the plurality of memory units (See rejection of claim 1.) is configured to direct a partial request (See rejection of claim 4.) for data stored in a corresponding one of the plurality of memory banks to the corresponding memory bank and receive a retrieved data payload from the corresponding memory bank.  (See rejection of claim 1 noting that Roberts teaches retrieving data from a different NUMA node and that different NUMA areas made up of different banks are taught by Tahhan.)  
7. (Original) The system of claim 6, wherein 
the request processing unit of each of the plurality of memory units is configured to prepare a partial response using the retrieved data payload and provide the prepared partial response to a processing element of the plurality of processing elements.  (“In response to the cache probe requests, the cache memory and main memory of each node will return one or more return responses to the requesting node. For example, for a read probe, the caches can return a cache hit or a cache miss to indicate whether the requested data is found within cache memory.” Roberts paragraph 0035.  With respect to returning “partial” data, see Miller cited in claim 4.)
17. (Original) A method comprising: 
receiving a memory request provided from a first processing element at a first decomposition unit of a memory unit, wherein the memory unit includes a plurality of memory banks and a crossbar switch, the crossbar switch communicatively connecting each of a plurality of decomposition units to each of the plurality of memory banks; (See rejection of claim 1.) decomposing the memory request into a first partial request and a second partial request; (See rejection of claim 1.) determining that a requested first data of the first partial request resides in a first memory bank of the plurality of memory banks; determining that a requested second data of the second partial request resides in a second memory bank of the plurality of memory banks; (See rejection of claim 1 citing the teaching of Roberts, showing requested data being sent from one NUMA node to another and Tahhan’s teaching of different NUMA nodes comprising different banks.) directing the first partial request (See rejection of claim 4.) to the first memory bank via the crossbar switch; directing the second partial request to the second memory bank via the crossbar switch; retrieving the requested first data from the first memory bank via the crossbar switch; retrieving the requested second data from the second memory bank via the crossbar switch; preparing a partial response that includes the requested first and second data; and providing the partial response to the first processing element.  (See rejection of claim 1.  Note that the cited art is generally directed to repeating processes.)
18. (Original) The method of claim 17, wherein 
the memory unit includes a plurality of connections communicatively connecting the memory unit to a processor, the processor includes the first processing element among a plurality of processing elements, and the received memory request is received at a first connection of the plurality of connections. (See Roberts figure 1.)
19. (Original) The method of claim 18, wherein 
the partial response (See rejection of claim 4.) is provided to the first processing element via the first connection. (See rejection of claim 1 and Roberts figure 1.)
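For orientation only, the sequence of steps recited in claim 17 (decompose, determine the bank, direct via the crossbar switch, retrieve, prepare a response) can be sketched as follows. This is a minimal illustration under stated assumptions and is neither the applicant's disclosed implementation nor a teaching of any cited reference: the chunk-aligned split, the interleaved bank mapping, and every name (`BANK_COUNT`, `CHUNK_SIZE`, `PartialRequest`, `decompose`, `bank_of`, `serve`) are hypothetical choices made here, since the specification as quoted above does not explain how the decomposition is performed.

```python
from dataclasses import dataclass

BANK_COUNT = 4   # hypothetical number of banks in one memory unit
CHUNK_SIZE = 8   # hypothetical number of bytes covered by one partial request


@dataclass(frozen=True)
class PartialRequest:
    address: int
    length: int


def decompose(address: int, length: int) -> list[PartialRequest]:
    """Split one request into chunk-aligned partial requests (assumed scheme)."""
    parts = []
    end = address + length
    while address < end:
        # Stop at the next chunk boundary or at the end of the request.
        boundary = (address // CHUNK_SIZE + 1) * CHUNK_SIZE
        step = min(boundary, end) - address
        parts.append(PartialRequest(address, step))
        address += step
    return parts


def bank_of(part: PartialRequest) -> int:
    """Assumed mapping: consecutive chunks are interleaved across banks."""
    return (part.address // CHUNK_SIZE) % BANK_COUNT


def serve(memory: bytes, address: int, length: int) -> bytes:
    """Decompose a request, route each part to its bank, and join the payloads."""

    def make_reader(bank: int):
        def read(part: PartialRequest) -> bytes:
            # Each reader only accepts parts that map to its own bank.
            assert bank_of(part) == bank
            return memory[part.address:part.address + part.length]
        return read

    # The "crossbar" is modeled as a plain lookup from bank index to reader.
    crossbar = {bank: make_reader(bank) for bank in range(BANK_COUNT)}
    payloads = [crossbar[bank_of(p)](p) for p in decompose(address, length)]
    return b"".join(payloads)
```

Under these assumptions, a 20-byte request starting at address 5 splits into four partial requests that land on four different banks, and `serve` returns the same bytes a single undivided read would return. Because the result matches an undivided read, the sketch does not by itself distinguish “decomposing” from merely using an address to access several regions of memory.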
Claims 8 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Roberts, Tahhan, Emory, Miller, and Huang (US 2007/0294426).
8. (Original) The system of claim 7, wherein 
the prepared partial response includes a corresponding sequence identifier ordering the partial response among a plurality of partial responses. (The previously cited art does not expressly teach sending sequence identifiers.  
Huang teaches: “For example, a method of reliably transmitting data packets from a first computer (source node) to a second computer (destination node) may include a step of sending data packets numbered with consecutive sequence numbers from the first computer to the second computer; retaining a copy of each sent data packet in a retransmit queue of said first computer; receiving the data packets in the second computer; tracking the sequence numbers of the data packets received in said second computer; sending an acknowledgement message for each received data packet from said second computer to said first computer; and sending a selective bitmap message where the bitmap indicates the reception status of the last N consecutively numbered data packets, only if at least one of the N (e.g., 8 or 16) data packets was not correctly received within a predetermined time.”  Huang paragraph 0043.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Huang as an instance of (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; The prior art contained a "base" device (method, or product) upon which the claimed invention can be seen as an "improvement” (using sequence identifiers allows the sending and receiving nodes to coordinate resends for lost or damaged data sent between the nodes).  The prior art contained a known technique that is applicable to the base device (method, or product) (including sequence identifiers is applicable to the primary reference). One of ordinary skill in the art would have recognized that applying the known technique would have yielded predictable results and resulted in an improved system. See MPEP § 2143(I)(D).)
20. (Original) A method comprising: 
receiving a first memory request provided from a first processing element at a first decomposition unit of a memory unit, wherein the memory unit includes a plurality of memory banks and a crossbar switch, the crossbar switch communicatively connecting each of a plurality of decomposition units to each of the plurality of memory banks; receiving a second memory request provided from a second processing element at a second decomposition unit of the memory unit; (See rejection of claim 1. Note that the art cited in the rejection of claim 1 is to repeating processes.) decomposing the first memory request into a first plurality of partial requests and the second memory request into a second plurality of partial requests; (See rejection of claim 4. Note that all references are to repeating processes.) determining for each partial request of the first plurality of partial requests and the second plurality of partial requests whether the partial request is to be served from the plurality of memory banks; (See rejection of claim 1.  Note that Roberts’ teaching of serving requests from the NUMA nodes (banks of Tahhan) includes determining to serve the requests.) 
discarding a first group of partial requests from the first plurality of partial requests and the second plurality of partial requests that is not to be served from the plurality of memory banks; (Huang teaches: “For example, a method of reliably transmitting data packets from a first computer (source node) to a second computer (destination node) may include a step of sending data packets numbered with consecutive sequence numbers from the first computer to the second computer; retaining a copy of each sent data packet in a retransmit queue of said first computer; receiving the data packets in the second computer; tracking the sequence numbers of the data packets received in said second computer; sending an acknowledgement message for each received data packet from said second computer to said first computer; and sending a selective bitmap message where the bitmap indicates the reception status of the last N consecutively numbered data packets, only if at least one of the N (e.g., 8 or 16) data packets was not correctly received within a predetermined time.”  Huang paragraph 0043.  Note that when data does not arrive within some amount of time, new data is sent and the old data is “discarded”.   
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Huang because repeating the send of the discarded data results in the data eventually arriving at the desired location.) for each partial request of a second group of partial requests from the first plurality of partial requests that is to be served from the plurality of memory banks, retrieving data of the partial request via the crossbar switch, preparing a first partial response using the retrieved data, and providing the first partial response to the first processing element; (See rejection of claim 1.) and for each partial request of a third group of partial requests from the second plurality of partial requests that is to be served from the plurality of memory banks, retrieving data of the partial request via the crossbar switch, preparing a second partial response using the retrieved data, and providing the second partial response to the second processing element.  (See rejection of claim 1.  Note that all references are to repeating processes.)
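As an illustration of the sequence identifier recited in claim 8, the following minimal sketch orders partial responses that arrive out of order and detects a gap. It is an assumption-laden example only: `PartialResponse`, its `sequence_id` field, and `reassemble` are hypothetical names introduced here, and the sketch does not reproduce the acknowledgement or bitmap protocol quoted from Huang.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialResponse:
    sequence_id: int  # hypothetical field: position among the partial responses
    payload: bytes


def reassemble(responses: list[PartialResponse], expected: int) -> bytes:
    """Order partial responses by sequence identifier and join their payloads."""
    by_id = {r.sequence_id: r.payload for r in responses}
    missing = [i for i in range(expected) if i not in by_id]
    if missing:
        # A protocol that tracks sequence numbers could request a resend here;
        # this sketch only reports the gap.
        raise ValueError(f"missing partial responses: {missing}")
    return b"".join(by_id[i] for i in range(expected))
```

For example, responses received in the order 2, 0, 1 are joined in the order 0, 1, 2, and a call that expects three responses but receives two raises an error naming the missing identifier.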
Withdrawn: Claims 9-16.





Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Title
Document I.D.
Reason Included
Node controller for performing cache coherence control and memory-shared multiprocessor system
US 6789173 B1
" The sequenced requests are transmitted to the CPUs, main memory and I/O controller through a selector within the crossbar. A system for realizing snooping cache by providing the crossbar with a function of sequencing the memory access requests to broadcast the memory access requests to all the CPUs in this way, will be referred to as the multicasting system, hereinafter." paragraph 15.   
Multi-processor system and its network
US 6728258 B1
"For an access request to a memory mapped register belonging to any one of the processor units, memory units, and input/output units, the access request is broadcast to all units via the crossbar switch." abstract.
NESTED CACHE COHERENCY PROTOCOL IN A TIERED MULTI-NODE COMPUTER SYSTEM
US 20170228317 A1
"in response to the broadcast cache request, each of said all other CP clusters of the first node send respective partial responses to other CP clusters and the SC function of the first node, and wherein, in response to the broadcast cache request, the SC function sends respective partial responses to the one or more CP clusters of the first node" paragraph 8.
Parallel management system for a file data storage structure
US 6088704 A
"Accordingly, data in one file would be recorded across a plurality of secondary storage devices 1, 2 and 3. When one secondary storage device is viewed, it would contains partial files." paragraph 5.  
NETWORK-AWARE CACHE COHERENCE PROTOCOL ENHANCEMENT
US 20180143905 A1
"In one embodiment, each node in the NUMA system 200 broadcasts cache probe requests (e.g., a read and/or a write probe) to the cache memory and main memory of all other nodes. In response to the cache probe requests, the cache memory and main memory of each node will return one or more return responses to the requesting node. For example, for a read probe, the caches can return a cache hit or a cache miss to indicate whether the requested data is found within cache memory. The latency tables 250-256 are populated by latency entries that record latency times from the start of a cache probe request to when the response is received. Based on the values stored within the latency tables 250-256, it can be determined whether it would be more efficient to retrieve a copy of the requested data from main memory in its home node or from a cached copy in an owning node." paragraph 0035.  
Graphics-processing system and method of broadcasting write requests to multiple graphics devices
US 20070245046 A1
"A read request to an address in the broadcast address range causes data to be read from one of the graphics devices (e.g., a designated primary graphics device). " paragraph 22.

US 10180919 B1
"In one embodiment, the broadcast read request is a burst request. When a logic module 133 is targeted by a burst broadcast read request, the targeted logic module 133 reads a plurality of values from sequentially addressed registers from the targeted logic module 133." paragraph 41. "In one implementation, the logic 470 may receive the broadcast request from the buffer 465 and may decode the broadcast request. If the broadcast request is targeted at any of sub-modules 443A-443N (the broadcast address is within the address range of the sub-modules), then the distribution decoder block 472 generates its own sub-requests and sends those sub-requests to sub-modules 443A-443N, where N is the number of sub-modules connected to distribution decoder block 472. The sub-requests may be write sub-requests or read sub-requests, depending on whether the broadcast request is a broadcast write request or a broadcast read request. Distribution decoder block 472 may also pass the broadcast request downstream to the next logic module (e.g. logic module 333 or 433) on the ring bus 120, either before or after processing the request." paragraph 43.  "If the sub-request is a read sub-request, the read sub-request may include a register address of a register to be read. The register address may have been included in the broadcast read request received by the distribution decoder block 472. In some implementations, in the case of a read sub-request, the converter 445N reads a value at the register address of one of the registers 449N. In these implementations, the converter 445N may convert the read sub-request from input buffer 444N to be compatible with a RAM interface, where the plurality of registers 449N are RAM, in one embodiment. When the converter 445N receives the value from the addressed configuration register, the value is sent to response buffer 474N. 
Logic 470 then prepares a read response that includes the value, and transmits the read response onto the ring bus 120, at the appropriate time, using the response mux 476. The read response may be sent to the ring bus controller 130 for accumulation over ring bus 120. Sub-modules 443A and 443B may be similarly configured to send values to response buffers 474A and 474B for transmission onto the ring bus 120 at the appropriate time, using the response mux 476." paragraph 46.  
Ternary Content Addressable Memory Scan-Engine
US 20160284425 A1
"[0044] Memory databases may ignore broadcast read instructions made to addresses outside the range of that particular memory database (612), or for other reasons. The memory databases may recognize the broadcast read instruction and perform a parity test responsive to the broadcast read instruction. In doing so, each TCAM constituent instance in a given memory database may execute the parity check at the specified address (614). " paragraph 44.
LOCALIZED SERVICE RESILIENCY
US 20190394081 A1
"[0134] For example, NUMA node 0 includes memory banks DIMM 0 through DIMM N, as well as PCIe device x." paragraph 0134.  "[0135] NUMA node 1 includes CPU N+1 through CPU M, which share LLC N. NUMA node 1 also includes PCIe device y, as well as memory banks DIMM N+1 through DIMM M."  paragraph 0135.

US 20070294426 A1
"sending from the source computer to the destination computer a data packet having a header which includes the context reference, and a payload data; receiving the data packet at the destination computer; extracting the context address from the context reference in the packet header; and storing the received payload data in the memory of the destination computer in accordance with the at least one application object. [0041] According to further embodiments, the context reference may further include a sequence number and a signature. 
" paragraph 0040 - 0041.  "For example, a method of reliably transmitting data packets from a first computer (source node) to a second computer (destination node) may include a step of sending data packets numbered with consecutive sequence numbers from the first computer to the second computer; retaining a copy of each sent data packet in a retransmit queue of said first computer; receiving the data packets in the second computer; tracking the sequence numbers of the data packets received in said second computer; sending an acknowledgement message for each received data packet from said second computer to said first computer; and sending a selective bitmap message where the bitmap indicates the reception status of the last N consecutively numbered data packets, only if at least one of the N (e.g., 8 or 16) data packets was not correctly received within a predetermined time." paragraph 0043.  
Data processing system, method and interconnect fabric having an address-based launch governor
US 20060176885 A1
"[0109] Referring now to FIG. 13A, there is depicted a time-space diagram illustrating the tenure of an exemplary system-wide broadcast operation with respect to the exemplary data structures depicted in FIG. 9 through FIG. 12B. As shown at the top of FIG. 13A and as described previously with reference to FIG. 4A, the operation is issued by local master 100a0c to each local hub 100, including local hub 100a0b. Local hub 100a0b forwards the operation to remote hub 100b0a, which in turn forwards the operation to its remote leaves, including remote leaf 100b0d. The partial responses to the operation traverse the same series of links in reverse order back to local hubs 100a0a-100a0d, which broadcast the accumulated partial responses to each of local hubs 100a0a-100a0d. Local hubs 100a0a-100a0c, including local hub 100a0b, then distribute the combined response following the same transmission paths as the request. Thus, local hub 100a0b transmits the combined response to remote hub 100b0a, which transmits the combined response to remote leaf 100b0d." paragraph 0109.  "Referring now to block 1632, processing of partial response(s) received by a local hub 100 on one or more first tier links begins when the partial response(s) is/are received by combining logic 1510. As shown at block 1634, combining logic 1510 may apply small tuning delays to the partial response(s) received on the inbound first tier links in order to synchronize processing of the partial response(s) with each other and the locally broadcast partial response. Thereafter, the partial response(s) are processed as depicted at block 1640 and following blocks, which have been described." paragraph 0152.




NA
“For years the computation rate of processors has been much faster than the access rate of memory banks and this divergence in speeds has been constantly increasing in recent years. As a result several shared memory multiprocessors consist of more memory banks than processors.”



Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL M KNIGHT whose telephone number is (571)272-8646.  The examiner can normally be reached on Monday - Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Reginald Bragdon, can be reached at 571-272-4204.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


PAUL M. KNIGHT
Examiner
Art Unit 2139



/PAUL M KNIGHT/Examiner, Art Unit 2139