Abstract:
Methods, systems, and media for reducing the memory latency seen by processors by giving software applications a measure of control over on-chip memory (OCM) management, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), with the remainder managed by hardware. The software applications can thus provide guidance regarding address ranges to maintain close to the processor, reducing the unnecessary latencies typically encountered when relying on cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node, which is why the memory block used for this technique is referred to as OCM.
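As a rough illustration, the sketch below assumes a hypothetical OS-exposed C API (ocm_reserve, ocm_pin_range, and ocm_release are invented names, not from the source) through which an application pins a latency-critical table into the software-managed OCM partition while the rest of the OCM stays under hardware management:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef int ocm_handle_t;

/* Stubbed API so the sketch compiles; a real OS would back these with
 * system calls that program the OCM controller. */
static ocm_handle_t ocm_reserve(size_t bytes)                 { (void)bytes; return 0; }
static int ocm_pin_range(ocm_handle_t h, void *a, size_t n)   { (void)h; (void)a; (void)n; return 0; }
static int ocm_release(ocm_handle_t h)                        { (void)h; return 0; }

static uint32_t lookup_table[4096];   /* hot data the application wants near the core */

int main(void)
{
    /* Claim part of the software-managed OCM partition and pin the table there,
     * rather than relying on cache-controller replacement policy. */
    ocm_handle_t h = ocm_reserve(sizeof lookup_table);
    if (h >= 0 && ocm_pin_range(h, lookup_table, sizeof lookup_table) == 0) {
        /* ... latency-sensitive work using lookup_table ... */
        ocm_release(h);   /* the rest of the OCM stays hardware-managed throughout */
        puts("table pinned for the critical section, then released");
    }
    return 0;
}
```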
Abstract:
A method for managing packet traffic in a data processing network includes collecting data indicative of the amount of packet traffic traversing each of the links in the network's interconnect. The collected data includes source and destination information for the corresponding packets. A heavily used link is then identified from the collected data. Packet data associated with the heavily used link is then analyzed to identify a packet source and packet destination combination that is a significant contributor to the traffic on that link. In response, a process associated with the identified source and destination combination is migrated, such as to another node of the network, to reduce the traffic on the heavily used link. In one embodiment, an agent installed on each interconnect switch collects the packet data for the interconnect links connected to that switch.
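The two analysis steps, aggregating traffic per link and then breaking the hottest link down by (source, destination) pair, can be sketched as below; the record layout, node count, and link count are assumptions for illustration only:

```c
#include <stdio.h>

/* One record per sampled packet: which interconnect link it crossed,
 * plus the source/destination node pair (fields per the abstract;
 * the concrete layout is an assumption). */
struct pkt_record { int link; int src; int dst; long bytes; };

#define NLINKS 4
#define NNODES 4

int main(void)
{
    struct pkt_record recs[] = {
        {0, 1, 2, 1500}, {0, 1, 2, 1500}, {0, 3, 0, 200},
        {1, 2, 3, 600},  {0, 1, 2, 1500}, {2, 0, 1, 400},
    };
    long per_link[NLINKS] = {0};
    long per_pair[NNODES][NNODES] = {{0}};

    /* Step 1: aggregate traffic per link to find the heavily used one. */
    for (size_t i = 0; i < sizeof recs / sizeof recs[0]; i++)
        per_link[recs[i].link] += recs[i].bytes;
    int hot = 0;
    for (int l = 1; l < NLINKS; l++)
        if (per_link[l] > per_link[hot]) hot = l;

    /* Step 2: among packets on the hot link, find the dominant (src, dst)
     * pair; the process behind it is the migration candidate. */
    for (size_t i = 0; i < sizeof recs / sizeof recs[0]; i++)
        if (recs[i].link == hot)
            per_pair[recs[i].src][recs[i].dst] += recs[i].bytes;
    int bs = 0, bd = 0;
    for (int s = 0; s < NNODES; s++)
        for (int d = 0; d < NNODES; d++)
            if (per_pair[s][d] > per_pair[bs][bd]) { bs = s; bd = d; }

    printf("hot link %d; dominant pair %d->%d (%ld bytes): candidate for migration\n",
           hot, bs, bd, per_pair[bs][bd]);
    return 0;
}
```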
Abstract:
A method and system for memory address translation and pinning are provided. The method includes attaching a memory address space identifier to a direct memory access (DMA) request, wherein the DMA request is sent by a consumer and uses a virtual address in a given address space. The method further includes looking up the memory address space identifier to find a translation of the virtual address in the given address space used in the DMA request to a physical page frame. Provided that the physical page frame is found, the method includes pinning the physical page frame as long as the DMA request is in progress to prevent an unmapping operation of said virtual address in said given address space, and completing the DMA request, wherein the steps of attaching, looking up, and pinning are centrally controlled by a host gateway.
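A minimal sketch of the host-gateway flow, with invented types and a toy one-entry-per-identifier table standing in for real per-address-space page tables: the identifier is attached to the request, the translation is looked up, and a pin count is held for the duration of the transfer:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct dma_request { int asid; uintptr_t vaddr; };   /* request tagged with an address-space id */
struct page_frame  { uintptr_t pfn; int pins; };

#define NSPACES 4
/* Toy translation table: one frame per address space, for brevity. */
static struct page_frame table[NSPACES] = { {0x1000, 0}, {0x2000, 0} };

static struct page_frame *hg_lookup(int asid, uintptr_t vaddr)
{
    (void)vaddr;   /* a real gateway would walk the per-ASID page tables */
    return (asid >= 0 && asid < NSPACES) ? &table[asid] : NULL;
}

static bool hg_do_dma(struct dma_request *req)
{
    struct page_frame *pf = hg_lookup(req->asid, req->vaddr);
    if (!pf)
        return false;   /* no translation found: fail the request */
    pf->pins++;         /* pinned: unmapping this vaddr must wait */
    /* ... perform the transfer against pf->pfn ... */
    pf->pins--;         /* unpin once the DMA request completes */
    return true;
}

int main(void)
{
    struct dma_request req = { .asid = 0, .vaddr = 0x1234 };
    printf("dma %s\n", hg_do_dma(&req) ? "completed" : "failed");
    return 0;
}
```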
Abstract:
A method for retrieving information from a storage unit, the method including: receiving, by an input/output memory management unit (IOMMU), second-level translation information representative of a partition of a storage unit address space; receiving, by the IOMMU, a direct memory access request that comprises a consumer identifier and a second memory address that was first-level translated by a communication circuit translation entity; performing, by the IOMMU, a second-level translation of the second memory address so as to provide a third memory address, in response to the identity of the consumer; and accessing the storage unit using the third memory address.
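One plausible reading of the second-level step, with the partition table and all names assumed rather than taken from the source: the IOMMU selects the consumer's partition of the storage address space and rebases the already first-level-translated address into it, bounds-checked:

```c
#include <stdint.h>
#include <stdio.h>

struct partition { uint64_t base; uint64_t limit; };   /* per-consumer slice of storage space */

#define NCONSUMERS 2
static struct partition partitions[NCONSUMERS] = {
    [0] = { 0x00000000, 0x10000000 },
    [1] = { 0x10000000, 0x10000000 },
};

/* Second-level translation: offset the first-level-translated (second)
 * address into the consumer's partition to yield the third address. */
static int iommu_translate(int consumer, uint64_t second_addr, uint64_t *third_addr)
{
    if (consumer < 0 || consumer >= NCONSUMERS)
        return -1;                           /* unknown consumer identifier */
    struct partition *p = &partitions[consumer];
    if (second_addr >= p->limit)
        return -1;                           /* outside the consumer's partition */
    *third_addr = p->base + second_addr;
    return 0;
}

int main(void)
{
    uint64_t out;
    if (iommu_translate(1, 0x2000, &out) == 0)
        printf("access storage at 0x%llx\n", (unsigned long long)out);
    return 0;
}
```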
Abstract:
A technique for monitoring the cache footprint of relevant threads on a given processor and its associated cache, thus enabling operating systems to perform better cache-sensitive scheduling. A function of the footprint of a thread in a cache can be used as an indication of the affinity of that thread to that cache's processor. For instance, the larger the number of cachelines already present in a cache, the fewer cache misses the thread will experience when scheduled on that processor, and hence the greater the affinity of the thread to that processor. Besides a thread's priority and other system-defined parameters, scheduling algorithms can take cache affinity into account when assigning execution of threads to particular processors. This invention describes an apparatus that accurately measures the cache footprint of a thread on a given processor and its associated cache by keeping a state and ownership count of cachelines based on ownership registration and cache usage as determined by a cache monitoring unit.
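The bookkeeping can be pictured as per-(thread, cache) counters maintained by the monitoring unit and consulted by the scheduler; the counter layout below is an illustrative assumption, not the patented circuit:

```c
#include <stdio.h>

#define NTHREADS 3
#define NCACHES  2

static long footprint[NTHREADS][NCACHES];   /* cachelines owned per thread, per cache */

/* The monitoring unit bumps the count when a line is allocated to a thread
 * and decrements it when that line is evicted or re-owned. */
static void on_fill(int tid, int cache)  { footprint[tid][cache]++; }
static void on_evict(int tid, int cache) { footprint[tid][cache]--; }

/* Affinity-aware placement: prefer the processor whose cache already holds
 * the most of the thread's lines, so fewer misses on re-dispatch. */
static int pick_cpu(int tid)
{
    int best = 0;
    for (int c = 1; c < NCACHES; c++)
        if (footprint[tid][c] > footprint[tid][best])
            best = c;
    return best;
}

int main(void)
{
    on_fill(1, 0); on_fill(1, 0); on_fill(1, 1); on_evict(1, 1);
    printf("thread 1 -> cpu %d\n", pick_cpu(1));   /* cache 0 holds more of its lines */
    return 0;
}
```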
Abstract:
A data processing system includes a microprocessor having access to multiple levels of cache memories. The microprocessor executes a main thread compiled from a source code object. The system includes a processor for executing an assist thread also derived from the source code object. The assist thread includes the memory reference instructions of the main thread and only those arithmetic instructions required to resolve the memory reference instructions. A scheduler, configured to schedule the assist thread in conjunction with the corresponding main thread, executes the assist thread ahead of the main thread by a determinable threshold, such as a number of main processor cycles or a number of code instructions. The assist thread may execute in the main processor or in a dedicated assist processor that makes direct memory accesses to one of the lower-level cache memory elements.
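The run-ahead discipline might look like the toy scheduler below, where the lead is measured in retired instructions; the counters and the threshold value are illustrative assumptions:

```c
#include <stdio.h>

#define AHEAD_THRESHOLD 64   /* target lead, in retired instructions */

static long main_pc;     /* instructions retired by the main thread */
static long assist_pc;   /* instructions retired by the assist thread */

/* Consulted by the scheduler each tick: let the assist thread (memory
 * references plus address arithmetic only) run while its lead is below
 * the threshold, so its prefetches land in the cache just in time. */
static int assist_may_run(void)
{
    return (assist_pc - main_pc) < AHEAD_THRESHOLD;
}

int main(void)
{
    for (int tick = 0; tick < 10; tick++) {
        if (assist_may_run())
            assist_pc += 16;   /* assist slice: issue loads for upcoming addresses */
        main_pc += 8;          /* main thread makes steady progress */
        printf("tick %d: lead = %ld\n", tick, assist_pc - main_pc);
    }
    return 0;
}
```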