Abstract:
In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing efficient communication between caches in a hierarchical caching design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a lower level cache communicably interfaced with the data bus; a higher level cache communicably interfaced with the data bus; one or more data buffers; and one or more dataless buffers. In such an embodiment, the data buffers are communicably interfaced with the data bus, and each of the one or more data buffers has a buffer memory to buffer a full cache line, one or more control bits to indicate the state of the respective data buffer, and an address associated with the full cache line. The dataless buffers in such an embodiment are incapable of storing a full cache line, and each has one or more control bits to indicate the state of the respective dataless buffer and an address for an inter-cache transfer line associated with the respective dataless buffer. In such an embodiment, inter-cache transfer logic is to request the inter-cache transfer line from the higher level cache via the data bus and is further to write the inter-cache transfer line into the lower level cache from the data bus.
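The split between data buffers and dataless buffers can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the abstract: the 64-byte line size, the single `valid` flag standing in for the control bits, the dictionaries standing in for the two caches, and the local variable standing in for the data bus.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_LINE_BYTES = 64  # assumed line size; the abstract does not give one


@dataclass
class DataBuffer:
    """Holds control state, an address, and storage for a full cache line."""
    valid: bool = False             # stand-in for the "control bits"
    address: Optional[int] = None
    data: Optional[bytes] = None    # room for one full cache line


@dataclass
class DatalessBuffer:
    """Tracks an inter-cache transfer by address only; has no line storage."""
    valid: bool = False
    address: Optional[int] = None


def inter_cache_transfer(buf, higher, lower, address):
    """Request a line from the higher level cache and write it to the lower one.

    `higher` and `lower` are plain dicts standing in for the two caches; the
    local variable `line` stands in for the data bus.
    """
    buf.valid, buf.address = True, address   # allocate the tracking entry
    line = higher[address]                   # request over the "data bus"
    lower[address] = line                    # write straight from the bus
    buf.valid, buf.address = False, None     # release the entry
    return line
```

In this sketch the dataless buffer only tracks the address and state of the transfer, while the line itself passes from one cache to the other without being stored in the buffer.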
Abstract:
In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing a balanced P-LRU tree for a cache having a “multiple of 3” number of ways. For example, in one embodiment, such means may include an integrated circuit having a cache and a plurality of ways. In such an embodiment, the number of ways is a multiple of three and not a power of two, and the plurality of ways are organized into a plurality of pairs. In such an embodiment, the means further include a single bit for each of the plurality of pairs, in which each single bit is to operate as an intermediate level decision node representing the associated pair of ways, and a root level decision node having exactly two individual bits to point to one of the single bits operating as the intermediate level decision nodes. In this exemplary embodiment, the total number of bits is N−1, wherein N is the total number of ways in the plurality of ways. Alternative structures are also presented for a full LRU implementation, a “multiple of 5” number of cache ways, and variations of the “multiple of 3” number of cache ways.
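For the smallest case, N = 6 ways, the bit budget works out to three pair bits plus a two-bit root, or N−1 = 5 bits. The sketch below illustrates that layout only. The update policy (flip the pair bit toward the sibling way, and rotate the root to the next pair when it points at the accessed pair) is an assumption made for illustration, and the class and method names are hypothetical.

```python
class PlruSixWay:
    """Pseudo-LRU state for one 6-way set: 3 pair bits + a 2-bit root = 5 bits."""

    NUM_PAIRS = 3

    def __init__(self):
        # Each bit names the way (0 or 1) inside its pair that is evicted next.
        self.pair_bits = [0, 0, 0]
        # Two-bit value naming the pair to evict from; only 0..2 are used.
        self.root = 0

    def victim(self):
        """Return the way index (0..5) the tree currently points at."""
        return 2 * self.root + self.pair_bits[self.root]

    def touch(self, way):
        """Record an access to `way` and steer the tree away from it."""
        pair, side = divmod(way, 2)
        self.pair_bits[pair] = 1 - side      # point at the sibling way
        if self.root == pair:                # assumed policy: rotate the root
            self.root = (pair + 1) % self.NUM_PAIRS

    def bit_count(self):
        """Total state bits: one per pair plus the two root bits."""
        return len(self.pair_bits) + 2
```

With this policy the victim is never the way that was just accessed: either the root already points at another pair, or it is moved to one.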
Abstract:
The technologies provided herein relate to protecting the integrity of original code that has been optimized. For example, a processor may perform a fetch operation to obtain specified code from a memory. During execution, the code may be optimized and stored in a portion of the memory. The processor may obtain the optimized code from the portion of the memory. An entry of a first table may be modified to indicate a relationship between the specified code and the optimized code. One or more entries of a second table may be modified to specify one or more physical memory locations. Each of the one or more entries of the second table may correspond to the entry of the first table. The processor may execute the optimized code when each of the one or more entries of the second table is valid.
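The two-table check can be sketched as follows. This is an illustrative model only: the table layouts, the function names, the `invalidate` trigger, and the fallback to the original code when an entry is invalid are all assumptions, since the abstract states only that the optimized code may execute when every corresponding second-table entry is valid.

```python
first_table = {}    # original code address -> optimized code address
second_table = {}   # physical location -> {"entry": original address, "valid": bool}


def record_optimization(orig_addr, opt_addr, locations):
    """Link original code to its optimized version and track its locations."""
    first_table[orig_addr] = opt_addr
    for loc in locations:
        second_table[loc] = {"entry": orig_addr, "valid": True}


def invalidate(location):
    """Mark one tracked physical location as no longer valid (assumed trigger)."""
    if location in second_table:
        second_table[location]["valid"] = False


def code_to_execute(orig_addr):
    """Return the optimized address only if every linked entry is valid."""
    opt_addr = first_table.get(orig_addr)
    if opt_addr is None:
        return orig_addr
    linked = [e for e in second_table.values() if e["entry"] == orig_addr]
    if linked and all(e["valid"] for e in linked):
        return opt_addr
    return orig_addr   # assumed fallback: run the original code
```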
Abstract:
A method and apparatus for providing a memory model for hardware attributes to support transactional execution are described herein. Upon encountering a load of a hardware attribute, such as a test monitor operation to load a read monitor, write monitor, or buffering attribute, a fault is issued in response to a loss field indicating that the hardware attribute has been lost. Furthermore, dependency actions, such as blocking and forwarding, are provided for the attribute access operations based on address dependency and access type dependency. As a result, different scenarios for attribute loss and testing thereof are allowed and restricted in a memory model.
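The fault-on-loss behavior can be sketched as a small model. The class names, the attribute names, and the idea that all attributes of a line are lost together are assumptions for illustration; the blocking and forwarding dependency actions are not modeled.

```python
class AttributeLossFault(Exception):
    """Raised when a tested attribute is found to have been lost."""


class MonitoredLine:
    """Per-line hardware attributes plus a loss field (hypothetical layout)."""

    def __init__(self):
        self.attrs = {"read_monitor": False,
                      "write_monitor": False,
                      "buffered": False}
        self.loss = False

    def set_attribute(self, name):
        self.attrs[name] = True

    def lose_attributes(self):
        """Model an event that discards the attributes and sets the loss field."""
        for name in self.attrs:
            self.attrs[name] = False
        self.loss = True

    def test_attribute(self, name):
        """Load an attribute; fault if the loss field says it was lost."""
        if self.loss:
            raise AttributeLossFault(name)
        return self.attrs[name]
```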
Abstract:
A method and apparatus for avoiding live-lock during transaction execution are described herein. Counting logic is utilized to track successfully committed transactions for each processing element. When a data conflict is detected between transactions on multiple processing elements, priority is given to the processing element with the lower counting logic value. Furthermore, if the values are the same, then the processing element with the lower identification value is given priority, i.e., it is allowed to continue while the other transaction is aborted. To avoid live-lock between processing elements that both have predetermined counting logic values, such as maximum counting values, all counters are reset when one processing element reaches the predetermined counting value. In addition, a failure at maximum value (FMV) counter may be provided to count the number of aborts of a transaction while the counting logic is at a maximum value. When the FMV counter reaches a predetermined number of aborts, the counting logic is reset to avoid live-lock.
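The priority rule and the counter reset can be sketched in a few lines. The class name, the method names, and the maximum count of 3 are hypothetical, and the FMV counter is left out for brevity.

```python
MAX_COUNT = 3  # assumed "predetermined counting value"


class ConflictArbiter:
    """Tracks committed transactions per processing element and picks winners."""

    def __init__(self, num_elements):
        self.commits = [0] * num_elements

    def on_commit(self, pe):
        """Count a successful commit; reset every counter once one hits the max."""
        self.commits[pe] += 1
        if self.commits[pe] >= MAX_COUNT:
            self.commits = [0] * len(self.commits)

    def winner(self, a, b):
        """Lower commit count wins; on a tie, the lower element ID wins."""
        return min((a, b), key=lambda pe: (self.commits[pe], pe))
```

Sorting on the tuple `(commit count, element ID)` encodes both rules at once: the count is compared first, and the ID only breaks ties.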
Abstract:
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
Abstract:
Embodiments of the present invention provide a secure programming paradigm, and a protected cache that enable a processor to handle secret/private information while preventing, at the hardware level, malicious applications from accessing this information by circumventing the other protection mechanisms. A protected cache may be used as a building block to enhance the security of applications trying to create, manage and protect secure data. Other embodiments are described and claimed.