INSTRUCTION AND LOGIC FOR SUPPRESSION OF HARDWARE PREFETCHERS
    1.
    Invention application
    Status: under examination (published)

    Publication No.: US20160179544A1

    Publication date: 2016-06-23

    Application No.: US14580999

    Filing date: 2014-12-23

    IPC classification: G06F9/38, G06F9/30

    Abstract: A processor includes a core, a hardware prefetcher, and a prefetcher control module. The hardware prefetcher includes logic to make speculative prefetch requests, through a memory subsystem, for elements for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to selectively suppress, based on a hardware-prefetch suppression instruction executed by the core, a speculative prefetch request to be made by the hardware prefetcher.
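
The suppression mechanism in this abstract can be illustrated with a toy software model. This is only a sketch, not the patented design: the class `NextLinePrefetcher`, its `suppress` method (standing in for the suppression instruction), and the next-line prefetch policy are all assumptions made for illustration.

```python
class NextLinePrefetcher:
    """Toy next-line prefetcher whose speculative requests can be suppressed."""

    def __init__(self):
        self.cache = set()       # cache-line addresses currently cached
        self.suppressed = False  # toggled by the (hypothetical) suppression instruction

    def suppress(self, on):
        # Stands in for the hardware-prefetch suppression instruction.
        self.suppressed = on

    def access(self, line):
        """Demand access to a cache line; returns True on a hit."""
        hit = line in self.cache
        self.cache.add(line)          # demand fill
        if not self.suppressed:
            self.cache.add(line + 1)  # speculative next-line prefetch
        return hit


p = NextLinePrefetcher()
p.access(0)                # miss; line 1 is prefetched
hit_with_prefetch = p.access(1)
p.suppress(True)
p.access(10)               # miss; no prefetch is issued
hit_when_suppressed = p.access(11)
```

With prefetching active the second access hits; once suppression is on, the neighbouring line is never brought in speculatively.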

    METHOD AND APPARATUS FOR SELECTING CACHE LOCALITY FOR ATOMIC OPERATIONS
    2.
    Invention application
    Status: granted

    Publication No.: US20150178086A1

    Publication date: 2015-06-25

    Application No.: US14137218

    Filing date: 2013-12-20

    IPC classification: G06F9/38, G06F12/08

    Abstract: An apparatus and method for determining whether to execute an atomic operation locally or remotely. For example, one embodiment of a processor comprises: a decoder to decode an atomic operation on a local core; prediction logic on the local core to estimate a cost associated with execution of the atomic operation on the local core and a cost associated with execution of the atomic operation on a remote core; the remote core to execute the atomic operation remotely if the prediction logic determines that the cost for execution on the local core is greater than the cost for execution on the remote core; and the local core to execute the atomic operation locally if the prediction logic determines that the cost for execution on the local core is less than the cost for execution on the remote core.
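
The local-versus-remote decision described here reduces to comparing two predicted costs. The sketch below assumes a deliberately simple cost model; the function names, the latency parameters, and the tie-breaking rule (ties run locally) are hypothetical and are not taken from the patent.

```python
def estimate_costs(line_in_local_cache, op_latency, line_transfer_latency, remote_round_trip):
    """Hypothetical cost model, in cycles, for one atomic operation."""
    # Local execution must first fetch the cache line if it is not already local.
    local = op_latency if line_in_local_cache else op_latency + line_transfer_latency
    # Remote execution pays a request/response round trip instead.
    remote = op_latency + remote_round_trip
    return local, remote


def choose_execution_site(local_cost, remote_cost):
    """Return 'remote' if local execution is predicted to cost more, else 'local'."""
    return "remote" if local_cost > remote_cost else "local"


site_when_cached = choose_execution_site(*estimate_costs(True, 5, 100, 40))
site_when_not_cached = choose_execution_site(*estimate_costs(False, 5, 100, 40))
```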

    OBJECT LIVENESS TRACKING FOR USE IN PROCESSING DEVICE CACHE
    4.
    Invention application
    Status: granted

    Publication No.: US20140304477A1

    Publication date: 2014-10-09

    Application No.: US13993034

    Filing date: 2013-03-15

    IPC classification: G06F12/08

    Abstract: A processing device comprises a processing device cache and a cache controller. The cache controller initiates a cache line eviction process and determines an object liveness value associated with a cache line in the processing device cache. The cache controller applies the object liveness value to a cache line eviction policy and evicts the cache line from the processing device cache based on the object liveness value and the cache line eviction policy.
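
One way to picture a liveness-aware eviction policy is a least-recently-used list that prefers lines holding dead objects. The sketch below is an assumption-laden illustration: the function name, the convention that a liveness value of 0 means "dead", and the LRU fallback are not specified by the abstract.

```python
from collections import OrderedDict


def pick_victim(lines):
    """Choose a cache line to evict.

    lines: OrderedDict mapping line tag -> liveness value, ordered from
    least to most recently used. A liveness of 0 marks a dead object
    (assumed convention).
    """
    for tag, liveness in lines.items():
        if liveness == 0:
            return tag            # prefer the oldest line holding a dead object
    return next(iter(lines))      # otherwise fall back to plain LRU


mixed = OrderedDict([("a", 1), ("b", 0), ("c", 1)])
all_live = OrderedDict([("a", 1), ("b", 2), ("c", 1)])
victim_mixed = pick_victim(mixed)
victim_all_live = pick_victim(all_live)
```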

    APPARATUS AND METHOD FOR IMPLEMENTING A SCRATCHPAD MEMORY
    5.
    Invention application
    Status: granted

    Publication No.: US20140189247A1

    Publication date: 2014-07-03

    Application No.: US13730507

    Filing date: 2012-12-28

    IPC classification: G06F12/12

    Abstract: An apparatus and method for implementing a scratchpad memory within a cache using priority hints. For example, a method according to one embodiment comprises: providing a priority hint for a scratchpad memory implemented using a portion of a cache; determining a page replacement priority based on the priority hint; storing the page replacement priority in a page table entry (PTE) associated with the page; and using the page replacement priority to determine whether to evict one or more cache lines associated with the scratchpad memory from the cache.
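
The four steps in this abstract (hint, priority, PTE storage, eviction decision) can be sketched as follows. Everything concrete here is hypothetical: the hint names, the numeric priorities, the `PageTableEntry` layout, and the "evict the lowest priority first" rule are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    frame: int
    replacement_priority: int  # higher means "keep longer" (assumed convention)


def priority_from_hint(hint):
    # Hypothetical mapping from a software priority hint to a page replacement priority.
    return {"scratchpad": 3, "normal": 1, "streaming": 0}[hint]


def select_eviction_candidate(cache_lines, page_table):
    """cache_lines: list of (line_tag, page_number) pairs.

    Returns the tag of the line whose page has the lowest replacement priority.
    """
    tag, _page = min(cache_lines, key=lambda line: page_table[line[1]].replacement_priority)
    return tag


page_table = {
    0: PageTableEntry(frame=0, replacement_priority=priority_from_hint("scratchpad")),
    1: PageTableEntry(frame=1, replacement_priority=priority_from_hint("normal")),
}
candidate = select_eviction_candidate([("x", 0), ("y", 1)], page_table)
```

Lines backed by the scratchpad page are kept in preference to lines from the ordinary page.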

    SPECULATIVE NON-FAULTING LOADS AND GATHERS
    6.
    Invention application
    Status: granted

    Publication No.: US20140181580A1

    Publication date: 2014-06-26

    Application No.: US13725907

    Filing date: 2012-12-21

    IPC classification: G06F9/30, G06F11/07

    Abstract: According to one embodiment, a processor includes an instruction decoder to decode an instruction to read a plurality of data elements from memory, the instruction having a first operand specifying a storage location, a second operand specifying a bitmask having one or more bits, each bit corresponding to one of the data elements, and a third operand specifying a memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the instruction, to read one or more data elements speculatively, based on the bitmask specified by the second operand, from a memory location based on the memory address indicated by the third operand, and to store the one or more data elements in the storage location indicated by the first operand.
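
The three-operand masked load can be modelled in a few lines. The abstract does not say how a would-be fault is reported, so the sketch below makes an assumption: an unmapped address is skipped silently and a completion mask records which elements were actually read. The function name and the dictionary-as-memory model are also hypothetical.

```python
def speculative_masked_load(dest, mask, memory, base):
    """Load element i from address base + i for every set bit i of mask.

    dest:   current contents of the destination (first operand)
    mask:   integer bitmask, one bit per element (second operand)
    memory: dict mapping mapped addresses to values
    base:   base memory address (third operand)

    Returns (new_dest, completed_mask). Unmapped addresses never raise.
    """
    out = list(dest)
    completed = 0
    for i in range(len(dest)):
        if not (mask >> i) & 1:
            continue                  # element masked off: leave dest unchanged
        addr = base + i
        if addr in memory:
            out[i] = memory[addr]
            completed |= 1 << i
        # An unmapped address would fault on an ordinary load; here it is skipped.
    return out, completed


memory = {100: 7, 101: 8, 103: 9}     # address 102 is deliberately unmapped
full = speculative_masked_load([0, 0, 0, 0], 0b1111, memory, 100)
partial = speculative_masked_load([0, 0, 0, 0], 0b0101, memory, 100)
```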

    APPARATUS AND METHOD FOR SELECTING ELEMENTS OF A VECTOR COMPUTATION
    7.
    Invention application
    Status: under examination (published)

    Publication No.: US20130332701A1

    Publication date: 2013-12-12

    Application No.: US13996521

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus and method are described for selecting elements to be used in a vector computation. For example, a method according to one embodiment includes the following operations: specifying whether to identify the first, last or next after last active element of an input mask register using an immediate value; identifying the first, last or next after last active element in the input mask register according to the immediate value; reading a value from an input vector register corresponding to the identified first, last or next after last active element in the input mask register; and writing the value to an output vector register.
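
The selection step can be sketched in software as below. The immediate encodings (`FIRST`, `LAST`, `NEXT_AFTER_LAST`), the function name, and the choice to return `None` when no element qualifies are assumptions for illustration; the abstract does not define them, and the final write to the output register is left to the caller.

```python
FIRST, LAST, NEXT_AFTER_LAST = 0, 1, 2  # hypothetical immediate encodings


def select_element(mask, vec, imm):
    """Return the element of vec picked by imm relative to the active bits of mask."""
    active = [i for i in range(len(vec)) if (mask >> i) & 1]
    if not active:
        return None                   # no active element (assumed behaviour)
    if imm == FIRST:
        idx = active[0]
    elif imm == LAST:
        idx = active[-1]
    else:
        idx = active[-1] + 1          # element just past the last active one
    return vec[idx] if idx < len(vec) else None


vec = [10, 20, 30, 40]
mask = 0b0110                         # elements 1 and 2 are active
picked = [select_element(mask, vec, imm) for imm in (FIRST, LAST, NEXT_AFTER_LAST)]
```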

    Method, medium, and system encoding/decoding video data using bitrate adaptive binary arithmetic coding
    10.
    Invention application
    Status: lapsed

    Publication No.: US20070171985A1

    Publication date: 2007-07-26

    Application No.: US11490021

    Filing date: 2006-07-21

    IPC classification: H04N7/12

    Abstract: A method, medium, and system encoding/decoding video data using a binary arithmetic coding adaptive to a compression bit rate of the video data. The system may include a bitrate adaptation unit determining a maximum length of a prefix using a compression bitrate of the video data, a binarization unit dividing the video data into a prefix and a suffix according to the determined maximum length of the prefix and binarizing the video data, and an arithmetic encoding unit performing an arithmetic encoding on the binarized video data. The video data may be encoded/decoded using binary arithmetic encoding/decoding by determining the maximum length of the prefix, an order of an exponential Golomb code, and the number of contexts based on the compression bitrate. Accordingly, it is possible to obtain high encoding efficiency regardless of a range of the desired compression bitrate.
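
The prefix/suffix split can be illustrated with a binarization in the style of the unary plus Exp-Golomb scheme familiar from CABAC. This is a sketch under assumptions: the prefix is taken to be truncated unary, the suffix a k-th order Exp-Golomb code, and `max_prefix` and `k` are passed in directly rather than derived from a bitrate, since the abstract does not give that mapping. Function names are hypothetical.

```python
def exp_golomb(value, k):
    """k-th order Exp-Golomb code of a non-negative integer, as a bit string."""
    bits = ""
    while value >= (1 << k):
        bits += "1"                   # escape bit: value lies beyond the current range
        value -= 1 << k
        k += 1
    bits += "0"                       # terminator
    while k:
        k -= 1
        bits += str((value >> k) & 1)  # k remaining bits, most significant first
    return bits


def binarize(value, max_prefix, k):
    """Split value into (prefix, suffix) bit strings.

    The prefix is truncated unary with at most max_prefix ones; any remainder
    is carried by an order-k Exp-Golomb suffix.
    """
    if value < max_prefix:
        return "1" * value + "0", ""
    return "1" * max_prefix, exp_golomb(value - max_prefix, k)


short_prefix = binarize(5, 4, 0)      # small cutoff: part of the value spills into the suffix
long_prefix = binarize(5, 8, 0)       # larger cutoff: the value fits in the prefix alone
```

Changing `max_prefix` moves the boundary between the two parts, which is the knob the abstract ties to the compression bitrate.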
