Method and apparatus for vector-matrix comparison

    Publication number: US10782971B1

    Publication date: 2020-09-22

    Application number: US16370922

    Application date: 2019-03-30

    Abstract: Methods and apparatus for vector-matrix comparison are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry decodes an instruction whose operands specify an output location to store output results, a vector of data element values, and a matrix of data element values. The execution circuitry executes the decoded instruction. The execution includes mapping each data element value of the vector to one of consecutive rows of the matrix and, for each data element value of the vector, comparing that value with the data element values in the respective row of the matrix to obtain data element match results. The execution further includes storing the output results based on the data element match results, where each output result maps to a respective data element column position and indicates a vector match result.
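A plain-Python sketch of one reading of this abstract, assuming (hypothetically) that each vector element is compared against its own matrix row and each output column reports whether any row matched at that column position; the actual instruction's matching semantics may differ:

```python
def vector_matrix_compare(vector, matrix):
    """Software model of the described vector-matrix compare.

    vector[i] is mapped to matrix row i; output j indicates whether
    any vector element matched at column position j of its row.
    """
    cols = len(matrix[0])
    out = []
    for j in range(cols):
        # Column j matches if vector[i] == matrix[i][j] for some row i.
        out.append(any(vector[i] == matrix[i][j] for i in range(len(vector))))
    return out
```

A hardware implementation would perform all row comparisons in parallel; the loop above only models the per-column aggregation of match results.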

    Mechanism To Avoid Hot-L1/Cold-L2 Events In An Inclusive L2 Cache Using L1 Presence Bits For Victim Selection Bias
    Invention application (granted)

    Publication number: US20160283380A1

    Publication date: 2016-09-29

    Application number: US14671411

    Application date: 2015-03-27

    Abstract: A processor includes a processing core, an L1 cache, operatively coupled to the processing core, the L1 cache comprising an L1 cache entry to store a data item, an L2 cache, inclusive with respect to the L1 cache, the L2 cache comprising an L2 cache entry corresponding to the L1 cache entry, an activity flag associated with the L2 cache entry, the activity flag indicating an activity status of the L1 cache entry, and a cache controller to, in response to detecting an access operation with respect to the L1 cache entry, set the flag to an active status.
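A minimal sketch of the victim-selection bias the title and abstract describe, under the assumption (mine, not stated in the abstract) that the L2 replacement policy prefers victims whose activity flag shows the corresponding L1 line is cold, falling back to plain LRU when every candidate is hot in L1:

```python
def choose_victim(l2_set):
    """Pick an eviction victim from one L2 set.

    l2_set: list of entries like {"tag": ..., "lru": int, "l1_active": bool},
    where a larger "lru" value means less recently used, and "l1_active"
    models the activity flag set when the matching L1 entry is accessed.
    """
    # Bias: only entries that look cold in L1 are eligible at first.
    cold = [e for e in l2_set if not e["l1_active"]]
    candidates = cold if cold else l2_set  # fall back if all lines are hot in L1
    # Among eligible entries, evict the least recently used one.
    return max(candidates, key=lambda e: e["lru"])
```

This avoids the hot-L1/cold-L2 inclusion victim problem: a line that is busy in L1 but idle in L2 is no longer the default LRU victim.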

    INSTRUCTION AND LOGIC FOR PREFETCHER THROTTLING BASED ON DATA SOURCE
    Invention application (granted)

    Publication number: US20160062768A1

    Publication date: 2016-03-03

    Application number: US14471261

    Application date: 2014-08-28

    Abstract: A processor includes a core, a prefetcher, and a prefetcher control module. The prefetcher includes logic to make speculative prefetch requests through a memory subsystem for an element for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to determine counts of memory accesses to two types of memory and, based upon the counts and the type of memory, reduce the speculative prefetch requests of the prefetcher.
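A sketch of the throttling decision, assuming (hypothetically — the abstract does not give the policy) two memory types such as near and far memory, with the prefetch degree reduced as the share of demand accesses served by the slower type grows; thresholds and degrees here are illustrative only:

```python
def prefetch_degree(near_count, far_count, max_degree=4):
    """Return how many speculative prefetches to issue per demand miss.

    near_count / far_count: observed demand-access counts per memory type.
    Heavier traffic to the slow (far) type triggers stronger throttling,
    since speculative requests there waste scarce bandwidth.
    """
    total = near_count + far_count
    if total == 0:
        return max_degree      # no history yet: prefetch at full degree
    far_ratio = far_count / total
    if far_ratio > 0.75:
        return 1               # mostly far-memory traffic: throttle hard
    if far_ratio > 0.5:
        return max_degree // 2 # mixed traffic: moderate throttling
    return max_degree          # mostly near-memory traffic: no throttling
```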

    Method and apparatus for vector-matrix comparison

    Publication number: US10817297B2

    Publication date: 2020-10-27

    Application number: US16370922

    Application date: 2019-03-30

    Abstract: Methods and apparatus for vector-matrix comparison are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry decodes an instruction whose operands specify an output location to store output results, a vector of data element values, and a matrix of data element values. The execution circuitry executes the decoded instruction. The execution includes mapping each data element value of the vector to one of consecutive rows of the matrix and, for each data element value of the vector, comparing that value with the data element values in the respective row of the matrix to obtain data element match results. The execution further includes storing the output results based on the data element match results, where each output result maps to a respective data element column position and indicates a vector match result.

    Minimizing snoop traffic locally and across cores on a chip multi-core fabric

    Publication number: US10102129B2

    Publication date: 2018-10-16

    Application number: US14976678

    Application date: 2015-12-21

    Abstract: A processor includes a first processing core and a first L1 cache comprising a first L1 cache data entry of a plurality of L1 cache data entries to store data. The processor also includes an L2 cache comprising a first L2 cache data entry of a plurality of L2 cache data entries. The first L2 cache data entry corresponds to the first L1 cache data entry, and each of the plurality of L2 cache data entries is associated with a corresponding presence bit (pbit) of a plurality of pbits. Each of the plurality of pbits indicates a status of a corresponding one of the plurality of L2 cache data entries. The processor also includes a cache controller which, in response to a first request among a plurality of requests to access the data at the first L1 cache data entry, determines that a copy of the data is stored in the first L2 cache data entry and retrieves the copy of the data from the L2 cache data entry in view of the status of the pbit.
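A sketch of how presence bits can suppress snoop traffic, under my reading of the abstract: each L2 entry's pbit records whether some L1 still holds the line, so a cross-core read snoops an L1 only when the pbit is set and otherwise is served directly from L2. The dictionary shape and return values are illustrative, not from the patent:

```python
def read_line(l2_entry):
    """Decide how to satisfy a read that hits an L2 entry.

    l2_entry: {"pbit": bool, "data": ...}; pbit set means a private L1
    may hold a (possibly newer) copy of this line.
    """
    if l2_entry["pbit"]:
        # An L1 copy may exist: a snoop to that core is unavoidable.
        return ("snoop_l1", l2_entry["data"])
    # No L1 holds the line: serve from L2 and skip the snoop entirely.
    return ("serve_from_l2", l2_entry["data"])
```

The benefit is that the common case (no live L1 copy) generates no snoop messages on the multi-core fabric at all.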

    MINIMIZING SNOOP TRAFFIC LOCALLY AND ACROSS CORES ON A CHIP MULTI-CORE FABRIC

    Publication number: US20170177483A1

    Publication date: 2017-06-22

    Application number: US14976678

    Application date: 2015-12-21

    CPC classification number: G06F12/0815 G06F12/0811 G06F2212/621

    Abstract: A processor includes a first processing core and a first L1 cache comprising a first L1 cache data entry of a plurality of L1 cache data entries to store data. The processor also includes an L2 cache comprising a first L2 cache data entry of a plurality of L2 cache data entries. The first L2 cache data entry corresponds to the first L1 cache data entry, and each of the plurality of L2 cache data entries is associated with a corresponding presence bit (pbit) of a plurality of pbits. Each of the plurality of pbits indicates a status of a corresponding one of the plurality of L2 cache data entries. The processor also includes a cache controller which, in response to a first request among a plurality of requests to access the data at the first L1 cache data entry, determines that a copy of the data is stored in the first L2 cache data entry and retrieves the copy of the data from the L2 cache data entry in view of the status of the pbit.
